Substrait ExtendedExpressions can be used with PyArrow by converting them to compute.Expression.
Once converted, they can be used as projections and filters when working with a Dataset or Table.
Some pieces are still missing for this to be usable in practice, and the user experience is generally too complex to be convenient. This issue tracks work that can be done to improve the Substrait experience in PyArrow:
Accept Substrait message objects directly instead of bytes (currently the pc.Expression.from_substrait(projection.SerializeToString()) dance is required, which is not very convenient).
Accept Substrait messages directly wherever a pc.Expression is accepted, instead of having to build the expression from the message first.
Provide a way to encode PyArrow schemas as Substrait NamedStruct.
Accept projections as a single Substrait ExtendedExpression instead of having to build a separate expression for each projected column.
Component(s)
Python
amol- changed the title from "Improve PyArrow support for Substrait ExtendedExpressions" to "[Python] Improve PyArrow support for Substrait ExtendedExpressions" on May 16, 2024