join_asof out-of-order error for big sorted tables #41706
Comments
I suspect that pre-joining on the by-key may be reordering the on-key, as discussed here: https://lists.apache.org/list?user@arrow.apache.org:2024-4:join
I have a similar issue with a smaller table. It only happens if I have a lot of small chunks in the table. Here's an example:
It took a while to make a reproducible example. I can't pin down exactly what is causing the issue.
Describe the usage question you have. Please include as many useful details as possible.
With pyarrow 16.0.0, join_asof fails with an out-of-order error even though both input tables are sorted by the "on" key.
I noticed this when trying to merge larger sorted tables. For example, it fails for tables with 1061753 and 994046 rows, but succeeds if I reduce the row counts to 1048178 and 975257.
I think this behavior can be reproduced with the example below:
So I suspect the issue has nothing to do with the order of the on-key values, but rather with the input size?
Is this a bug that can be fixed, or a fundamental limitation?
Is there any workaround other than limiting input size?
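One way to rule out an ordering problem independently of pyarrow is to check the on-key directly and compare against a slow reference join on a sample of the data. The following is a minimal pure-Python sketch, not pyarrow's implementation: the helper names `first_unsorted_index` and `asof_join_backward` are hypothetical, rows are assumed to be plain dicts, and only a backward (past) as-of match with a non-negative `tolerance` is modeled.

```python
from bisect import bisect_right
from collections import defaultdict


def first_unsorted_index(values):
    """Return the index of the first element smaller than its predecessor, or None."""
    for i in range(1, len(values)):
        if values[i] < values[i - 1]:
            return i
    return None


def asof_join_backward(left, right, on, by, tolerance):
    """Reference backward as-of join over lists of dicts (hypothetical helper).

    For each left row, pick the right row with the same `by` value and the
    largest `on` value in [left[on] - tolerance, left[on]]; None if no match.
    """
    if first_unsorted_index([r[on] for r in right]) is not None:
        raise ValueError("right input is not sorted by the on-key")

    # Group right rows by the by-key; each group keeps the global on-key order.
    groups = defaultdict(list)
    for row in right:
        groups[row[by]].append(row)
    keys = {k: [r[on] for r in rows] for k, rows in groups.items()}

    matches = []
    for row in left:
        candidates = keys.get(row[by], [])
        # Position of the last right row whose on-key is <= the left on-key.
        pos = bisect_right(candidates, row[on]) - 1
        if pos >= 0 and row[on] - candidates[pos] <= tolerance:
            matches.append(groups[row[by]][pos])
        else:
            matches.append(None)
    return matches


left = [{"id": 1, "t": 2020}, {"id": 3, "t": 2021}, {"id": 3, "t": 2023}]
right = [{"id": 3, "t": 2020, "n": 5}, {"id": 4, "t": 2021, "n": 100}]
result = asof_join_backward(left, right, on="t", by="id", tolerance=1)
print([m["n"] if m else None for m in result])
```

If `first_unsorted_index` returns None for the on-key column of both inputs (for example after converting the column with `to_pylist()`), the inputs really are sorted, and the error is more likely related to how the data is batched or sized than to the values themselves.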
Component(s)
Python