chore: datafusion 34, arrow & parquet 49 #1983

Closed
wants to merge 10 commits into from

emcake
Contributor

@emcake emcake commented Dec 19, 2023

Description

Upgrades DataFusion to v34 and Arrow and Parquet to v49. As a consequence, we also need to upgrade object_store to v0.8.

Related Issue(s)

Arrow 49 contains a fix that allows truncated statistics on binary columns (apache/arrow-rs#5037), which I'd like to use to fix #1805.

@github-actions github-actions bot added the binding/python, binding/rust, and crate/core labels Dec 19, 2023
arrow-select = { version = "49.0.0" }
parquet = { version = "49.0.0" }

object_store = "0.8"
Contributor Author

I moved object_store to a workspace dependency as it's referenced in two separate crates - easier to keep the dependency in sync this way.
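
As a rough sketch of what such a workspace dependency can look like in Cargo: the member crate path below is a hypothetical placeholder, and only the `object_store = "0.8"` requirement is taken from the diff above.

```toml
# Root Cargo.toml: declare the version once for the whole workspace.
[workspace.dependencies]
object_store = "0.8"

# crates/<member>/Cargo.toml (hypothetical path): inherit the workspace version
# instead of repeating it in each crate.
[dependencies]
object_store = { workspace = true }
```

With this layout, a later version bump only touches the root manifest, and every member crate that opts in with `workspace = true` picks it up.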

@roeap
Collaborator

roeap commented Dec 19, 2023

@emcake - thanks for taking care of this!

Seems there is one more clippy error to address :).

@emcake
Contributor Author

emcake commented Dec 20, 2023

I think this is now complete and up to date. Some tests still appear to be failing, but I'm struggling to work out whether these are flaky or real.

@ion-elgreco
Collaborator

@emcake the restore-by-datetime test and a read test are very flaky.

@ion-elgreco
Collaborator

@emcake maybe object_store 0.8 is causing the read failures. Can you revert to object_store 0.7?

@roeap
Collaborator

roeap commented Dec 23, 2023

The object_store version is coupled to DataFusion. I had a first glance, and it seems like a fixable problem, but it may require some work.

@roeap roeap mentioned this pull request Jan 4, 2024
@emcake
Contributor Author

emcake commented Jan 4, 2024

Looks like @roeap beat me to it!

@emcake emcake closed this Jan 4, 2024
@emcake emcake deleted the df-34-arrow-49 branch January 4, 2024 11:38
Development

Successfully merging this pull request may close these issues.

Delta Stats for binary columns are not truncated