Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use date32 for pyarrow date #52332

Merged
merged 2 commits into from
Apr 1, 2023
Merged

Use date32 for pyarrow date #52332

merged 2 commits into from
Apr 1, 2023

Conversation

alippai
Copy link
Contributor

@alippai alippai commented Mar 31, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this necessarily should unequivocally be date32. There may be a use case for storing with increased precision.

@alippai
Copy link
Contributor Author

alippai commented Mar 31, 2023

@mroeschke the precision is always the same, it's unexpected to store non-fullday value in pa.date64 and date64 is always milliseconds: https://lists.apache.org/thread/q036r1q3cw5ysn3zkpvljx3s9ho18419

The only gain of using that type is the cheaper conversion for frameworks / languages using millisecond based timestamps, but it's slower and uses more memory.

@alippai alippai requested a review from mroeschke April 1, 2023 16:41
@mroeschke mroeschke added datetime.date stdlib datetime.date support Arrow pyarrow functionality labels Apr 1, 2023
@mroeschke mroeschke added this to the 2.0 milestone Apr 1, 2023
@mroeschke mroeschke merged commit 4d1c994 into pandas-dev:main Apr 1, 2023
39 checks passed
@lumberbot-app
Copy link

lumberbot-app bot commented Apr 1, 2023

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
git checkout 2.0.x
git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
git cherry-pick -x -m1 4d1c994da1e9815c87db2cd48a3286b077f2f7fe
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
git commit -am 'Backport PR #52332: Use date32 for pyarrow date'
  1. Push to a named branch:
git push YOURFORK 2.0.x:auto-backport-of-pr-52332-on-2.0.x
  1. Create a PR against branch 2.0.x, I would have named this PR:

"Backport PR #52332 on branch 2.0.x (Use date32 for pyarrow date)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

@mroeschke
Copy link
Member

Thanks for the explanation @alippai

mroeschke pushed a commit to mroeschke/pandas that referenced this pull request Apr 1, 2023
@alippai
Copy link
Contributor Author

alippai commented Apr 1, 2023

Thanks for the swift merge and backport @mroeschke

mroeschke added a commit that referenced this pull request Apr 1, 2023
)

Backport PR #52332: Use date32 for pyarrow date

Co-authored-by: Ádám Lippai <adam@rigo.sk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arrow pyarrow functionality datetime.date stdlib datetime.date support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants