-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPS: Add warning if pyarrow is not installed #56896
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, added some minor comments/suggestions.
pandas/__init__.py
Outdated
# Don't allow users to use pandas.os | ||
del os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to not run this this at the end with the other modules? Seems like it's easier to maintain if there is all the clean up at the end or next to the import, but no big deal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not really.
Personally I like to keep the del closed to where it's being used, but no strong preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think os
is not being used in the code before beibg closed. As said, not important, but to me personally it makes sense to open and close around the block where it is being used, or open and close at the beginning and the end if it's an import that is used in several places. Now os
is being opened at the beginning, and closed in the middle, in a kind of random place if I'm not wrong. No huge difference, but I think it'll help maintain the code if closed at the end (or we move the import and del around where it is being used).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put the os delete with the deletion of warnings/pa_version_under10p1.
pandas/__init__.py
Outdated
from pandas.compat._optional import VERSIONS | ||
|
||
try: | ||
import pyarrow # noqa: F401 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add an else block to del pyarrow
so pandas.pyarrow
is not created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx
also cc @pandas-dev/pandas-core for feedback on the warning message itself. |
pandas/__init__.py
Outdated
|
||
warnings.warn( | ||
f""" | ||
Pyarrow will become a required dependency of pandas in the next major release of pandas, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pyarrow will become a required dependency of pandas in the next major release of pandas, | |
Pyarrow will become a required dependency of pandas in the next major release (pandas 3.0), |
I think we can be explicit about what this next major release will be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the next major release also the next minor release, i.e., 2.2 then 3.0? I noticed that the dev wheels recently switched to reporting 3.0.0.dev0
recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the next major release also the next minor release, i.e., 2.2 then 3.0? I noticed that the dev wheels recently switched to reporting
3.0.0.dev0
recently.
No, 2.2 is coming out sometime this week.
But yes, 3.0.0 will be the release after that.
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466 | ||
""", | ||
DeprecationWarning, | ||
stacklevel=2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to make this stacklevel=1?
Because this code is top-level and not inside a function or method, it one less than the typical stacklevel of 2 we use in a direct function/method.
Setting it as 1 would avoid that it triggers a warning when you call a function that has an inline import pandas
(not sure how important this is, though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The warning doesn't pop up in the REPL when I import pandas if I do stacklevel=1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, OK forget this. I did test it, but should have done something wrong because now I can't reproduce getting the warning with plain import and stacklevel 1
pandas/__init__.py
Outdated
warnings.warn( | ||
f""" | ||
Pyarrow will become a required dependency of pandas in the next major release of pandas, | ||
(to allow more performant data types and better interoperability with other libraries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would maybe explicitly mention the new default string dtype, since that's still the only one that is used by default that requires pyarrow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with it. Will let others make the final decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice @lithomas1, looks perfect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like ci/deps/actions-311-numpydev.yaml
looks like it needs pyarrow otherwise LGTM
…o add-pyarrow-warning
Merging for now, happy to address any followups in another PR. |
Sorry to comment here rather than an issue but this feels like quite a niche gotcha -- just a heads up that the warning added here starts with a newline which makes it annoying to filter out (as e.g., neither
nor
nor
will match the message, and what is required is
I guess changing it now might cause more pain, but I suspect you may get a few more questions on this before v3 (unless I am singularly confused by this -- a quick GH search shows a few PRs that also took a while to get this correct in other packages)! |
…)" This reverts commit 5c2a407
* Revert "DEPS: Add warning if pyarrow is not installed (#56896)" This reverts commit 5c2a407 * Add whatsnew * Update * Update doc/source/whatsnew/v2.2.1.rst Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> --------- Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
* Revert "DEPS: Add warning if pyarrow is not installed (pandas-dev#56896)" This reverts commit 5c2a407 * Add whatsnew * Update * Update doc/source/whatsnew/v2.2.1.rst Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> --------- Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> (cherry picked from commit 4ed67ac)
* add warning * DEPS: Add warning if pyarrow is not installed * formatting * Update __init__.py * adjustments * adjustments * updates * address code review * Update __init__.py * add pyarrow to npdev build * ignore non numpy related deprecationwarnings/futurewarning * ignore non numpy related deprecationwarnings/futurewarning * Update actions-311-numpydev.yaml
* Revert "DEPS: Add warning if pyarrow is not installed (pandas-dev#56896)" This reverts commit 5c2a407 * Add whatsnew * Update * Update doc/source/whatsnew/v2.2.1.rst Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> --------- Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.~~Not sure how to add a test for this - importing our warning asserters would cause the warning the be raised! 🤣 ~~
EDIT: Nvm, figured it out.
(I've checked by hand that the warning is both raised when you import pandas directly and import a submodule of pandas e.g.
pandas.arrays
.I also double checked that the warning only occurs once and not on every import.)