Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Test on Cython 3.0 on numpydev #46029

Merged
merged 35 commits into from
Mar 28, 2022

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented Feb 17, 2022

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This is what we used to do before the libreduction saga. Given that, we finally killed libreduction, I'm bring this back so we can gauge our Cython 3 readiness.

@lithomas1 lithomas1 added CI Continuous Integration Dependencies Required and optional dependencies labels Feb 17, 2022
@lithomas1
Copy link
Member Author

Hmmm. Looks like hitting cython/cython#2056. Will investigate later.

@lithomas1 lithomas1 marked this pull request as draft February 17, 2022 15:08
@lithomas1
Copy link
Member Author

lithomas1 commented Feb 24, 2022

Hm. I got further this time but the compiled cython module is segfaulting. This is the output from lldb.
Running
lldb -- python -m pytest pandas/tests/groupby/test_quantile.py -v -k "test_quantile_out_of_bounds_q_raises" (and run in the lldb shell)
I see

`` frame #5: 0x0000000119ceba48 groupby.cpython-38-darwin.so`__pyx_fatalerror(fmt=) at groupby.c:110053:5 [opt] frame #6: 0x0000000119d6b977 groupby.cpython-38-darwin.so`__pyx_fuse_3__pyx_pw_6pandas_5_libs_7groupby_107group_quantile [inlined] __Pyx_XCLEAR_MEMVIEW(memslice=, have_gil=1, lineno=40963) at groupby.c:110120:9 [opt] frame #7: 0x0000000119d6b965 groupby.cpython-38-darwin.so`__pyx_fuse_3__pyx_pw_6pandas_5_libs_7groupby_107group_quantile [inlined] __pyx_pf_6pandas_5_libs_7groupby_106group_quantile(__pyx_self=, __pyx_v_out=, __pyx_v_values=0x0000000000000000, __pyx_v_labels=0x0000000000000000, __pyx_v_mask=0x0000000000000000, __pyx_v_sort_indexer=, __pyx_v_qs=__Pyx_memviewslice @ 0x0000600002f57560, __pyx_v_interpolation=0x0000000000000000) at groupby.c:40963 [opt] frame #8: 0x0000000119d6b657 groupby.cpython-38-darwin.so`__pyx_fuse_3__pyx_pw_6pandas_5_libs_7groupby_107group_quantile(__pyx_self=, __pyx_args=, __pyx_kwds=) at groupby.c:39838 [opt] frame #9: 0x00000001088c80dd interval.cpython-38-darwin.so`__pyx_FusedFunction_call [inlined] __Pyx_CyFunction_Call(func=, arg=, kw=) at interval.c:112155:12 [opt] frame #10: 0x00000001088c80ca interval.cpython-38-darwin.so`__pyx_FusedFunction_call [inlined] __pyx_FusedFunction_callfunction(func=0x0000000119b119e0, args=, kw=) at interval.c:112651 [opt] frame #11: 0x00000001088c80b0 interval.cpython-38-darwin.so`__pyx_FusedFunction_call(func=0x0000000119b119e0, args=0x0000000136b331c0, kw=0x0000000136b59800) at interval.c:112706 [opt] ``
in the backtrace. typing ``f 7``(for frame 7), i see

groupby.cpython-38-darwin.so was compiled with optimization - stepping may behave oddly; variables may not be available.
frame #7: 0x0000000119d6b965 groupby.cpython-38-darwin.so`__pyx_fuse_3__pyx_pw_6pandas_5_libs_7groupby_107group_quantile [inlined] __pyx_pf_6pandas_5_libs_7groupby_106group_quantile(__pyx_self=, __pyx_v_out=, __pyx_v_values=0x0000000000000000, __pyx_v_labels=0x0000000000000000, __pyx_v_mask=0x0000000000000000, __pyx_v_sort_indexer=, __pyx_v_qs=__Pyx_memviewslice @ 0x0000600002f74490, __pyx_v_interpolation=0x0000000000000000) at groupby.c:40963 [opt]
40960 __Pyx_XDECREF(__pyx_v_inter_methods);
40961 __Pyx_XDECREF(__pyx_gb_6pandas_5_libs_7groupby_14group_quantile_11generator3);
40962 __PYX_XCLEAR_MEMVIEW(&__pyx_v_sort_indexer, 1);
-> 40963 __PYX_XCLEAR_MEMVIEW(&__pyx_cur_scope->__pyx_v_qs, 1);
40964 __Pyx_DECREF((PyObject *)__pyx_cur_scope);
40965 __Pyx_XGIVEREF(__pyx_r);
40966 __Pyx_RefNannyFinishContext();

It looks like something is going wrong in Cython. For reference, this is the Cython code that seems to be triggering the segfault.

    if any(not (0 <= q <= 1) for q in qs):
        wrong = [x for x in qs if not (0 <= x <= 1)][0]
        raise ValueError(
            f"Each 'q' must be between 0 and 1. Got '{wrong}' instead"
        )

where qs is defined as const float64_t[:] qs.

cc @jbrockmendel for help on this one.

@jbrockmendel
Copy link
Member

no bright ideas here, maybe @da-woods can make something of this

@da-woods
Copy link

da-woods commented Feb 25, 2022

I'll have a closer look later.

Do you activate refnanny yourself? It should be turned off on most user code (it's really only an internal testing mechanism), but it's possible that Pandas enables it for their own test suit?

The reason I ask is that __Pyx_RefNannyFinishContext should be a no-op without refnanny so hopefully shouldn't be able to crash

Ignore this... It's not the refnanny line that's producing the error

@da-woods
Copy link

I can reproduce it with

# testpandasbug.pyx
def f(const double[:] qs):
    if any(not (0 <= q <= 1) for q in qs):
        wrong = [x for x in qs if not (0 <= x <= 1)][0]
        raise ValueError(
            f"Each 'q' must be between 0 and 1. Got '{wrong}' instead"
        )

build it with cythonize -if testpandasbug.pyx

and run with:

>>> import testpandasbug
>>> import numpy as np
>>> testpandasbug.f(np.array([1.,2.,3.0]))
Fatal Python error: Acquisition count is -1 (line 3824)
Python runtime state: initialized
Traceback (most recent call last):
  File "testpandasbug.pyx", line 4, in testpandasbug.f
    raise ValueError(
ValueError: Each 'q' must be between 0 and 1. Got '2.0' instead
Aborted (core dumped)

So it definitely looks like a Cython bug and one that's fairly easy to reproduce

@da-woods
Copy link

The next lot of failures also looks to be a Cython bug I think cython/cython#4668

Copy link
Member Author

@lithomas1 lithomas1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little closer to green now. Last failing test is a pickle test complaining about mismatched checksums. I'm not sure how to fix that one.

@@ -414,6 +414,15 @@ cdef class Interval(IntervalMixin):
return Interval(y.left + self, y.right + self, closed=y.closed)
return NotImplemented

def __radd__(self, y):
if (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of doing these checks(probably has an adverse effect on performance). Unfortunately, Cython gets stuck in a infinite recursion look from one binop to the reverse binop, when the reverse binop calls the regular binop and the regular binop raises NotImplemented.

Not sure if this is intended by Cython or a bug, but this works around it.
(The code path is not hit for Cython's < 3.0, so I think its fine to land as is and open an issue as a followup).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if you implement infinite recursion, then you get infinite recursion. No surprise here.

Consider extracting a separate function for the actual implementation instead, and make both special methods call that, in their own way. Then you can catch the "can't handle that" cases in one place, and otherwise run through the addition code and be done, rather than risking infinite recursion in the first place.

pandas/_libs/tslibs/timestamps.pyx Outdated Show resolved Hide resolved
@@ -1718,6 +1719,11 @@ cdef class _Period(PeriodMixin):

return NotImplemented

def __radd__(self, other):
if other is NaT:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks harmless, but is it necessary? would falling through to __add__ work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think falling through works. (I vaguely remember some tests failing because of this)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you double-check and add a comment for when future-me thinks this looks weird and doesn't remember this thread

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Looks like this might work? Checking with CI.

return type(other)(self) - other

return NotImplemented

def __rsub__(self, other):
if PyDateTime_Check(other):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any problem with return -(self - other)? i guess thats the overflow in the stata tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually copy-pasted from the sub code block above(including the contents). This actually shows up in other tests in addition to the stata ones(which is why I needed to copy-paste it).

return -(self - other) might work, but I think the tests would still fail, as the expected behavior there is to return a datetime.timedelta.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Took me a few tries to find a relevant example:

pd.Timestamp(2261, 1, 1).to_pydatetime() - pd.Timestamp(1677, 9, 30)

returns a pytimedelta. There's a small hiccup (doesn't matter for this PR, but i want to write it down before I forget it) if the Timestamp has nanos:

pd.Timestamp(2261, 1, 1).to_pydatetime() - pd.Timestamp(1677, 9, 30, nanosecond=4)

We lose the nanoseconds. In principle we could call to_pydatetime() and get a warning if appropropriate.

@jbrockmendel
Copy link
Member

Couple of nitpicks, no serious objections. I share everyone else's discomfort with the sub/rsub duplication pattern.

Will adding this to the CI make things easier for the cython folks?

@lithomas1
Copy link
Member Author

Couple of nitpicks, no serious objections. I share everyone else's discomfort with the sub/rsub duplication pattern.

Will adding this to the CI make things easier for the cython folks?

I would prefer if this is merged soon, and remaining comments are taken care of in followups. I'm finding it difficult to keep this up to date with main.

@da-woods
Copy link

da-woods commented Mar 28, 2022

I suspect it'd be useful for Cython if this was merged (once everyone's happy with it, of course). It did catch a number of legitimate bugs (so presumably might catch more in future...)

@jbrockmendel
Copy link
Member

OK. I think the only other people/person regularly touching this code is me, so the added maintenance burden pre-cy3 won't be that big a deal. I'm conceptually on board.

@@ -16,7 +16,8 @@ dependencies:
- pytz
- pip
- pip:
- cython==0.29.24 # GH#34014
#- cython # TODO: don't install from master after Cython 3.0.0a11 is released
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob should make an issue for this

@jreback jreback merged commit e882b72 into pandas-dev:main Mar 28, 2022
@jreback
Copy link
Contributor

jreback commented Mar 28, 2022

thanks @lithomas1 very nice. if you can make an issue for the noted comment.

@lithomas1 lithomas1 deleted the cython-alpha-testing branch March 29, 2022 00:19
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Dependencies Required and optional dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants