Add .cumulative to cumsum & cumprod docstrings #9533
I'm surprised they're more performant than using numpy's cumprod, cumsum directly. Is this because
Yes! :) I don't have benchmarks at https://github.com/numbagg/numbagg, since But to confirm, 10x faster over 10 columns and 2x faster over 1 column:

[nav] In [36]: A = np.random.rand(60000, 10)
[ins] In [40]: %timeit np.cumsum(A, axis=0)
2.44 ms ± 82.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
[ins] In [47]: %timeit numbagg.move_sum(A, window=60000, min_count=0, axis=0)
371 µs ± 16.5 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

A = np.random.rand(60000, 1)
[nav] In [50]: %timeit np.cumsum(A, axis=0)
211 µs ± 5.34 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[ins] In [49]: %timeit numbagg.move_sum(A, window=60000, min_count=0, axis=0)
106 µs ± 1.62 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
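The benchmark relies on a moving sum whose window spans the whole axis being equivalent to a cumulative sum. A minimal pure-NumPy sketch of that equivalence follows. The helper `trailing_window_sum` is hypothetical and written only for illustration; it is not numbagg's implementation, and it assumes partial windows at the start of the axis are allowed, which is what `min_count=0` appears to request in the calls above.

```python
import numpy as np


def trailing_window_sum(a: np.ndarray, window: int) -> np.ndarray:
    """Slow reference trailing-window sum along axis 0 (illustrative only).

    Row i of the result sums rows max(0, i - window + 1) .. i of `a`,
    so partial windows at the start of the axis are allowed.
    """
    out = np.empty_like(a, dtype=float)
    for i in range(a.shape[0]):
        start = max(0, i - window + 1)
        out[i] = a[start : i + 1].sum(axis=0)
    return out


rng = np.random.default_rng(0)
A = rng.random((200, 10))

# Window as long as the axis: no row ever leaves the window,
# so every output row is the sum of all rows seen so far.
full_window = trailing_window_sum(A, window=A.shape[0])

# A short window, for contrast: only the last 3 rows contribute.
short_window = trailing_window_sum(A, window=3)

assert np.allclose(full_window, np.cumsum(A, axis=0))
```

This only checks that the two operations agree numerically; it says nothing about speed, which is what the timings above measure.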
We should clarify that then! For the equivalent non-numbagg benchmark you could use xarray with
In the
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
Merged, also improved the docs & suggested commands in
As discussed in a couple of issues, we should be directing folks towards .cumulative. (The only missing piece is skip_na...)
I thought this was a reasonable way to have the generation script work for these; ofc open to feedback.
I also added the namedarray file to the instructions for generating
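For readers unfamiliar with the API the description points to, here is a small sketch comparing `DataArray.cumsum` with `DataArray.cumulative(...).sum()`. It assumes an xarray version recent enough to provide `.cumulative`; the array values and the dimension name `x` are made up for illustration, and the data contains no NaNs, so it does not exercise the missing-value handling mentioned above.

```python
import numpy as np
import xarray as xr

# Illustrative data: 1.0 .. 6.0 along a made-up dimension "x".
da = xr.DataArray(np.arange(1.0, 7.0), dims=["x"])

# Older spelling: cumulative sum along "x".
via_cumsum = da.cumsum("x")

# Spelling the PR directs people towards: build a cumulative
# (expanding-window) object, then reduce it.
via_cumulative = da.cumulative("x").sum()

# On NaN-free data the two should agree.
np.testing.assert_allclose(via_cumulative.values, via_cumsum.values)
```

Other reductions, such as `.mean()` or `.max()`, can be called on the same cumulative object.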