Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: sparse: Speed up _add_sparse for DIA format #20722

Merged
merged 2 commits into from
May 25, 2024

Conversation

perimosocordiae
Copy link
Member

This PR rewrites the _add_sparse logic for DIA-format sparse arrays/matrices to avoid expensive allocations and Python looping.

My local benchmarks show that this is faster, in the easy case where the sparsity pattern doesn't change, but also in the more complex case where the sets of diagonal offsets are non-overlapping.

no_overlap
trivial

Generated via:
bench.py

@github-actions github-actions bot added scipy.sparse enhancement A new feature or improvement labels May 16, 2024
@perimosocordiae
Copy link
Member Author

I ran the asv benchmarks, though notably I had to remove the --quick flag from our dev.py runner first. If running in --quick mode, the results are too noisy to be useful.

Change Before [8ce393f] After [8eb2614] Ratio Benchmark (Parameter)
- 811±50μs 265±40μs 0.33 sparse.Arithmetic.time_arithmetic('dia', 'AA', 'add')

In these benchmarks the A matrix is in DIA format, and B is in CSR format. So the AB, BA, and BB tests aren't exercising the updated code path, which is why they don't show up in the timing diffs.

Copy link
Contributor

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes here look good!
[I especially like _invert_index which I've done for lists and dicts, but not seen before for numpy.]

The bench.py file checks all-overlap and no-overlap, so I tried some-overlap. Most of these cases were also faster than the main branch. But in one case , the new code is slightly slower than the main branch. I don't believe it is significant enough to keep the old methods. But it might be worth trying the suggestion to check both self and other for whether they include all new offsets.

The above makes B a 3-diagonal poisson1d_in2d constructor with offsets [-1, 0, 1]. Thus they are a subset of the offsets for the poisson2d A.

Most of the variants I tried end up looking like the following, with the new code faster than the main branch. These have B being a poisson2d with offsets [-N, -2, 0, 2, N]. And the same A and B as the some-overlap case above, only B+A instead of A+B. That is what made me think of symmetrizing the new code -- but the main branch code is not symmetric anyway -- B+A is more than 50% different from A+B sometimes.

scipy/sparse/_dia.py Show resolved Hide resolved
scipy/sparse/_dia.py Show resolved Hide resolved
@perimosocordiae
Copy link
Member Author

Thanks for pointing out the opportunity for the other direction's overlapping-offsets fast path, @dschult. I ran a before and after using a similar setup to what you described, where c's offsets are a subset of a's offsets:

Before 1eba3cb:
before

After 1eba3cb:
after

Copy link
Contributor

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

@lucascolley lucascolley changed the title ENH: sparse: Speed up _add_sparse for DIA format ENH: sparse: Speed up _add_sparse for DIA format May 18, 2024
@lucascolley lucascolley added this to the 1.14.0 milestone May 18, 2024
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice improvement! Thanks @perimosocordiae, and thanks for the review @dschult

@rgommers rgommers merged commit a4bbcb5 into scipy:main May 25, 2024
31 checks passed
@tylerjereddy
Copy link
Contributor

drafted the release note

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants