ENH: sparse: Speed up `_add_sparse` for DIA format #20722

perimosocordiae · 2024-05-16T06:11:42Z

This PR rewrites the _add_sparse logic for DIA-format sparse arrays/matrices to avoid expensive allocations and Python looping.

My local benchmarks show that this is faster, in the easy case where the sparsity pattern doesn't change, but also in the more complex case where the sets of diagonal offsets are non-overlapping.

Generated via:
bench.py

perimosocordiae · 2024-05-16T16:25:31Z

I ran the asv benchmarks, though notably I had to remove the --quick flag from our dev.py runner first. If running in --quick mode, the results are too noisy to be useful.

Change	Before [`8ce393f`]	After [`8eb2614`]	Ratio	Benchmark (Parameter)
-	811±50μs	265±40μs	0.33	sparse.Arithmetic.time_arithmetic('dia', 'AA', 'add')

In these benchmarks the A matrix is in DIA format, and B is in CSR format. So the AB, BA, and BB tests aren't exercising the updated code path, which is why they don't show up in the timing diffs.

dschult

The code changes here look good!
[I especially like _invert_index which I've done for lists and dicts, but not seen before for numpy.]

The bench.py file checks all-overlap and no-overlap, so I tried some-overlap. Most of these cases were also faster than the main branch. But in one case , the new code is slightly slower than the main branch. I don't believe it is significant enough to keep the old methods. But it might be worth trying the suggestion to check both self and other for whether they include all new offsets.

The above makes B a 3-diagonal poisson1d_in2d constructor with offsets [-1, 0, 1]. Thus they are a subset of the offsets for the poisson2d A.

Most of the variants I tried end up looking like the following, with the new code faster than the main branch. These have B being a poisson2d with offsets [-N, -2, 0, 2, N]. And the same A and B as the some-overlap case above, only B+A instead of A+B. That is what made me think of symmetrizing the new code -- but the main branch code is not symmetric anyway -- B+A is more than 50% different from A+B sometimes.

scipy/sparse/_dia.py

perimosocordiae · 2024-05-18T02:40:35Z

Thanks for pointing out the opportunity for the other direction's overlapping-offsets fast path, @dschult. I ran a before and after using a similar setup to what you described, where c's offsets are a subset of a's offsets:

Before 1eba3cb:

After 1eba3cb:

dschult

This looks good to me!

rgommers

Looks like a nice improvement! Thanks @perimosocordiae, and thanks for the review @dschult

tylerjereddy · 2024-05-26T22:27:46Z

drafted the release note

ENH: sparse: Speed up __add__ for DIA format

8eb2614

github-actions bot added scipy.sparse enhancement A new feature or improvement labels May 16, 2024

dschult reviewed May 16, 2024

View reviewed changes

scipy/sparse/_dia.py Show resolved Hide resolved

scipy/sparse/_dia.py Show resolved Hide resolved

Add suggestion from @dschult

1eba3cb

dschult approved these changes May 18, 2024

View reviewed changes

lucascolley changed the title ~~ENH: sparse: Speed up _add_sparse for DIA format~~ ENH: sparse: Speed up _add_sparse for DIA format May 18, 2024

lucascolley added this to the 1.14.0 milestone May 18, 2024

rgommers approved these changes May 25, 2024

View reviewed changes

rgommers merged commit a4bbcb5 into scipy:main May 25, 2024
31 checks passed

lucascolley mentioned this pull request May 30, 2024

DOC: sparse: 1.14.0 release notes additions #20838

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: sparse: Speed up `_add_sparse` for DIA format #20722

ENH: sparse: Speed up `_add_sparse` for DIA format #20722

perimosocordiae commented May 16, 2024

perimosocordiae commented May 16, 2024

dschult left a comment

perimosocordiae commented May 18, 2024

dschult left a comment

rgommers left a comment

tylerjereddy commented May 26, 2024

ENH: sparse: Speed up _add_sparse for DIA format #20722

ENH: sparse: Speed up _add_sparse for DIA format #20722

Conversation

perimosocordiae commented May 16, 2024

perimosocordiae commented May 16, 2024

dschult left a comment

Choose a reason for hiding this comment

perimosocordiae commented May 18, 2024

dschult left a comment

Choose a reason for hiding this comment

rgommers left a comment

Choose a reason for hiding this comment

tylerjereddy commented May 26, 2024

ENH: sparse: Speed up `_add_sparse` for DIA format #20722

ENH: sparse: Speed up `_add_sparse` for DIA format #20722