
Lazy reductions should be data-parallel #158

Open

inducer opened this issue Aug 27, 2021 · 5 comments

inducer (Owner) commented Aug 27, 2021

Not currently a showstopper, but it comes up in mirgecom when logging stats about state and dependent variables.

cc @kaushikcfd @matthiasdiener @MTCam

kaushikcfd (Collaborator) commented:
Shouldn't this be the responsibility of the downstream array contexts? An example of this is what we do in meshmode:
https://github.com/inducer/meshmode/pull/248/files#diff-d5e55ef91478d86ac35923519f1d4556fa8502c33265a63f9cbf01579c26d461R528-R541

That is, compute the reductions eagerly via PyOpenCL. (Not ideal, but it could be one way for a downstream array context to handle it.)
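
A minimal sketch of that approach (not the linked meshmode code itself): freeze the lazy array and hand it to PyOpenCL's built-in reduction routines. The helper name `eager_nodal_max`, and the assumption that the array context exposes `freeze` and a `queue`, are illustrative only.

```python
import pyopencl.array as cl_array


def eager_nodal_max(actx, ary):
    # freeze() evaluates the lazy expression into a pyopencl array on the
    # device; cl_array.max then launches PyOpenCL's data-parallel reduction
    # kernel, and .get() copies the resulting scalar back to the host.
    frozen = actx.freeze(ary)
    return cl_array.max(frozen, queue=actx.queue).get()
```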

inducer (Owner, Author) commented Aug 27, 2021

Yeah, you're right. We'll need to know something about array axes in order to effectively do the transformation. Moving to grudge (which is where the reductions are being introduced).

inducer transferred this issue from inducer/arraycontext on Aug 27, 2021
inducer (Owner, Author) commented Aug 27, 2021

Oh, TIL about the eager reductions. So that means we can force our reductions to be parallel (via eager) if we pre-freeze/thaw all inputs?

@kaushikcfd Do you have a sense how often _can_be_eagerly_computed is actually true in practice?

This might do in the short term. In the long term, however, I'm fairly sure we'll want to be able to fuse all those reductions so that we only need to load all that vector data once, and at that point, properly transforming the reductions becomes unavoidable.
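
A rough illustration of the pre-freeze/thaw idea, under the assumption that the array context's `actx.np` namespace provides a `max` reduction; the array here is just a stand-in for lazily computed state data.

```python
import numpy as np
import pyopencl as cl
from arraycontext import PyOpenCLArrayContext

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
actx = PyOpenCLArrayContext(queue)

# Stand-in for a quantity that would normally come out of a lazy evaluation.
pressure = actx.from_numpy(np.random.rand(100_000))

# The freeze/thaw round trip forces evaluation of whatever (lazy) expression
# produced the array, so the reduction below runs as a single eager,
# data-parallel kernel instead of being folded into a larger lazy program.
pressure = actx.thaw(actx.freeze(pressure))
p_max = actx.to_numpy(actx.np.max(pressure))
print(p_max)
```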

kaushikcfd (Collaborator) commented:

> @kaushikcfd Do you have a sense how often _can_be_eagerly_computed is actually true in practice?

At least for our drivers, no reduction instructions go un-parallelized.

> This might do in the short term

Yep, for sure, this is a placeholder until we have a better approach. A common pattern in our drivers is to take the max/min of a single quantity (pressure, for example). There's no reason we should need two kernel launches for that.
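
A hedged sketch of that kind of fusion, written directly in loopy rather than as grudge's actual transformation: one kernel computes both the max and the min, so `p` is read from memory only once. The reduction is left sequential here; a real version would additionally split and tag the iname for parallel execution.

```python
import numpy as np
import pyopencl as cl
import loopy as lp

# One kernel, two reductions over the same data: the array p is traversed
# a single time instead of once per kernel launch.
knl = lp.make_kernel(
    "{[i]: 0 <= i < n}",
    """
    p_max = reduce(max, i, p[i])
    p_min = reduce(min, i, p[i])
    """,
    name="fused_minmax")

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
p = np.random.rand(10**6)

# Both written scalars come back from a single kernel invocation.
evt, (p_max, p_min) = knl(queue, p=p)
print(p_max, p_min)
```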

inducer (Owner, Author) commented Aug 27, 2021

> our drivers

Just to be clear: Which drivers do you mean by that?
