Should global reductions compile internally? #274

Open
majosm opened this issue Jul 7, 2022 · 1 comment

Comments

@majosm
Collaborator

majosm commented Jul 7, 2022

Global reductions such as nodal_min must compute and evaluate the local reduction before passing the result to MPI. The code for these local reductions is not compiled, so it seems they would incur the cost of generating the kernel every time the reduction is called. Should they instead have a memoized compile on the inside, similar to compiled_lsrk45_step?

(Not sure if this has any real performance impact, I just noticed it while discussing with @MTCam and thought it was worth mentioning.)
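
To make the question concrete, here is a toy sketch of what a memoized compile inside a local reduction might look like. Everything in it is a hypothetical stand-in: `ToyArrayContext`, its `compile` method, and both `local_min_*` functions are invented for illustration and are not the project's actual API. The counter only models "kernel generation happened".

```python
class ToyArrayContext:
    """Hypothetical stand-in for an array context with a costly compile step."""

    def __init__(self):
        self.kernels_generated = 0

    def compile(self, func):
        # Pretend that kernel generation is the expensive part.
        self.kernels_generated += 1
        return func


def local_min_uncompiled(actx, values):
    # Models the current behavior: the kernel is regenerated on every call.
    kernel = actx.compile(min)
    return kernel(values)


# Cache of compiled reductions, keyed by array context (assumed hashable).
_compiled_min_cache = {}


def local_min_memoized(actx, values):
    # Generate the kernel once per array context, then reuse it.
    if actx not in _compiled_min_cache:
        _compiled_min_cache[actx] = actx.compile(min)
    return _compiled_min_cache[actx](values)
```

With this sketch, calling `local_min_uncompiled` three times bumps `kernels_generated` to 3, while three calls to `local_min_memoized` on a fresh context leave it at 1.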

@inducer
Owner

inducer commented Jul 10, 2022

This is a good question. There certainly is a performance impact, so we definitely want to move away from the current approach of using freeze/thaw. That said, there is currently no code to handle reductions as far as DAG transformation is concerned. Two specific pieces are missing:

  • We likely want to expose distributed-memory (effectively MPI) reduce/allreduce as a DAG node. We could roll our own using point-to-point and a tree to avoid the need for a new node type, but I think that's not a good idea.
  • Single-GPU global reductions need a transform path. Loopy can do those transformations; we just need to make sure they happen.
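
For reference, the "roll our own using point-to-point and a tree" alternative from the first bullet can be sketched as below. This is a pure-Python simulation under stated assumptions: a list index stands in for an MPI rank, and each in-place combine stands in for one send/receive pair. `tree_reduce` and `tree_allreduce` are hypothetical names, and no real MPI calls are made.

```python
def tree_reduce(local_values, op):
    """Simulate a binary-tree reduction; the result ends up on rank 0."""
    values = list(local_values)  # values[r] is the value held by simulated rank r
    n = len(values)
    stride = 1
    while stride < n:
        for receiver in range(0, n, 2 * stride):
            sender = receiver + stride
            if sender < n:
                # In real MPI: `sender` sends, `receiver` receives and combines.
                values[receiver] = op(values[receiver], values[sender])
        stride *= 2
    return values[0]


def tree_allreduce(local_values, op):
    """Reduce to rank 0, then hand every simulated rank the same result."""
    result = tree_reduce(local_values, op)
    return [result] * len(local_values)
```

The reduction takes about log2(n) rounds for n ranks, but every round would have to be expressed as separate point-to-point nodes in the DAG, which is part of why a dedicated reduce/allreduce node looks preferable.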

I don't think compiling internally is a good idea, as it would effectively cement the notion that reductions are evaluated eagerly. That would preclude incorporating reductions into larger DAGs, and I think being able to do so is actually desirable.
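
A toy illustration of the eager-versus-lazy distinction, assuming a made-up miniature expression DAG (`Placeholder`, `Constant`, `Min`, `Scale`, `evaluate`, and `contains_reduction` are all hypothetical and unrelated to the real DAG machinery):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placeholder:
    name: str  # input bound at evaluation time


@dataclass(frozen=True)
class Constant:
    value: float  # already-computed number


@dataclass(frozen=True)
class Min:
    arg: object  # reduction kept as a node in the DAG


@dataclass(frozen=True)
class Scale:
    factor: float
    arg: object


def evaluate(expr, env):
    """Walk the DAG and compute its value."""
    if isinstance(expr, Placeholder):
        return env[expr.name]
    if isinstance(expr, Constant):
        return expr.value
    if isinstance(expr, Min):
        return min(evaluate(expr.arg, env))
    if isinstance(expr, Scale):
        return expr.factor * evaluate(expr.arg, env)
    raise TypeError(f"unknown node: {expr!r}")


def contains_reduction(expr):
    """True if a transformation pass could still see the reduction."""
    if isinstance(expr, Min):
        return True
    if isinstance(expr, Scale):
        return contains_reduction(expr.arg)
    return False


h = [4.0, 2.0, 6.0]
# Lazy: the reduction is part of the larger DAG and can be transformed with it.
lazy_expr = Scale(0.5, Min(Placeholder("h")))
# Eager: the reduction was evaluated up front, so the DAG only sees a number.
eager_expr = Scale(0.5, Constant(min(h)))
```

Both expressions evaluate to the same number, but only `lazy_expr` still contains the reduction as a node, so only there could a transformation pass fuse or reschedule it together with the surrounding computation.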
