-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Abstracting away the gpu #46
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
Could you rebase to master? There was a small change in docs/requirements.txt. |
7c8eae8
to
8ae14b8
Compare
Thanks a lot for your work! I'll look at the code soon. Responses to your questions:
|
class PerseusOptimizer(Callback):
Here, is device_id reindexed to respect CUDA_VISIBLE_DEVICES? for example, if there are four GPUs on system and CUDA_VISIBLE_DEVICES="0,2", to access GPU with CUDA_VISIBLE_DEVICE "2", does it reindex it to be 1 or keep it as 2? Because ZeusMonitor reindexes, but from the comment above I wasn't sure ^ after doing some research, I assume its reindexed. Finished optimizer.py with this assumption. |
Ready for review! Completed tasks: ✅ remove cuda_sync and setDevice from gpu, put it back where it belongs ✅ Fixed set Persistence Mode, added TODO comment back to where set Persistence mode SYS_admin check was ✅ Fixed set PowerManagementLimit - originally broke API consistency. Added resetPowerManagementLimit function. ✅ Fixed names of exceptions BIG fixes: STYLE Fixes: ✅ made sure signature of init all match Questions and notes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm outside so I just added two comments. I'll continue.
Exception names look good! But is there a need to export all of the exceptions in |
So initially I was only exporting the exceptions used in the source code. But you said not to selectively export some of them, so I added all of them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small change suggestions. After they're all addressed (or accepted), I think we'd be good to merge without further changes. Thanks again for the great work. Now it's so much easier to interface with the GPU!!
Co-authored-by: Jae-Won Chung <jwnchung@umich.edu>
Co-authored-by: Jae-Won Chung <jwnchung@umich.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's fix the doc error on master. LGTM, thanks!!
PR is mostly done, few notes:
getTotalEnergyConsumption
should be lowercase". We decided on camel case to match pynvml methods, so what should be done to resolve this?