Update requirements, restructure files and fix formatting for VAE example #3046
Conversation
I see the linting issues. Will get them fixed.
Thanks @canyon289
This looks great, thanks!
Some minor comments inline.
# Make sure tf does not allocate gpu memory.
tf.config.experimental.set_visible_devices([], 'GPU')
train.train_and_evaluate(FLAGS)
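(For context: hiding the GPUs from TensorFlow is a common pattern in Flax examples; it prevents the tf.data input pipeline from reserving device memory that JAX itself needs.)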
nit: we usually have two empty lines before this block.
Fixed
What I meant: we usually have single empty lines within functions (lines 55-56, 59-60 should both be a single line), but we have double empty lines between "top-level code blocks" (line 65 should be two empty lines).
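A minimal sketch of the spacing convention being described, using toy functions rather than this PR's actual code: single empty lines within a function, two empty lines between top-level definitions.

def add(a, b):
  # A single empty line separates logical steps inside a function.
  total = a + b

  return total


def multiply(a, b):
  # Two empty lines separate top-level definitions.
  return a * b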
It's fine to clean up the README in a second step, but could you already update it to run …
Tested that everything installs and runs (with expected loss) in a fresh Colab CPU runtime:
Thanks @andsteing for all the comments. I'll address them all.
Codecov Report

@@           Coverage Diff           @@
##             main    #3046   +/-   ##
=======================================
  Coverage   81.97%   81.97%
=======================================
  Files          55       55
  Lines        6031     6031
=======================================
  Hits         4944     4944
  Misses       1087     1087
return VAE(latents=FLAGS.latents)


@jax.jit
A question for understanding: should the train step be jitted? I looked at other examples and that didn't seem to be the case.
https://github.com/google/flax/blob/main/examples/wmt/train.py#L166
it's jitted here:

Line 528 in df66c81:
p_train_step = jax.pmap(

(the pmap() transform also compiles the code like jit(), but at the same time parallelizes it onto multiple devices)
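For context, a minimal sketch contrasting the two transforms (toy update function, not the WMT example's actual train step):

import jax
import jax.numpy as jnp

def sgd_step(params, grads):
  # A plain SGD update; a pure function, so it can be transformed.
  return params - 0.1 * grads

jitted_step = jax.jit(sgd_step)  # compiles for a single device

# pmap also compiles, but additionally replicates the computation across
# all local devices; inputs need a leading axis of that size.
p_step = jax.pmap(sgd_step)

n = jax.local_device_count()
out = p_step(jnp.ones((n, 4)), jnp.ones((n, 4)))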
should the train step be jitted?

To answer your question: yes, you should always compile the largest possible code block; usually that's the train_step().
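A hedged sketch of what that looks like in practice, with a toy quadratic loss and SGD update standing in for this example's real step: the loss, gradient, and parameter update all live inside one jitted function, so XLA can compile and fuse them together instead of dispatching each piece separately.

import jax
import jax.numpy as jnp

@jax.jit
def train_step(params, batch):
  # Everything in here becomes a single compiled XLA computation.
  def loss_fn(p):
    return jnp.mean((p - batch) ** 2)
  grads = jax.grad(loss_fn)(params)
  return params - 0.1 * grads

params = train_step(jnp.zeros(4), jnp.ones(4))  # first call compiles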
Ah, I missed that; I was looking for the decorator. Thank you @andsteing
Do let me know if there's anything I missed. Happy to keep working on this.
I'm still committed to finishing this!
@canyon289 if you would like reviewers to have another look at the PR, you can click on the re-request review button (otherwise busy reviewers might not read all the individual updates and miss the PR until that button is pressed).
Looks good from my side.
(one open comment about empty lines, otherwise good to submit)
You're right, updated! Thank you
I don't have the rights in this repo to re-request review from pending reviewers for some reason. I wouldn't mind another review if someone wants to, but I don't want to obligate anybody. Happy to have this merged and keep moving on from there.
Squashed to a single commit and force pushed.
@marcvanzee and @levskaya PTAL when you have time please.
Oh, that's good to know! I must have missed your message previously. Since nobody else commented on the PR, let's move forward.
What does this PR do?
Addressing a portion of the issues in #573
Putting up for review early to ensure I'm trending in the right direction and to learn what else would be needed for a merge.
Fixes
I can fix up the Colab, README, clu, model configs, and add tests in separate PRs if you don't mind.