[Single File] Add single file support for Flux Transformer #9083

Merged
merged 5 commits into from
Aug 7, 2024

Collaborator

@DN6 DN6 commented Aug 5, 2024

What does this PR do?

Add single file support for the Flux Transformer model.

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

DN6 added 3 commits August 2, 2024 12:53
@DN6 DN6 changed the title from "[Single File] Add single file support for Flux" to "[Single File] Add single file support for Flux Transformer" on Aug 5, 2024
Comment on lines +1877 to +1878
mlp_ratio = 4.0
inner_dim = 3072
Collaborator Author

It should be okay to hardcode these, no? They are the same across both models. We could grab inner_dim from the checkpoint, but I'm not sure about mlp_ratio.

Collaborator Author

cc: @yiyixuxu

Member

I think it's okay to hardcode them. I would just define them as proper constants at the top of the file.

Collaborator Author

But if they're only applied within the scope of this function, they don't need to exist as constants that can be accessed globally?
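The trade-off in this thread can be sketched roughly as follows. This is not the code from the PR: the function name `infer_dims`, the two checkpoint key names, and the example shapes are all assumptions made for illustration. The idea is that `inner_dim` can be read off a projection weight's shape, and `mlp_ratio` could in principle be derived as the MLP hidden size divided by `inner_dim`, with the hardcoded values kept as fallbacks.

```python
# Hypothetical sketch; names and key layout are assumptions, not the PR's code.
DEFAULT_MLP_RATIO = 4.0   # hardcoded value discussed in the thread
DEFAULT_INNER_DIM = 3072  # hardcoded value discussed in the thread

# Assumed checkpoint keys (illustrative only).
INPUT_PROJ_KEY = "img_in.weight"                      # assumed shape: (inner_dim, in_channels)
MLP_UP_PROJ_KEY = "double_blocks.0.img_mlp.0.weight"  # assumed shape: (mlp_hidden, inner_dim)


def infer_dims(shapes):
    """Return (inner_dim, mlp_ratio) from a mapping of key -> weight shape.

    Falls back to the hardcoded defaults when a key is missing.
    """
    inner_dim = DEFAULT_INNER_DIM
    mlp_ratio = DEFAULT_MLP_RATIO
    if INPUT_PROJ_KEY in shapes:
        inner_dim = shapes[INPUT_PROJ_KEY][0]
    if MLP_UP_PROJ_KEY in shapes:
        mlp_hidden = shapes[MLP_UP_PROJ_KEY][0]
        mlp_ratio = mlp_hidden / inner_dim
    return inner_dim, mlp_ratio


# Illustrative shapes, not read from a real checkpoint.
example_shapes = {
    "img_in.weight": (3072, 64),
    "double_blocks.0.img_mlp.0.weight": (12288, 3072),
}
print(infer_dims(example_shapes))
```

Whether the defaults live at module level or inside the conversion function is the style question raised above; the sketch uses module-level constants only so that the fallback is visible in one place.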

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

mlp_ratio = 4.0
inner_dim = 3072

# in SD3 original implementation of AdaLayerNormContinuous, it split linear projection output into shift, scale;
Member

We can leave this outside of the function, no? SD3 also uses it.
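A minimal sketch of what such a shared, module-level helper might look like, assuming the target layout expects the two halves in the opposite order (scale, then shift). The real conversion code operates on tensors; this version uses a plain Python list of rows so that it runs without extra dependencies, and the function name is illustrative.

```python
def swap_scale_shift(rows):
    """Swap the two halves of a weight along dim 0: [shift, scale] -> [scale, shift].

    `rows` is a list standing in for a weight split along its first dimension.
    """
    if len(rows) % 2 != 0:
        raise ValueError("expected an even number of rows to split into two halves")
    half = len(rows) // 2
    shift, scale = rows[:half], rows[half:]
    # Concatenate with the halves in swapped order.
    return scale + shift


# The first half (shift) ends up after the second half (scale).
print(swap_scale_shift(["shift_0", "shift_1", "scale_0", "scale_1"]))
```

Defining it once at module level would let both the SD3 and Flux conversion functions call the same helper.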

Member

@sayakpaul sayakpaul left a comment

LGTM!

Two things:

  • Let's add it to the list of models that support single file loading.
  • Let's see if FP8 support works? If so, let's attach a code snippet?

DN6 and others added 2 commits August 6, 2024 10:25
@sayakpaul
Member

@DN6 let's merge this after the TODOs?

@DN6 DN6 merged commit e1b603d into main Aug 7, 2024
18 checks passed
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
3 participants