
Conversation

pacman100
Contributor

What does this PR do?

  1. Fixes the following issues with INT8 training:
    a. When modules are made trainable via `modules_to_save`, the loss was NaN or 0 because those params were in fp16/bf16, which is unstable for training, whereas the trainable LoRA params are in FP32 (see the sketch after this list).
    b. Half/float dtype mismatches: certain LayerNorm layers were converted to FP32, and manually casting other trainable modules to FP32 left the model with most frozen layers in half precision and a few layers in full precision.
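
The core idea behind fix (a) can be illustrated with a short, hedged sketch: upcast every parameter that will actually be trained (LoRA weights as well as `modules_to_save` params) to FP32, while leaving frozen int8/fp16 weights untouched. The function name below is illustrative and not PEFT's exact implementation.

```python
# Minimal sketch, not PEFT's exact code: upcast trainable params to FP32 so that
# training them is numerically stable and all trainable tensors share one dtype.
import torch
import torch.nn as nn

def upcast_trainable_params_to_fp32(model: nn.Module) -> nn.Module:
    for param in model.parameters():
        # Frozen (int8/fp16) weights are left as-is; only params that will receive
        # gradients (e.g. LoRA weights, modules_to_save) are cast to full precision.
        if param.requires_grad and param.dtype in (torch.float16, torch.bfloat16):
            param.data = param.data.to(torch.float32)
    return model
```

Keeping every trainable parameter in a single full-precision dtype also avoids the half/float mismatches described in (b).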

@pacman100 pacman100 requested a review from younesbelkada May 2, 2023 07:25
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 2, 2023

The documentation is not available anymore as the PR was closed or merged.

Contributor

@younesbelkada younesbelkada left a comment


Thanks a lot for fixing! Can you confirm the int8 examples slow tests pass with these changes?

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



@pacman100
Contributor Author

All the tests (single GPU, multi GPU, core, and common) pass.

@pacman100 pacman100 merged commit 1a1cfe3 into main May 3, 2023
@pacman100 pacman100 deleted the smangrul/fix-int8-prepare branch May 3, 2023 19:38
Guy-Bilitski pushed a commit to Guy-Bilitski/peft that referenced this pull request May 13, 2025
* fix INT8 prepare function

* remove unused function args

* fix related tests, examples and docs
cyyever pushed a commit to cyyever/peft that referenced this pull request Sep 4, 2025
`dataloader must be a torch.utils.data.Dataset`: `dataloader` should be `dataset`