fix INT8 prepare function #389

pacman100 · 2023-05-02T07:25:35Z

What does this PR do?

Fixes the following issues with INT8 training:
a. Making modules trainable via modules_to_save resulted in the loss being NaN or 0, because those params are in fp16/bf16, which is unstable for training. The trainable LoRA params, by comparison, are in FP32.
b. Half and float mismatches: certain layernorm layers were being converted to FP32, and manually casting other trainable modules to FP32 led to mismatches, because most frozen layers were in half precision and a few were in full precision.
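
To illustrate why trainable params kept in fp16 can misbehave (point a), here is a minimal sketch using only the Python standard library. It is not code from this PR: the helper names `round_to` and `apply_updates`, the starting weight, and the update size are all assumptions chosen for illustration. It shows one contributing mechanism, small updates being rounded away in half precision, and does not claim to reproduce the NaN/0 loss reported above.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round a Python float to a narrower IEEE format.

    'e' is half precision (fp16), 'f' is single precision (fp32).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def apply_updates(fmt: str, weight: float, update: float, steps: int) -> float:
    """Add `update` to `weight` `steps` times, storing the result in `fmt` each step."""
    w = round_to(fmt, weight)
    for _ in range(steps):
        w = round_to(fmt, w + update)
    return w


# Hypothetical numbers: a weight near 1.0 and a per-step update of 1e-4.
# Near 1.0 the gap between adjacent fp16 values is 2**-10 (about 9.8e-4),
# so an update of 1e-4 is smaller than half that gap and is rounded away.
w_fp16 = apply_updates("e", 1.0, 1e-4, 1000)
w_fp32 = apply_updates("f", 1.0, 1e-4, 1000)

print("fp16 weight after 1000 updates:", w_fp16)
print("fp32 weight after 1000 updates:", w_fp32)
```

In this sketch the fp16 weight never moves, while the fp32 weight accumulates the updates, which is consistent with keeping trainable params in FP32 as the LoRA params already are.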

HuggingFaceDocBuilderDev · 2023-05-02T07:32:05Z

The documentation is not available anymore as the PR was closed or merged.

younesbelkada

Thanks a lot for fixing! Can you confirm the int8 examples slow tests pass with these changes?

review-notebook-app · 2023-05-03T06:29:39Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

pacman100 · 2023-05-03T07:06:27Z

All the tests (single GPU, multi GPU, core, and common) pass.

* fix INT8 prepare function
* remove unused function args
* fix related tests, examples and docs

fix INT8 prepare function

9035353

pacman100 requested a review from younesbelkada May 2, 2023 07:25

remove unused function args

8959735

younesbelkada approved these changes May 2, 2023

fix related tests, examples and docs

f701736

pacman100 merged commit 1a1cfe3 into main May 3, 2023

pacman100 deleted the smangrul/fix-int8-prepare branch May 3, 2023 19:38

rationalism mentioned this pull request May 9, 2023

PR #389 greatly slows 8-bit LoRA training (via bitsandbytes) #422

Closed

rationalism mentioned this pull request Aug 5, 2023

PR #389 breaks Flash Attention 2 with peft #790

Closed

4 tasks

Birch-san mentioned this pull request Aug 9, 2023

Question: why does prepare_model_for_kbit_training cast input **and** output embeddings to float32? #816

Closed

4 tasks

pacman100 mentioned this pull request Aug 16, 2023

add torch_dtype control for prepare_model_for_kbit_training #828

Closed

Guy-Bilitski pushed a commit to Guy-Bilitski/peft that referenced this pull request May 13, 2025

fix INT8 prepare function (huggingface#389)

22a25ef

* fix INT8 prepare function
* remove unused function args
* fix related tests, examples and docs

cyyever pushed a commit to cyyever/peft that referenced this pull request Sep 4, 2025

fix typo in ppo_trainer.py (huggingface#389)

86c1174

`dataloader must be a torch.utils.data.Dataset`: `dataloader` should be `dataset`
