
Support mixed INT8 + FP16 in one model #1798

Merged: 16 commits merged into master from pt2e_fp16 on May 20, 2024

Conversation

@yiliu30 (Collaborator) commented May 17, 2024

Type of Change

feature
API changed or not

Description

Usage

# Imports assumed from the neural-compressor 3.x PT2E API:
import torch
from neural_compressor.torch.export import export
from neural_compressor.torch.quantization import StaticQuantConfig, convert, get_default_static_config, prepare

model = export(model, example_inputs=example_inputs)

quant_config = get_default_static_config()
# local override: keep nn.Linear in FP16 while the rest of the model is INT8
quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))
# prepare
prepared_model = prepare(model, quant_config)
# calibrate
for _ in range(2):
    prepared_model(*example_inputs)
# convert
converted_model = convert(prepared_model)
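
To see the effect of the mixed configuration, one option is to inspect the converted graph. This is a sketch that assumes the PT2E flow returns a torch.fx.GraphModule with "val" metadata populated on its nodes; the filter below is illustrative, not part of this PR:

import torch

# list nodes whose output metadata carries an FP16 tensor
for node in converted_model.graph.nodes:
    val = node.meta.get("val")
    if isinstance(val, torch.Tensor) and val.dtype == torch.float16:
        print(node.op, node.target)

INT8-quantized ops should remain in the graph, while the rewritten nn.Linear calls should show up in half precision.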

Expected Behavior & Potential Risk

How has this PR been tested?

Pre-CI
Some extended tests will be added later.
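
For reference, a minimal smoke test of the flow might look like the sketch below; the toy model, shapes, and assertion are illustrative assumptions, and the PR's actual tests live under test/3x/torch/:

import torch
from neural_compressor.torch.export import export
from neural_compressor.torch.quantization import StaticQuantConfig, convert, get_default_static_config, prepare

def test_mixed_int8_fp16_smoke():
    model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU(), torch.nn.Linear(4, 4))
    example_inputs = (torch.randn(1, 4),)
    exported_model = export(model, example_inputs=example_inputs)

    quant_config = get_default_static_config()
    quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

    prepared_model = prepare(exported_model, quant_config)
    prepared_model(*example_inputs)  # single calibration pass
    converted_model = convert(prepared_model)

    out = converted_model(*example_inputs)
    assert torch.isfinite(out).all()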

Dependency Change?

expecttest

yiliu30 added 11 commits May 7, 2024 17:26
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 marked this pull request as ready for review May 17, 2024 03:32

github-actions bot commented May 17, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py, test/3x/torch/algorithms/pt2e_quant/test_half_precision_rewriter.py, test/3x/torch/quantization/test_pt2e_quant.py, test/3x/torch/requirements.txt.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

@yiliu30 yiliu30 requested a review from Kaihui-intel May 17, 2024 03:33
@yiliu30 yiliu30 added the INC3.X, PyTorch (Related to PyTorch F/W), and PT2E labels May 17, 2024
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@xin3he (Collaborator) commented May 17, 2024

Shall we add a logger.info message to tell users that IPEX doesn't support fp16?
quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

@yiliu30 (Collaborator, Author) commented May 17, 2024

Shall we add a logger.info message to tell users that IPEX doesn't support fp16? quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

Thanks for the suggestion; I'll refine this in a separate PR.
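
For illustration, the suggested guard might look something like the sketch below; the function name, backend check, and message are hypothetical, not this PR's code:

import logging

logger = logging.getLogger("neural_compressor")

def _warn_unsupported_fp16(backend: str, dtype: str) -> None:
    # hypothetical check: IPEX does not support fp16, so tell the user up front
    if backend == "ipex" and dtype == "fp16":
        logger.info("IPEX backend does not support fp16; the fp16 setting will be ignored.")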

Signed-off-by: yiliu30 <yi4.liu@intel.com>
@chensuyue chensuyue added this to the v2.6 milestone May 17, 2024
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 merged commit fa961e1 into master May 20, 2024
30 checks passed
@yiliu30 yiliu30 deleted the pt2e_fp16 branch May 20, 2024 13:44
Labels
INC3.X, PT2E, PyTorch (Related to PyTorch F/W)

5 participants