
Support mixed INT8 + FP16 in one model #1798

Merged: 16 commits merged into master from pt2e_fp16 on May 20, 2024

Conversation

@yiliu30 (Collaborator) commented May 17, 2024

Type of Change

feature
API changed or not

Description

Usage

# Imports assumed from the neural-compressor 3.x PT2E API:
import torch
from neural_compressor.torch.export import export
from neural_compressor.torch.quantization import StaticQuantConfig, convert, get_default_static_config, prepare

model = export(model, example_inputs=example_inputs)

quant_config = get_default_static_config()
# local override: keep nn.Linear in FP16 while the rest of the model is INT8
quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))
# prepare
prepared_model = prepare(model, quant_config)
# calibrate
for _ in range(2):
    prepared_model(*example_inputs)
# convert
converted_model = convert(prepared_model)
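
To see the effect of the mixed configuration, one option is to inspect the converted graph. This is a sketch that assumes the PT2E flow returns a torch.fx.GraphModule with "val" metadata populated on its nodes; the filter below is illustrative, not part of this PR:

import torch

# list nodes whose output metadata carries an FP16 tensor
for node in converted_model.graph.nodes:
    val = node.meta.get("val")
    if isinstance(val, torch.Tensor) and val.dtype == torch.float16:
        print(node.op, node.target)

INT8-quantized ops should remain in the graph, while the rewritten nn.Linear calls should show up in half precision.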

Expected Behavior & Potential Risk

How has this PR been tested?

Pre-CI
Some extended tests will be added later.
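
For reference, a minimal smoke test of the flow might look like the sketch below; the toy model, shapes, and assertion are illustrative assumptions, and the PR's actual tests live under test/3x/torch/:

import torch
from neural_compressor.torch.export import export
from neural_compressor.torch.quantization import StaticQuantConfig, convert, get_default_static_config, prepare

def test_mixed_int8_fp16_smoke():
    model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU(), torch.nn.Linear(4, 4))
    example_inputs = (torch.randn(1, 4),)
    exported_model = export(model, example_inputs=example_inputs)

    quant_config = get_default_static_config()
    quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

    prepared_model = prepare(exported_model, quant_config)
    prepared_model(*example_inputs)  # single calibration pass
    converted_model = convert(prepared_model)

    out = converted_model(*example_inputs)
    assert torch.isfinite(out).all()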

Dependency Change?

expecttest

yiliu30 added 11 commits May 7, 2024 17:26
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 marked this pull request as ready for review May 17, 2024 03:32

github-actions bot commented May 17, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/pt2e_quant/core.py, neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py, neural_compressor/torch/utils/utility.py, test/3x/torch/algorithms/pt2e_quant/test_half_precision_rewriter.py, test/3x/torch/quantization/test_pt2e_quant.py, test/3x/torch/requirements.txt.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

@yiliu30 yiliu30 requested a review from Kaihui-intel May 17, 2024 03:33
@yiliu30 yiliu30 added the INC3.X, PyTorch (Related to PyTorch F/W), and PT2E labels May 17, 2024
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@xin3he (Collaborator) commented May 17, 2024

Shall we add a logger.info message to tell users that IPEX doesn't support fp16?
quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

@yiliu30 (Collaborator, Author) commented May 17, 2024

Shall we add a logger.info message to tell users that IPEX doesn't support fp16? quant_config.set_local(torch.nn.Linear, StaticQuantConfig(w_dtype="fp16", act_dtype="fp16"))

Thanks for the suggestion; I'll refine this in a separate PR.
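
For illustration, the suggested guard might look something like the sketch below; the function name, backend check, and message are hypothetical, not this PR's code:

import logging

logger = logging.getLogger("neural_compressor")

def _warn_unsupported_fp16(backend: str, dtype: str) -> None:
    # hypothetical check: IPEX does not support fp16, so tell the user up front
    if backend == "ipex" and dtype == "fp16":
        logger.info("IPEX backend does not support fp16; the fp16 setting will be ignored.")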

Signed-off-by: yiliu30 <yi4.liu@intel.com>
@chensuyue chensuyue added this to the v2.6 milestone May 17, 2024
Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 merged commit fa961e1 into master May 20, 2024
30 checks passed
@yiliu30 yiliu30 deleted the pt2e_fp16 branch May 20, 2024 13:44
Labels
INC3.X, PT2E, PyTorch (Related to PyTorch F/W)

5 participants