
Gptq refactor #1770

Merged
merged 5 commits into master on May 7, 2024
Conversation

xin3he
Contributor

@xin3he xin3he commented Apr 30, 2024

Type of Change

feature
API changed or not: yes

Description

  • Reduce the additional initialization of the quantizer between prepare and convert
    Before:
    [image]
    Now:
    [image]

  • Migrate GPTQ to Torch new 3x API
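A minimal, purely illustrative sketch of the pattern the first bullet describes (this is NOT the actual neural_compressor implementation; all names below are hypothetical stand-ins): prepare() constructs the quantizer once and caches it on the model, so convert() can reuse that instance instead of re-initializing a second one.

```python
# Illustrative only: shows "create the quantizer once in prepare(),
# reuse it in convert()" -- not the real neural_compressor code.

class GPTQuantizer:
    """Hypothetical stand-in for the GPTQ quantizer."""
    instances_created = 0  # tracks how many quantizers were built

    def __init__(self, quant_config):
        GPTQuantizer.instances_created += 1
        self.quant_config = quant_config
        self.stats = {}  # state accumulated during calibration

def prepare(model, quant_config):
    # Build the quantizer once and attach it to the model.
    model._quantizer = GPTQuantizer(quant_config)
    return model

def convert(model):
    # Reuse the quantizer attached in prepare(); no second initialization.
    quantizer = model._quantizer
    quantizer.stats["converted"] = True  # a real convert() would quantize weights
    return model

class DummyModel:
    pass

m = prepare(DummyModel(), {"bits": 4})
q_model = convert(m)
assert GPTQuantizer.instances_created == 1  # only one quantizer was created
```

The point of the refactor, as the bullet states, is that the prepare/convert pair shares one quantizer instance rather than building it twice.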

GPTQ

from neural_compressor.torch.quantization import get_default_gptq_config, prepare, convert

quant_config = get_default_gptq_config()
model = prepare(model, quant_config)  # insert calibration hooks
run_fn(model)                         # feed calibration data through the model
q_model = convert(model)              # produce the quantized model
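The run_fn above is supplied by the caller: it simply feeds calibration data through the prepared model so GPTQ can collect statistics before convert() runs. A hedged sketch of what such a function might look like (make_run_fn and RecordingModel are hypothetical illustrations, not neural_compressor APIs):

```python
# Hypothetical calibration run_fn: forward passes only, no training step.

def make_run_fn(dataloader):
    def run_fn(model):
        for batch in dataloader:
            model(batch)  # each forward pass lets the hooks observe activations
    return run_fn

# Minimal stand-in model that just counts the batches it saw.
class RecordingModel:
    def __init__(self):
        self.calls = 0

    def __call__(self, batch):
        self.calls += 1

calib_data = [[1, 2], [3, 4], [5, 6]]
model = RecordingModel()
make_run_fn(calib_data)(model)
assert model.calls == 3  # one forward pass per calibration batch
```

In real use the dataloader would yield torch tensors and the model would be the one returned by prepare(); the structure of the loop is the same.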

How has this PR been tested?

Pre-CI

@xin3he xin3he requested review from Kaihui-intel, yuwenzho and yiliu30 and removed request for yuwenzho April 30, 2024 12:01

github-actions bot commented Apr 30, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/quantization/algorithm_entry.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/quantization/algorithm_entry.py.

🟢 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch success
UT-3x-Torch (Coverage Compare CollectDatafiles) success
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/torch/algorithms/weight_only/gptq.py, neural_compressor/torch/quantization/algorithm_entry.py, test/3x/torch/quantization/weight_only/test_gptq.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds for the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

@xin3he
Contributor Author

xin3he commented May 6, 2024

Final decision:
[image]

@yiliu30
Contributor

yiliu30 commented May 6, 2024

One more thing: I suggest merging it into master directly.

xin3he added 3 commits May 6, 2024 16:37
Signed-off-by: xin3he <xin3.he@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
@xin3he xin3he changed the base branch from yuwenzho/refactor_rtn_hqq_awq to master May 6, 2024 09:34
xin3he added 2 commits May 6, 2024 17:39
Signed-off-by: xin3he <xin3.he@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
@xin3he xin3he merged commit 84d7055 into master May 7, 2024
30 checks passed
@xin3he xin3he deleted the gptq_refactor branch May 7, 2024 08:08