Add bits and sym parameters to the OV quantization config #560

echarlaix · 2024-02-14T14:18:04Z

No description provided.

HuggingFaceDocBuilderDev · 2024-02-14T15:40:19Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

optimum/intel/openvino/configuration.py

optimum/exporters/openvino/convert.py

AlexKoff88 · 2024-02-15T05:43:58Z

@echarlaix, all the CI failures stem from the fact that the GitHub version of NNCF enables weight quantization of Conv layers by default, while the officially released version does not. Realistically, a new NNCF version will be released by the end of February, but we will create a release branch this week, so we can just install NNCF from that release branch and no changes to the test references are required.

optimum/intel/openvino/quantization.py

echarlaix · 2024-02-15T15:31:09Z

optimum/intel/openvino/modeling_decoder.py

+            default_config = _check_default_4bit_configs(config)
+
+            if default_config:
+                logger.info(f"For the given model, we recommend the following `quantization_config` : {default_config}")


I don't think we should overwrite the quantization_config as it's given by the user, so here we are just adding a warning for the user, wdyt @AlexKoff88 ?

Actually, this was the idea behind the combination of load_in_4bit + quantization_config. If the latter is None, we use the pre-defined default config; otherwise we override it and let the user use a custom config.
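The two positions above are compatible, and a minimal sketch of the combined behaviour might look like the following. This is an illustration only, not the code merged in this PR: `WeightQuantConfigSketch`, `RECOMMENDED_4BIT`, `resolve_config`, and the model id `"example/model"` are all hypothetical names, and the accepted `bits` values are an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class WeightQuantConfigSketch:
    """Hypothetical stand-in for a weight quantization config (not the optimum-intel class)."""

    bits: int = 8      # number of bits used for the weights
    sym: bool = False  # symmetric (True) or asymmetric (False) quantization

    def __post_init__(self) -> None:
        # Assumption for this sketch: only 4-bit and 8-bit weights are accepted.
        if self.bits not in (4, 8):
            raise ValueError(f"`bits` must be 4 or 8, got {self.bits}")


# Hypothetical table of recommended 4-bit defaults, keyed by model id.
RECOMMENDED_4BIT: Dict[str, WeightQuantConfigSketch] = {
    "example/model": WeightQuantConfigSketch(bits=4, sym=True),
}


def resolve_config(
    model_id: str, user_config: Optional[WeightQuantConfigSketch] = None
) -> Tuple[WeightQuantConfigSketch, Optional[str]]:
    """Return the config to apply and an optional hint to log.

    A user-supplied config is never overwritten; the recommended default is
    only used when no config is given.
    """
    recommended = RECOMMENDED_4BIT.get(model_id)
    if user_config is None:
        if recommended is not None:
            return recommended, None
        return WeightQuantConfigSketch(), None
    hint = None
    if recommended is not None and recommended != user_config:
        hint = f"For the given model, we recommend the following `quantization_config` : {recommended}"
    return user_config, hint
```

With this shape, calling `resolve_config("example/model")` falls back to the recommended entry, while passing an explicit config returns that config unchanged together with a message the caller can log.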

echarlaix added 3 commits February 14, 2024 15:15

Move

bb766bf

format

c62d6ca

fix

4c98ac3

echarlaix added 3 commits February 14, 2024 17:18

add nncf version

6dd2a90

Fix config saving

050bc9f

add ov config test

efeea22

echarlaix commented Feb 14, 2024

optimum/intel/openvino/configuration.py

AlexKoff88 reviewed Feb 15, 2024

optimum/exporters/openvino/convert.py

AlexKoff88 reviewed Feb 15, 2024

optimum/intel/openvino/quantization.py Outdated

echarlaix added 5 commits February 15, 2024 12:33

remove load_in_4bit argument

1237b87

add weight only quant for int8

70468a6

fix style

f1c9d6f

add nncf check

0abed19

remove _int4_weight_only_quantization

e661d44

echarlaix marked this pull request as ready for review February 15, 2024 14:42

echarlaix requested a review from AlexKoff88 February 15, 2024 15:22

fix typo

ce304c7

echarlaix commented Feb 15, 2024

make style

6be5fa6

echarlaix changed the title ~~Quant config nncf~~ Add bits and sym parameters to the OV quantization config Feb 15, 2024

AlexKoff88 approved these changes Feb 15, 2024

echarlaix merged commit 6c8fa79 into main Feb 15, 2024
10 of 12 checks passed

echarlaix deleted the quant-config-nncf branch February 15, 2024 16:37
