NF4 per-channel support for AWQ and Scale Estimation #2898

Merged: 6 commits merged into openvinotoolkit:develop on Sep 23, 2024

Conversation

ljaljushkin
Contributor

@ljaljushkin ljaljushkin commented Aug 21, 2024

Changes

Added NF4 mode support to the Scale Estimation and AWQ algorithms.
All results below were collected with and without the Scale Estimation algorithm, and with the LoRA Correction algorithm.

image

Reason for changes

Per-channel NF4 with Scale Estimation looks promising for NPU, since its accuracy is on par with int4 group-wise quantization.

Related tickets

150560

Tests

  • OV 2024.5
    job/NNCF/job/manual/job/post_training_weight_compression/182
    image
  • OV 2024.4
    job/NNCF/job/manual/job/post_training_weight_compression/181
    image
  • OV 2024.3
    job/NNCF/job/manual/job/post_training_weight_compression/180
    image

@github-actions github-actions bot added the NNCF OpenVINO and NNCF PTQ labels Aug 21, 2024
@ljaljushkin ljaljushkin marked this pull request as ready for review August 21, 2024 17:52
@ljaljushkin ljaljushkin requested a review from a team as a code owner August 21, 2024 17:52
Collaborator

@andreyanufr andreyanufr left a comment

I think we can change code in scale_estimation to avoid usage of calculate_normalized_weight_and_fp4_scale, but it requires bigger refactoring.

@ljaljushkin ljaljushkin changed the title NF4 support for AWQ and Scale Estimation [WIP] NF4 support for AWQ and Scale Estimation Aug 23, 2024
Collaborator

@andreyanufr andreyanufr left a comment

wrong target in scale estimation with nf4

@MaximProshin
Collaborator

@ljaljushkin, can you update it?

@ljaljushkin ljaljushkin force-pushed the nl/scale_est_nf4_squash branch from 97703b8 to b6f840d Compare September 10, 2024 15:28
@ljaljushkin ljaljushkin changed the title [WIP] NF4 support for AWQ and Scale Estimation NF4 support for AWQ and Scale Estimation Sep 10, 2024
@ljaljushkin
Contributor Author

ljaljushkin commented Sep 10, 2024

wrong target in scale estimation with nf4

@andreyanufr Fixed, added test for that. Please take a look:
https://github.com/openvinotoolkit/nncf/pull/2898/files#diff-223ea638f7751f7c0c3e8f867ec9c8c132a3ccd62a9dcea2a5d158836c71c222R894

@ljaljushkin
Contributor Author

ljaljushkin commented Sep 11, 2024

The word perplexity for per-channel NF4 quantization with Scale Estimation is reproduced with the latest changes in this PR and is comparable to what I got with the same changes before merging with develop: 10.8850 vs 10.8830.
image

@MaximProshin MaximProshin changed the title NF4 support for AWQ and Scale Estimation NF4 per-channel support for AWQ and Scale Estimation Sep 12, 2024
@ljaljushkin
Contributor Author

Rebased on latest changes.
@andreyanufr @alexsu52 any blocking comments for merge?

@alexsu52 alexsu52 merged commit 05f37f5 into openvinotoolkit:develop Sep 23, 2024
13 checks passed