FP8 types support in NNCF graph building #3344
Conversation
Which usage scenario does this PR enable? Compression/quantization of already fp8-quantized models?
This PR enabled us to convert the DeepSeek R1 model (with original fp8 weights) via Optimum Intel. Thank you.
If we claim FP8 model support for weight compression, we should add corresponding tests, for example in tests/openvino/native/quantization/test_weights_compression.py::TestActivationWeightDtype.test_compression_for_different_dtypes. A rough sketch of such a case is given below.
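A minimal sketch of the kind of parametrized case that could be added there. The helpers `build_matmul_model_with_weight_dtype` and `get_constant_dtypes` are hypothetical stand-ins for whatever model-building and inspection utilities the test file actually provides, and the parametrization would need to follow the existing TestActivationWeightDtype structure:

```python
# Illustrative sketch only: the two helpers below are hypothetical
# placeholders, not the real utilities from test_weights_compression.py.
import pytest

import nncf
from nncf.tensor import TensorDataType


@pytest.mark.parametrize("weight_dtype", [TensorDataType.f8e4m3, TensorDataType.f8e5m2])
def test_compression_for_fp8_weight_dtypes(weight_dtype):
    # Hypothetical helper: builds a tiny OV MatMul model whose weight constant
    # is stored in `weight_dtype` (fp8) followed by a Convert node.
    model = build_matmul_model_with_weight_dtype(weight_dtype)

    compressed = nncf.compress_weights(model)

    # Hypothetical helper: collects the element types of all weight constants.
    dtypes = get_constant_dtypes(compressed)
    # After compression the fp8 weights should have been replaced by integer ones.
    assert weight_dtype not in dtypes
```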
As I understand it, only fp8 support is required for now. I would suggest not claiming nf4 support just yet, because some additional effort is needed to enable it: Tensor.reshape() still has to be implemented for nf4 NNCF Tensors in the OV backend.
FP8 compression should work after alexsu52#20
Edit: discussed in alexsu52#20
Changes
Added TensorDataType.f8e5m2, TensorDataType.f8e4m3, and TensorDataType.nf4 to TensorDataType (a sketch of the corresponding OpenVINO-type mapping follows below).
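For context, a minimal sketch of how the new dtype members could be matched against OpenVINO element types during graph building; the actual mapping used by the NNCF OpenVINO backend may live elsewhere, cover more types, and be structured differently:

```python
# Sketch only: illustrates the intended correspondence between OpenVINO
# element types and the new TensorDataType members.
import openvino as ov

from nncf.tensor import TensorDataType

OV_TO_NNCF_DTYPE = {
    ov.Type.f8e4m3: TensorDataType.f8e4m3,
    ov.Type.f8e5m2: TensorDataType.f8e5m2,
    ov.Type.nf4: TensorDataType.nf4,
    ov.Type.f32: TensorDataType.float32,
    ov.Type.f16: TensorDataType.float16,
}


def nncf_dtype_for(ov_type: ov.Type) -> TensorDataType:
    # Unknown element types fall back to float32 purely for this sketch.
    return OV_TO_NNCF_DTYPE.get(ov_type, TensorDataType.float32)
```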
Reason for changes
Support for compression of models with fp8 and nf4 weights.
Related tickets
ref: 164161
Tests
test_compare_nncf_graph_precision_synthetic_models