
[TorchFX] Multi-quantizers conformance check #38

Closed
wants to merge 16 commits

Conversation

daniil-lyakhov
### Changes

Enable mypy for nncf/common/pruning
AlexanderDokuchaev and others added 12 commits January 22, 2025 13:10
Disable pip cache to avoid issues with cache
### Changes

- Remove levit model_128
- Fix: do not catch exceptions without a message
 
### Reason for changes

```
  File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/_builder.py", line 393, in build_model_with_cfg
    load_pretrained(
  File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/_builder.py", line 193, in load_pretrained
    state_dict = filter_fn(state_dict, model)
  File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/levit.py", line 707, in checkpoint_filter_fn
    assert 'head' in ka or 'stem.conv1.linear' in ka
AssertionError

```
### Changes

- black - 24.10.0
- isort - 5.13.0
- ruff - v0.9.2
- markdownlint - 0.43.0
### Changes

Include SDPA metatype in scales unification map for all MinMax backends

### Reason for changes

Scales were not being unified in quantizers inserted for such a
subgraph:

```
x  y
\  |
 concat   Q   V
     |   /  /
      SDPA  
```
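The unification requirement can be illustrated with a minimal numpy sketch (the helper names below are illustrative, not NNCF API): when two tensors feed a concat, their fake-quantizers must share a single scale so that the concatenated tensor lies on one uniform quantization grid.

```python
import numpy as np

def compute_scale(t, num_bits=8):
    # Symmetric per-tensor scale for a signed integer range.
    return np.max(np.abs(t)) / (2 ** (num_bits - 1) - 1)

def fake_quantize(t, scale, num_bits=8):
    q_min, q_max = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return np.clip(np.round(t / scale), q_min, q_max) * scale

x = np.array([0.5, -0.25], dtype=np.float32)
y = np.array([2.0, -1.5], dtype=np.float32)

# Unified scale: both concat inputs share one scale (here, the larger of
# the two per-tensor scales), so the concatenated result is quantized
# onto a single uniform grid instead of two incompatible ones.
unified_scale = max(compute_scale(x), compute_scale(y))
concat = np.concatenate([fake_quantize(x, unified_scale),
                         fake_quantize(y, unified_scale)])
```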

### Tests

A template test was created at
`tests/cross_fw/test_templates/test_unified_scales.py`.
The template test uses a synthetic SDPA model with a concat operation.
It then calls the `_find_quantization_target_points` method of the
MinMaxQuantization algorithm to obtain the unified scale groups, which
are then used for assertions.
### Changes

Run tests from the precommit test scope in the nightly trigger

- pt2: use fixed parameters for `.T`; `_saved_dim0/1` depends on the OS,
but the actual values are always -1, -2
- onnx: build the model only inside the test function, not during
parameter collection, to use memory more effectively
- common: use `1.2` s for the test timer to be sure that `00:00:01` is
caught

NOTE: the pt test fails on parallel runs when building the quantization
extension.
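The `1.2` s choice can be checked with a quick sketch (the formatting helper below is hypothetical): a duration strictly above one second always renders as `00:00:01` when truncated to whole seconds, whereas a sleep of exactly `1.0` s can measure marginally short and miss the tick.

```python
import time

def format_elapsed(seconds):
    # Truncate to whole seconds, as an HH:MM:SS test timer typically does.
    return time.strftime("%H:%M:%S", time.gmtime(seconds))

print(format_elapsed(1.2))    # always 00:00:01
print(format_elapsed(0.999))  # 00:00:00 -- why exactly 1.0 s is fragile
```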

### Tests


https://github.com/openvinotoolkit/nncf/actions/runs/12835288745/job/35794407928?pr=3194
### Changes

Add quantization_aware_training_tensorflow_mobilenet_v2 to test scope
…lkit#3207)

### Changes

Changes the model from using the same input tensor for Q, K, and V to
using a different input for each

### Reason for changes

The former model was causing an error with OpenVINO
…olkit#2727)

### Changes

- Implemented OpenVINO model graphs which are used for calculation of
compressed and decompressed weights. Since these models are compiled,
calculation becomes significantly faster, especially for larger models
and int4 compression.
- This functionality is exposed by two methods at `weight_lowering.py`:
  - `do_int_quantization()` is used for computing a compressed weight.
  Possible signatures:
    - `weight` -> `compressed_weight`, `scale`, (`zero_point` for
    asymmetric compression)
    - `weight`, `scale`, (`zero_point`) -> `compressed_weight`, `scale`,
    (`zero_point`)
  - `calculate_quantized_dequantized_weight()` is used for computing a
  decompressed weight. Possible signatures:
    - `weight` -> `decompressed_weight`
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`
    - `weight` -> `decompressed_weight`, `compressed_weight`, `scale`,
    (`zero_point`)
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`,
    `compressed_weight`, `scale`, (`zero_point`)
  - The output `scale` and `zero_point` are the same as the ones given as
  input (if they were given at all).
  - Computation is done via OV models only if the openvino package is
  installed and the input tensors are not torch tensors.
- Introduced a new NNCF Tensor backend for storing instances of
`openvino.Tensor`. The implementation for this backend is limited to only
the required functionality; e.g., addition of OV Tensors is not supported
because it is not needed.
  - The introduction of OV Tensors is required for seamless handling of
  tensors in `bf16`, `u4` and `i4` data types. For example, `bf16`
  constants are read from an OpenVINO LLM and given as inputs to a
  compressing OpenVINO model, and `u4` and `i4` compressed weights are
  seamlessly inserted into the resulting compressed OpenVINO model.
  - Added an `as_numpy_tensor()` method to convert an NNCF Tensor to the
  numpy backend. Currently only OV -> NP conversion is required.
- All calculations are aligned with the reference numpy implementation.
Some performance and memory sacrifices had to be made for such alignment.
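As a reference point for the signatures listed above, here is a minimal numpy sketch of symmetric integer weight quantization (illustrative only; the actual `do_int_quantization()` runs compiled OpenVINO models and also covers the asymmetric `zero_point` case):

```python
import numpy as np

def do_int_quantization_ref(weight, scale=None, num_bits=8):
    # weight [, scale] -> compressed_weight, scale (symmetric case, so no
    # zero_point). A precomputed scale is reused and returned unchanged.
    level_high = 2 ** (num_bits - 1) - 1
    if scale is None:
        scale = np.max(np.abs(weight), axis=-1, keepdims=True) / level_high
    compressed = np.clip(np.round(weight / scale), -level_high - 1, level_high)
    return compressed.astype(np.int8), scale

def calculate_quantized_dequantized_weight_ref(weight, scale=None):
    # weight [, scale] -> decompressed_weight
    compressed, scale = do_int_quantization_ref(weight, scale)
    return compressed.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 8)).astype(np.float32)
decompressed = calculate_quantized_dequantized_weight_ref(w)
```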

Data-free asymmetric compression:

![image](https://github.com/user-attachments/assets/efd76b2f-1a3e-4037-8165-0bd5812de94d)

Data-free symmetric compression:

![image](https://github.com/user-attachments/assets/c61b98c6-cc96-4125-b21e-90c7d0827e22)

Data-aware compression:

![image](https://github.com/user-attachments/assets/b9823594-9915-4ca5-9e50-7bffa6777104)


### Reason for changes

Reducing model compression time. Only OpenVINO model compression backend
is affected.

### Related tickets

139047

### Tests

- `tests/openvino/native/quantization/test_ov_modeling_compression.py::test_quantization_alignment`
-- checks alignment with the reference numpy implementation
- `tests/openvino/native/test_openvino_modeling.py` -- checks OV
modeling framework hyperparameters
- `tests/openvino/native/test_tensor.py` -- NNCF OV Tensor backend tests

Validation jobs:
- `NNCF/job/manual/job/post_training_weight_compression/299/`
- `NNCF/job/nightly/job/test_examples/650`
- OVVP validation ✅
- optimum-intel test job
https://github.com/huggingface/optimum-intel/actions/runs/12912964434/job/36009036879?pr=734
### Changes

- Added `nf4` precision for OV `GraphConverter`.

### Reason for changes

- `nf4` precision support.

### Related tickets

- 153357

### Tests

- Added
### Changes

Follow up to openvinotoolkit#2727

1. Do not use `infer_request.results`
2. Replace `>=` with `opset.greater_equal()`
3. Rename `ov_numeric.py` to `openvino_numeric.py`

### Reason for changes

1. Improve int4 compression time by up to ~10%
2. Avoid warning: `DeprecationWarning: greater_equal is deprecated and
will be removed in version 2025.3. Use ops.greater_equal instead`
3. Fix onnx install test

### Related tickets

139047

### Tests

- https://github.com/openvinotoolkit/nncf/actions/runs/12947249537
- NNCF/job/manual/job/post_training_weight_compression/301/
- NNCF/job/nightly/job/test_examples/653/
### Changes

- As stated in the title

### Reason for changes

- Upcoming release

### Related tickets

- 161230

### Tests

- N/A
### Changes

- Skip `test_non_convertable_division` on macOS
- Cut values to 1e-10; more digits after zero don't change the result
- Use a one-line `pip install` in GHA

### Reason for changes


https://github.com/openvinotoolkit/nncf/actions/runs/12969549430/job/36208496733

```
FAILED tests/openvino/native/test_node_utils.py::test_non_convertable_division[0.058599039912223816-15-True-0.003906603] - AssertionError: 
Not equal to tolerance rtol=0, atol=0

Mismatched elements: 1 / 1 (100%)
Max absolute difference among violations: 3.5297126e-07
Max relative difference among violations: 9.035248e-05
 ACTUAL: array([0.003906], dtype=float32)
 DESIRED: array([0.003907], dtype=float32)
FAILED tests/openvino/native/test_node_utils.py::test_non_convertable_division[0.058599039912223816-15-False-0.003906602505594492] - AssertionError: 
Not equal to tolerance rtol=0, atol=0

Mismatched elements: 1 / 1 (100%)
Max absolute difference among violations: 3.525056e-07
Max relative difference among violations: 9.023329e-05
 ACTUAL: array([0.003906], dtype=float32)
 DESIRED: array([0.003907], dtype=float32)
```
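The failure mode is easy to reproduce in a small sketch (values approximate those in the log): with `rtol=0, atol=0`, `np.testing.assert_allclose` rejects any nonzero difference, so a float32 discrepancy of ~3.5e-07 is enough to fail, while a tolerance just above it passes reliably.

```python
import numpy as np

desired = np.float32(0.003906603)
actual = desired - np.float32(3.53e-07)  # mimics the observed float32 drift

# With rtol=0, atol=0 any nonzero difference fails, exactly as in the log.
exact_fails = False
try:
    np.testing.assert_allclose(actual, desired, rtol=0, atol=0)
except AssertionError:
    exact_fails = True

# An atol slightly above the observed ~3.5e-07 difference passes.
np.testing.assert_allclose(actual, desired, rtol=0, atol=1e-6)
```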

### Tests


https://github.com/openvinotoolkit/nncf/actions/runs/12991342528/job/36228616650

---------

Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
daniil-lyakhov force-pushed the dl/fx/xnnpack_conformance branch from 5c12b78 to ba6b7ba on January 27, 2025 17:06
daniil-lyakhov force-pushed the dl/fx/xnnpack_conformance branch from ba6b7ba to 389c796 on January 27, 2025 18:03
github-actions bot added the `documentation` and `NNCF TF` labels on Jan 27, 2025
daniil-lyakhov and others added 2 commits January 28, 2025 11:52
### Changes

* torch.ao `OpenVINOQuantizer` as well as `OpenVINOQuantizerAdapter` are
introduced
* `quantize_pt2e` function is updated to work with `OpenVINOQuantizer` 

### Reason for changes

* To enable OpenVINO quantization for torch.ao quantization pipelines
(`torch.ao.quantization.prepare_pt2e`,
`torch.ao.quantization.convert_pt2e`) and the `quantize_pt2e` API function

### Related tickets

openvinotoolkit#2766 

### Tests

tests/torch/fx/test_quantizer.py is updated with use cases:
- `OpenVINOQuantizer` + `quantize_pt2e`
- `OpenVINOQuantizer` + `torch.ao.quantization.prepare_pt2e` ->
`torch.ao.quantization.convert_pt2e`
### Changes

- Introduced `PT2OpLayerAttribute` to collect the called function, its
attributes, and constant ports
- `FunctionMeta` stores the function instead of the function name

### Reason for changes

Needed to implement a subgraph extractor for FBC

### Related tickets

152996

### Tests

tests/torch2/function_hook/nncf_graph/test_layer_attributes.py
No grad during the TorchFX model validation

quantization params are being forwarded to quantize_pt2e/OpenVINOQuantizer

[Conformance] Ultralytics yolov8n and yolo11n

XNNPACK conformance attempt

ARM and QUALCOMM eager backends

Shared quantization spec support

[HWConfig] narrow_range parameter is introduced in hardware config

Embedding qconfig list is extended for CPU devices / tests fixes

References update