forked from openvinotoolkit/nncf
[TorchFX] Multi-quantizer conformance check #38
Closed
### Changes

Enable mypy for `nncf/common/pruning`.
Disable the pip cache to avoid caching issues.
### Changes

- Remove levit model_128
- Fix: do not catch exceptions without a message

### Reason for changes

```
File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/_builder.py", line 393, in build_model_with_cfg
    load_pretrained(
File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/_builder.py", line 193, in load_pretrained
    state_dict = filter_fn(state_dict, model)
File "/home/jenkins/agent/workspace/NNCF/manual/Daemons/PTQ/ptq_single_model/venv/lib/python3.10/site-packages/timm/models/levit.py", line 707, in checkpoint_filter_fn
    assert 'head' in ka or 'stem.conv1.linear' in ka
AssertionError
```
### Changes

- black 24.10.0
- isort 5.13.0
- ruff v0.9.2
- markdownlint 0.43.0
### Changes

Include the SDPA metatype in the scales unification map for all MinMax backends.

### Reason for changes

Scales were not being unified in quantizers inserted for such a subgraph:

```
  x   y
   \  |
  concat  Q  V
      |  /  /
       SDPA
```

### Tests

A template test was created at `tests/cross_fw/test_templates/test_unified_scales.py`. The template test uses a synthetic SDPA model with a concat operation. It then uses the `_find_quantization_target_points` method of the MinMaxQuantization algorithm to return the unified scale groups, which are used for the assertions.
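As a rough illustration of why scales feeding a concat need to be unified, here is a minimal numpy sketch (this is not NNCF code; the `fake_quant` helper, the per-tensor symmetric scheme, and taking the max of the branch scales are all simplifying assumptions for the example):

```python
import numpy as np

def fake_quant(x, scale, levels=256):
    # Symmetric per-tensor fake quantization: snap values to the integer
    # grid defined by `scale`, then dequantize back to float.
    q = np.clip(np.round(x / scale), -levels // 2, levels // 2 - 1)
    return q * scale

x = np.array([0.1, 0.5, 0.9], dtype=np.float32)  # branch with a small range
y = np.array([1.0, 5.0, 9.0], dtype=np.float32)  # branch with a large range

# With independent scales each branch lands on its own grid, so the
# concatenated tensor would mix two incompatible quantization grids.
sx, sy = x.max() / 127, y.max() / 127

# A unified scale (here: the max of the two) puts both concat inputs on
# one shared grid, which is what the scales-unification pass enforces.
s = max(sx, sy)
concat_unified = np.concatenate([fake_quant(x, s), fake_quant(y, s)])
```

After unification, every element of `concat_unified` lies on the same `s`-spaced grid, so a single quantizer parameterization is valid for the whole concat output.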
### Changes

Run tests from the precommit test scope in the nightly trigger:

- pt2: use fixed parameters for `.T`; `_saved_dim0/1` depends on the OS, but the actual value is always -1, -2
- onnx: build the model only inside the test function, not while collecting parameters, to use memory more efficiently
- common: use `1.2` s for the test timer, to be sure that `00:00:01` is caught

NOTE: the pt test fails in case of a parallel run when building the quantization extension.

### Tests

https://github.com/openvinotoolkit/nncf/actions/runs/12835288745/job/35794407928?pr=3194
### Changes

Add `quantization_aware_training_tensorflow_mobilenet_v2` to the test scope.
…lkit#3207)

### Changes

Change the model from having the same input tensor for the QKV inputs to having a different input for each.

### Reason for changes

The former model was causing an error with OpenVINO.
…olkit#2727)

### Changes

- Implemented OpenVINO model graphs which are used for the calculation of compressed and decompressed weights. Since these models are compiled, calculation becomes significantly faster, especially for larger models and int4 compression.
- This functionality is exposed by two methods at `weight_lowering.py`:
  - `do_int_quantization()` is used for computing a compressed weight. Possible signatures:
    - `weight` -> `compressed_weight`, `scale`, (`zero_point` for asymmetric compression)
    - `weight`, `scale`, (`zero_point`) -> `compressed_weight`, `scale`, (`zero_point`)
  - `calculate_quantized_dequantized_weight()` is used for computing a decompressed weight. Possible signatures:
    - `weight` -> `decompressed_weight`
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`
    - `weight` -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
  - The output `scale` and `zero_point` are the same as the ones given as input (if they were given at all).
  - Computation is done via OV models only if the openvino package is installed and the input tensors are not torch tensors.
- Introduced a new NNCF Tensor backend for storing instances of `openvino.Tensor`. The implementation for this backend is limited to only the required functionality, e.g. addition of OV Tensors is not supported because it is not needed.
  - The introduction of OV Tensors is required for seamless handling of tensors in the `bf16`, `u4` and `i4` data types. For example, `bf16` constants are read from an OpenVINO LLM and given as inputs to a compressing OpenVINO model, and `u4` and `i4` compressed weights are seamlessly inserted into the resulting compressed OpenVINO model.
  - Added an `as_numpy_tensor()` method to convert an NNCF Tensor to the numpy backend. Currently only OV -> NP conversion is required.
- All calculations are aligned with the reference numpy implementation. Some performance and memory sacrifices had to be made for this alignment.

Benchmark figures (data-free asymmetric compression, data-free symmetric compression, data-aware compression): see the images attached to the PR.

### Reason for changes

Reducing model compression time. Only the OpenVINO model compression backend is affected.

### Related tickets

139047

### Tests

- `tests/openvino/native/quantization/test_ov_modeling_compression.py::test_quantization_alignment` -- checks alignment with the reference numpy implementation
- `tests/openvino/native/test_openvino_modeling.py` -- checks OV modeling framework hyperparameters
- `tests/openvino/native/test_tensor.py` -- NNCF OV Tensor backend tests

Validation jobs:

- `NNCF/job/manual/job/post_training_weight_compression/299/`
- `NNCF/job/nightly/job/test_examples/650`
- OVVP validation ✅
- optimum-intel test job https://github.com/huggingface/optimum-intel/actions/runs/12912964434/job/36009036879?pr=734
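For intuition about what the compress/decompress pair computes, here is a hedged numpy sketch of symmetric int4 weight quantization (illustrative only, not NNCF's actual `weight_lowering.py` code; the per-row scale, the signed range [-8, 7], and mapping the row max magnitude to level 7 are assumptions):

```python
import numpy as np

def compress_decompress_int4(weight):
    # Per-output-row scale so that the largest magnitude maps to level 7.
    # Assumes no row is all zeros (otherwise the scale would be zero).
    scale = np.max(np.abs(weight), axis=-1, keepdims=True) / 7.0
    # Compressed weight: rounded, clipped to the signed int4 range [-8, 7].
    compressed = np.clip(np.round(weight / scale), -8, 7).astype(np.int8)
    # Decompressed weight: dequantize back to float with the same scale.
    decompressed = compressed.astype(weight.dtype) * scale
    return compressed, scale, decompressed

w = np.random.default_rng(0).normal(size=(4, 16)).astype(np.float32)
cw, scale, dw = compress_decompress_int4(w)
```

The round-trip error is bounded by half a quantization step per element, which is the property the alignment tests check against the reference implementation.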
### Changes

- Added `nf4` precision for the OV `GraphConverter`.

### Reason for changes

- `nf4` precision support.

### Related tickets

- 153357

### Tests

- Added
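For context, `nf4` quantizes normalized weights to a fixed table of 16 non-uniform levels. The sketch below is a hedged illustration, not converter code; the level values are the ones published with QLoRA and are an assumption here (NNCF/OpenVINO may store them at different precision):

```python
import numpy as np

# The 16 NF4 levels as published with QLoRA (assumed for illustration).
NF4_LEVELS = np.array([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
], dtype=np.float32)

def nf4_quantize(weight):
    # Normalize by the absolute max, snap each value to the nearest NF4
    # level, then scale back (a fake-quantize round trip).
    scale = np.max(np.abs(weight))
    idx = np.abs(weight[..., None] / scale - NF4_LEVELS).argmin(axis=-1)
    return NF4_LEVELS[idx] * scale

w = np.array([[0.05, -0.4, 0.75], [1.5, -2.0, 0.3]], dtype=np.float32)
q = nf4_quantize(w)
```

Because the levels are denser near zero, `nf4` spends its 16 codes where normally distributed weights actually concentrate.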
### Changes

Follow-up to openvinotoolkit#2727:

1. Do not use `infer_request.results`
2. Replace `>=` with `opset.greater_equal()`
3. Rename `ov_numeric.py` to `openvino_numeric.py`

### Reason for changes

1. Improve int4 compression time by up to ~10%
2. Avoid the warning: `DeprecationWarning: greater_equal is deprecated and will be removed in version 2025.3. Use ops.greater_equal instead`
3. Fix the onnx install test

### Related tickets

139047

### Tests

- https://github.com/openvinotoolkit/nncf/actions/runs/12947249537
- NNCF/job/manual/job/post_training_weight_compression/301/
- NNCF/job/nightly/job/test_examples/653/
### Changes

- As stated in the title

### Reason for changes

- Upcoming release

### Related tickets

- 161230

### Tests

- N/A
### Changes

- Skip `test_non_convertable_division` on macOS
- Cut values to 1e-10; more digits after the decimal point do not change the result
- Use a one-line `pip install` in GHA

### Reason for changes

https://github.com/openvinotoolkit/nncf/actions/runs/12969549430/job/36208496733

```
FAILED tests/openvino/native/test_node_utils.py::test_non_convertable_division[0.058599039912223816-15-True-0.003906603] - AssertionError: Not equal to tolerance rtol=0, atol=0
Mismatched elements: 1 / 1 (100%)
Max absolute difference among violations: 3.5297126e-07
Max relative difference among violations: 9.035248e-05
 ACTUAL: array([0.003906], dtype=float32)
 DESIRED: array([0.003907], dtype=float32)
FAILED tests/openvino/native/test_node_utils.py::test_non_convertable_division[0.058599039912223816-15-False-0.003906602505594492] - AssertionError: Not equal to tolerance rtol=0, atol=0
Mismatched elements: 1 / 1 (100%)
Max absolute difference among violations: 3.525056e-07
Max relative difference among violations: 9.023329e-05
 ACTUAL: array([0.003906], dtype=float32)
 DESIRED: array([0.003907], dtype=float32)
```

### Tests

https://github.com/openvinotoolkit/nncf/actions/runs/12991342528/job/36228616650

Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
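The mismatch in the log comes from performing the division at different floating-point precisions. A minimal numpy sketch of the effect, using the same operands as the failing test case (the variable names are illustrative):

```python
import numpy as np

# Dividing in float32 versus float64 yields results that agree only to
# roughly single precision, so a test with atol=0 is sensitive to which
# precision the backend actually uses for the division.
a, b = 0.058599039912223816, 15
res32 = np.float32(a) / np.float32(b)   # single-precision division
res64 = np.float64(a) / np.float64(b)   # double-precision reference
diff = abs(float(res32) - float(res64))
```

The discrepancy is tiny in absolute terms but nonzero, which is why the test values were cut to 1e-10 and the flaky case skipped on macOS rather than tightening the tolerance further.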
### Changes

- A torch.ao `OpenVINOQuantizer`, as well as an `OpenVINOQuantizerAdapter`, are introduced
- The `quantize_pt2e` function is updated to work with `OpenVINOQuantizer`

### Reason for changes

- To enable OpenVINO quantization for torch.ao quantization pipelines (`torch.ao.quantization.prepare_pt2e`, `torch.ao.quantization.convert_pt2e`) and the `quantize_pt2e` API function

### Related tickets

openvinotoolkit#2766

### Tests

`tests/torch/fx/test_quantizer.py` is updated with the use cases:

- `OpenVINOQuantizer` + `quantize_pt2e`
- `OpenVINOQuantizer` + `torch.ao.quantization.prepare_pt2e` -> `torch.ao.quantization.convert_pt2e`
### Changes

- Introduced `PT2OpLayerAttribute` to collect the called function, attributes and constant ports
- `FunctionMeta` stores the function instead of the function name

### Reason for changes

Needed to implement a subgraph extractor for FBC.

### Related tickets

152996

### Tests

`tests/torch2/function_hook/nncf_graph/test_layer_attributes.py`
- No grad during the TorchFX model validation
- Quantization params are being forwarded to `quantize_pt2e`/`OpenVINOQuantizer`
- [Conformance] Ultralytics yolov8n and yolo11n
- XNNPACK conformance attempt: ARM and QUALCOMM eager backends
- Shared quantization spec support
- [HWConfig] `narrow_range` parameter is introduced in the hardware config
- Embedding qconfig list is extended for CPU devices / tests fixes
- References update