[MinMax] Embedding nodes as input nodes for inference graph #3320

Open
wants to merge 6 commits into develop from dl/shape_of_sub_emb_fix

Conversation

@daniil-lyakhov daniil-lyakhov commented Feb 28, 2025

Reopen of #2862

Changes

  • Embedding nodes are used as input nodes for the inference graph, so embedding nodes are now included in the inference_nncf_graph
  • inference_nncf_graph is used to identify weighted nodes
  • The PT/FX MinMax get_weight_nodes method is updated to work with the inference graph (see the sketch after this list)
  • Constant folding is removed from the OpenVINOQuantizer and the FX nncf.quantize implementation
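
A minimal sketch of how the weight-node lookup on the constant-free inference graph could look, reconstructed from the diff hunks quoted later in this conversation; the imports and the loop structure are assumptions for illustration, not the PR's verbatim implementation:

from typing import List

from nncf.common.graph import NNCFGraph
from nncf.common.graph import NNCFNode
import nncf.torch.graph.operator_metatypes as om


def get_weight_nodes_in_inference_graph(
    inference_nncf_graph: NNCFGraph, mat_mul_metatypes: List[om.PTOperatorMetatype]
) -> List[NNCFNode]:
    # The inference graph contains no constant nodes, so a matmul-like node with
    # fewer input edges than expected weight ports is consuming a constant
    # (weight) on the missing port and is treated as a weighted node.
    weight_nodes = []
    for node in inference_nncf_graph.get_all_nodes():
        is_matmul_like = node.metatype in mat_mul_metatypes
        has_constant_input = len(inference_nncf_graph.get_input_edges(node)) < len(
            node.metatype.weight_port_ids
        )
        if is_matmul_like and has_constant_input:
            weight_nodes.append(node)
    return weight_nodes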

Reason for changes

Related tickets

163025

Tests

  • tests/cross_fw/test_templates/test_quantizer_config.py is updated with a shape_of/constant embedding model and a conv model with constant branches
  • TorchFX reference graphs for ViT and Swin were updated: constant branches are present in the quantized graph, but they do not contain quantizers
  • conformance test post_training_quantization/625/ passed

@github-actions github-actions bot added the NNCF PT, NNCF OpenVINO, NNCF ONNX, NNCF PTQ, and experimental labels Feb 28, 2025
@daniil-lyakhov daniil-lyakhov force-pushed the dl/shape_of_sub_emb_fix branch from 9689dca to 0a2f240 Compare February 28, 2025 11:36
@daniil-lyakhov daniil-lyakhov marked this pull request as ready for review February 28, 2025 13:28
@daniil-lyakhov daniil-lyakhov requested a review from a team as a code owner February 28, 2025 13:28
@@ -50,6 +51,7 @@
from nncf.torch.quantization.layers import BaseQuantizer
from nncf.torch.quantization.layers import PTQuantizerSpec
from nncf.torch.quantization.layers import get_scale_shape
from nncf.torch.utils import get_weight_nodes_in_inference_grpah

Suggested change
from nncf.torch.utils import get_weight_nodes_in_inference_grpah
from nncf.torch.utils import get_weight_nodes_in_inference_graph

@@ -467,3 +470,46 @@ def get_model_dtype(model: torch.nn.Module) -> torch.dtype:
# The model had no parameters at all, assume FP32
dtype = torch.float32
return dtype


def get_weight_nodes_in_inference_grpah(

Suggested change
def get_weight_nodes_in_inference_grpah(
def get_weight_nodes_in_inference_graph(

@@ -86,7 +86,6 @@ def quantize_impl(
advanced_parameters=advanced_parameters,
)

# To make it easier for bias correction algorithms.

Why was this comment removed?


# Inference graph does not contain constants, so
# any missed input edge means it is a constant branch.
return node.metatype in [om.PTMatMulMetatype, om.PTAddmmMetatype] and len(

Please use variables to make it more readable

# any missed input edge means it is a constant branch.
return node.metatype in [om.PTMatMulMetatype, om.PTAddmmMetatype] and len(
inference_nncf_graph.get_input_edges(node)
) < len(node.metatype.weight_port_ids)
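
For illustration, one way the quoted predicate could be split into named variables (a sketch of the requested refactor, not the change committed in this PR):

# The inference graph has no constant nodes, so a missing input edge on a
# matmul-like node indicates a constant (weight) branch.
is_matmul_like = node.metatype in [om.PTMatMulMetatype, om.PTAddmmMetatype]
num_input_edges = len(inference_nncf_graph.get_input_edges(node))
num_weight_ports = len(node.metatype.weight_port_ids)
return is_matmul_like and num_input_edges < num_weight_ports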

Does this work for the possibly missed inputs that are determined in get_nodes_with_missed_input_edges?



def get_weight_nodes_in_inference_grpah(
inference_nncf_graph: NNCFGraph, mat_mul_metatypes: List[om.PTOperatorMetatype]

Looks like mat_mul_metatypes always receives the same value, so it should not be passed as an argument; use a reusable constant variable instead.
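
For example, the metatype list could live in a module-level constant and be dropped from the signature (a sketch; the constant name is a hypothetical choice, not from this PR):

# Hypothetical module-level constant replacing the mat_mul_metatypes argument.
MATMUL_METATYPES = [om.PTMatMulMetatype, om.PTAddmmMetatype]


def get_weight_nodes_in_inference_graph(inference_nncf_graph: NNCFGraph) -> List[NNCFNode]:
    # ... same body as before, using MATMUL_METATYPES instead of the argument ...
    ...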

Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
@github-actions github-actions bot added the API (Public API-impacting changes) label Mar 17, 2025