
[FX][Conformance] Enable Conformance Test for FX Backend #3321

Open · wants to merge 2 commits into base: develop
32 changes: 0 additions & 32 deletions tests/post_training/data/ptq_reference_data.yaml
@@ -42,16 +42,8 @@ torchvision/resnet18_backend_CUDA_TORCH:
metric_value: 0.69152
torchvision/resnet18_backend_FX_TORCH:
metric_value: 0.6946
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted. Failed operations detailed log:\n-- torch::None with a message:\nNone constant cannot be converted to OpenVINO opset and should be removed by consuming operation.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.linear.default, aten.max_pool2d.default\n-- Conversion is failed for: torch::None\n"
message: "Issue-162009"
torchvision/resnet18_backend_CUDA_FX_TORCH:
metric_value: 0.6946
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted. Failed operations detailed log:\n-- torch::None with a message:\nNone constant cannot be converted to OpenVINO opset and should be removed by consuming operation.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.linear.default, aten.max_pool2d.default\n-- Conversion is failed for: torch::None\n"
message: "Issue-162009"
torchvision/mobilenet_v3_small_BC_backend_FP32:
metric_value: 0.6766
torchvision/mobilenet_v3_small_BC_backend_OV:
@@ -60,48 +52,24 @@ torchvision/mobilenet_v3_small_BC_backend_ONNX:
metric_value: 0.6679
torchvision/mobilenet_v3_small_BC_backend_FX_TORCH:
metric_value: 0.6679
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted. Failed operations detailed log:\n-- torch::None with a message:\nNone constant cannot be converted to OpenVINO opset and should be removed by consuming operation.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.linear.default\n-- Conversion is failed for: torch::None\n"
message: "Issue-162009"
torchvision/mobilenet_v3_small_BC_backend_CUDA_FX_TORCH:
metric_value: 0.6664
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted. Failed operations detailed log:\n-- torch::None with a message:\nNone constant cannot be converted to OpenVINO opset and should be removed by consuming operation.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.linear.default\n-- Conversion is failed for: torch::None\n"
message: "Issue-162009"
torchvision/vit_b_16_backend_FP32:
metric_value: 0.8107
torchvision/vit_b_16_backend_OV:
metric_value: 0.80948
torchvision/vit_b_16_backend_FX_TORCH:
metric_value: 0.80922
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted.\nSummary:\n-- No conversion rule found for operations: aten.conv2d.default, aten.layer_norm.default, aten.linear.default, aten.scaled_dot_product_attention.default\n"
message: "Issue-162009"
torchvision/vit_b_16_backend_CUDA_FX_TORCH:
metric_value: 0.80922
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted.\nSummary:\n-- No conversion rule found for operations: aten.conv2d.default, aten.layer_norm.default, aten.linear.default, aten.scaled_dot_product_attention.default\n"
message: "Issue-162009"
torchvision/swin_v2_s_backend_FP32:
metric_value: 0.83712
torchvision/swin_v2_s_backend_OV:
metric_value: 0.83638
torchvision/swin_v2_s_backend_FX_TORCH:
metric_value: 0.8360
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.layer_norm.default, aten.linear.default, aten.matmul.default, aten.pad.default, aten.softmax.int, aten.where.ScalarSelf\n"
message: "Issue-162009"
torchvision/swin_v2_s_backend_CUDA_FX_TORCH:
metric_value: 0.8360
exception_xfail_reason:
type: "OpConversionFailure"
error_message: "Check 'is_conversion_successful' failed at src/frontends/pytorch/src/frontend.cpp:171:\nFrontEnd API failed with OpConversionFailure:\nModel wasn't fully converted.\nSummary:\n-- No conversion rule found for operations: aten.adaptive_avg_pool2d.default, aten.conv2d.default, aten.layer_norm.default, aten.linear.default, aten.matmul.default, aten.pad.default, aten.softmax.int, aten.where.ScalarSelf\n"
message: "Issue-162009"
timm/crossvit_9_240_backend_CUDA_TORCH:
metric_value: 0.7275
timm/crossvit_9_240_backend_FP32:
31 changes: 28 additions & 3 deletions tests/post_training/pipelines/base.py
@@ -485,6 +485,25 @@ def compress(self) -> None:
self.run_info.compression_memory_usage = memory_usage(self._compress, max_usage=True)
self.run_info.time_compression = time.perf_counter() - start_time

def _rename_files(self, folder_path, new_name):
    model_folder = folder_path / "model"
    bin_file = None
    xml_file = None
    for file in os.listdir(model_folder):
        if file.endswith(".bin"):
            bin_file = file
        elif file.endswith(".xml"):
            xml_file = file
Comment on lines +493 to +496

Collaborator: What if there are several submodels in this dir?

Collaborator Author: Submodels of the same model, or other models? If the problem is the latter, I can save to a different folder by the model name, like this:
torch.compile(exported_model.module(), backend="openvino", options={"model_caching": True, "cache_dir": str(self.output_model_dir / self.model_name)})
instead of just "cache_dir": str(self.output_model_dir).

Collaborator: I mean one model being cut into several parts, as happened with the YOLO 11 model. As far as I remember, this means several IRs are generated for one model and run sequentially.

Collaborator Author: Hm, but models with graph breaks should not be supported, right?

daniil-lyakhov (Collaborator, Mar 4, 2025): They should have one graph, but due to bugs in OV/NNCF it is possible that there are several IRs. I wonder what the result would be. Perhaps we shouldn't analyze the parts of the model separately.

anzr299 (Collaborator Author, Mar 4, 2025): Then maybe I can raise an error after checking for multiple .bin and .xml files in the location. The expected behavior would be to simply rename and replace the files.

    if bin_file is None or xml_file is None:
        return
    bin_new_path = folder_path / f"{new_name}.bin"
    xml_new_path = folder_path / f"{new_name}.xml"

    os.rename(os.path.join(model_folder, bin_file), bin_new_path)
    os.rename(os.path.join(model_folder, xml_file), xml_new_path)

    os.rmdir(model_folder)
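The review thread above suggests raising an error when the cache directory contains more than one IR pair (a model cut into several parts). A stdlib-only sketch of such a guard; `find_single_ir_pair` is a hypothetical helper, not part of the PR:

```python
import os
import tempfile
from pathlib import Path


def find_single_ir_pair(model_folder: Path) -> tuple:
    """Return the single (.bin, .xml) file pair in model_folder, raising if
    the model was split into several IR parts (hypothetical guard)."""
    bins = sorted(f for f in os.listdir(model_folder) if f.endswith(".bin"))
    xmls = sorted(f for f in os.listdir(model_folder) if f.endswith(".xml"))
    if len(bins) != 1 or len(xmls) != 1:
        raise RuntimeError(
            f"Expected exactly one IR pair, found {len(bins)} .bin and {len(xmls)} .xml files"
        )
    return bins[0], xmls[0]


# Demo on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "part.bin").write_bytes(b"")
    (folder / "part.xml").write_text("<net/>")
    print(find_single_ir_pair(folder))  # ('part.bin', 'part.xml')
```

With such a check in place, `_rename_files` would fail loudly on a multi-IR cache instead of silently renaming an arbitrary pair.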

def save_compressed_model(self) -> None:
"""
Save compressed model to IR.
@@ -500,9 +519,15 @@ def save_compressed_model(self) -> None:
        ov.serialize(ov_model, self.path_compressed_ir)
    elif self.backend in FX_BACKENDS:
        exported_model = torch.export.export(self.compressed_model.cpu(), (self.dummy_tensor.cpu(),))
        # TODO Uncomment these lines after Issue-162009
        # ov_model = ov.convert_model(exported_model, example_input=self.dummy_tensor.cpu(), input=self.input_size)
        # ov_model.reshape(self.input_size)
        # ov.serialize(ov_model, self.path_compressed_ir)
        # TODO Remove after Issue-162009
        torch.export.save(exported_model, self.output_model_dir / "model.pt2")
        mod = torch.compile(
            exported_model.module(),
            backend="openvino",
            options={"model_caching": True, "cache_dir": str(self.output_model_dir)},
        )
        mod(self.dummy_tensor)
        self._rename_files(self.output_model_dir, "model")

        if self.backend == BackendType.CUDA_FX_TORCH:
            self.model = self.model.cuda()
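The caching workaround leaves the generated IR inside a `model/` subfolder of the cache directory, which `_rename_files` then flattens into `model.bin`/`model.xml`. A standalone sketch of that flattening step on a temporary directory (file names here are made up):

```python
import os
import tempfile
from pathlib import Path


def flatten_ir_cache(folder_path: Path, new_name: str) -> None:
    """Move the cached IR pair out of folder_path/'model' and rename it to
    new_name.{bin,xml}, mirroring _rename_files in this PR."""
    model_folder = folder_path / "model"
    bin_file = xml_file = None
    for file in os.listdir(model_folder):
        if file.endswith(".bin"):
            bin_file = file
        elif file.endswith(".xml"):
            xml_file = file
    if bin_file is None or xml_file is None:
        return
    os.rename(model_folder / bin_file, folder_path / f"{new_name}.bin")
    os.rename(model_folder / xml_file, folder_path / f"{new_name}.xml")
    os.rmdir(model_folder)  # cache subfolder is empty at this point


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "model").mkdir()
    (out / "model" / "cached_abc.bin").write_bytes(b"")
    (out / "model" / "cached_abc.xml").write_text("<net/>")
    flatten_ir_cache(out, "model")
    print(sorted(p.name for p in out.iterdir()))  # ['model.bin', 'model.xml']
```

Note that `os.rmdir` fails on a non-empty directory, so this only succeeds when exactly the renamed pair was present.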
@@ -94,7 +94,7 @@ def _validate(self) -> None:
predictions = np.zeros(dataset_size)
references = -1 * np.ones(dataset_size)

if self.backend in FX_BACKENDS and self.torch_compile_validation:
daniil-lyakhov (Collaborator, Mar 3, 2025): Please do not remove the self.torch_compile_validation option and made it True by default

anzr299 (Collaborator Author): But then the default path for FX backend models will be using OV validation.

daniil-lyakhov (Collaborator, Mar 3, 2025): Typo, my bad, I meant True by default.

if self.backend in FX_BACKENDS:
    predictions, references = self._validate_torch_compile(val_loader, predictions, references)
else:
    predictions, references = self._validate_ov(val_loader, predictions, references, dataset_size)
@@ -130,8 +130,11 @@ def _dump_model_fp32(self) -> None:

    if self.backend in FX_BACKENDS:
        exported_model = torch.export.export(self.model.cpu(), (self.dummy_tensor.cpu(),))
        # TODO Uncomment these lines after Issue-162009
        # ov_model = ov.convert_model(exported_model, example_input=self.dummy_tensor, input=self.input_size)
        # ov.serialize(ov_model, self.fp32_model_dir / "fx_model_fp32.xml")
        # TODO Remove after Issue-162009
        torch.export.save(exported_model, self.fp32_model_dir / "fx_model_fp32.pt2")

        if self.backend is BackendType.CUDA_FX_TORCH:
            self.model = self.model.cuda()