
Transition to a newer NNCF API for PyTorch model quantization #630

Merged: 3 commits into huggingface:main on Mar 26, 2024

Conversation

nikita-savelyevv (Collaborator)

Post-training quantization for the PyTorch backend via the nncf.create_compressed_model() API is obsolete and should be replaced with a nncf.quantize() call, which is what is already used for the OV backend.

What does this PR do?

Replace the nncf.create_compressed_model() call with a nncf.quantize() call for the quantization of PyTorch models, as sketched below.
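
For illustration, a minimal sketch of what this migration looks like on the user side of NNCF (the model, dataloader, and transform function are placeholders, not code from this PR):

```python
import nncf  # NNCF post-training quantization API

# Placeholders (assumptions for illustration): any torch.nn.Module and a
# torch.utils.data.DataLoader yielding (inputs, labels) batches.
model = ...
calibration_loader = ...

def transform_fn(batch):
    # Map a dataloader batch to the input format the model expects.
    inputs, _labels = batch
    return inputs

# Legacy flow (deprecated for post-training quantization):
#   from nncf.torch import create_compressed_model
#   ctrl, quantized_model = create_compressed_model(model, nncf_config)

# New flow used by this PR: nncf.quantize() with a calibration dataset,
# matching what is already done for the OpenVINO backend.
calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)
quantized_model = nncf.quantize(model, calibration_dataset)
```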

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

nikita-savelyevv changed the title from "[NNCF] Replace create_compressed_model call with quantize call for PyTorch backend" to "Transition to a newer NNCF API for PyTorch model quantization" on Mar 22, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

nikita-savelyevv marked this pull request as ready for review on March 25, 2024 at 08:12
nikita-savelyevv (Collaborator, Author) commented Mar 25, 2024

@AlexKoff88 @alexsu52 could you please review this PR?

AlexKoff88 requested a review from echarlaix on March 25, 2024 at 10:10
AlexKoff88 (Collaborator)

@echarlaix, as we agreed, this is the first PR in a series of changes to deprecate legacy NNCF functions and move all of the actual optimization functionality under the OVQuantizer API. Please take a look.

echarlaix (Collaborator) left a comment

Looks good, thanks @nikita-savelyevv

```diff
@@ -360,7 +358,7 @@ def _quantize_torchmodel(
     logger.info(
         "No configuration describing the quantization process was provided, a default OVConfig will be generated."
     )
-    ov_config = OVConfig(compression=DEFAULT_QUANTIZATION_CONFIG)
+    ov_config = OVConfig()
```
echarlaix (Collaborator):

This is not used for quantization anymore (only for the save_onnx_model parameter), so I'm not sure we need to create an instance when none is provided; we could instead give save_onnx_model a default value. For the same reason, there is also no need to save the configuration after the quantization + export steps: I would remove saving the config here

```python
ov_config.save_pretrained(save_directory)
```
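
A hypothetical sketch of the suggested simplification (the signature and body below are illustrative assumptions, not the actual optimum-intel code):

```python
import nncf

# Sketch only: save_onnx_model becomes a plain keyword argument with a
# default value, so no fallback OVConfig instance is needed.
def _quantize_torchmodel(model, calibration_dataset, save_directory, save_onnx_model=False):
    quantized_model = nncf.quantize(model, calibration_dataset)
    # ... export quantized_model to OpenVINO IR (and also to ONNX when
    # save_onnx_model is True) under save_directory ...
    # No ov_config.save_pretrained(save_directory) afterwards: the config
    # no longer describes the quantization that was performed.
    return quantized_model
```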

nikita-savelyevv (Collaborator, Author):

Made the suggested changes, and also removed ov_config from the list of arguments altogether. In the future it will be brought back to pass quantization parameters through it.

echarlaix merged commit a3bf172 into huggingface:main on Mar 26, 2024
9 of 10 checks passed