Post-Training Quantization of PyTorch models with NNCF

This tutorial demonstrates how to use NNCF 8-bit quantization in post-training mode (without a fine-tuning pipeline) to optimize a PyTorch model for high-speed inference with the OpenVINO Toolkit. For more advanced NNCF usage, refer to these examples.
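To make the idea of post-training 8-bit quantization concrete, here is a minimal pure-Python sketch of the underlying arithmetic: a scale and zero point are calibrated from sample values (no fine-tuning involved), and floats are then mapped to signed 8-bit integers and back. The names `QuantParams`, `calibrate`, `quantize`, and `dequantize` are hypothetical helpers for illustration only. This is not NNCF's implementation, which works on whole tensors and inserts quantizers into the model graph.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QuantParams:
    """Affine quantization parameters (illustrative, not an NNCF type)."""
    scale: float
    zero_point: int


def calibrate(samples: Sequence[float], qmin: int = -128, qmax: int = 127) -> QuantParams:
    """Derive scale and zero point from the min/max of calibration samples."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    if hi == lo:
        return QuantParams(scale=1.0, zero_point=0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return QuantParams(scale=scale, zero_point=zero_point)


def quantize(x: float, p: QuantParams, qmin: int = -128, qmax: int = 127) -> int:
    """Map a float to an 8-bit integer, clamping values outside the calibrated range."""
    q = round(x / p.scale) + p.zero_point
    return max(qmin, min(qmax, q))


def dequantize(q: int, p: QuantParams) -> float:
    """Map an 8-bit integer back to an approximate float value."""
    return (q - p.zero_point) * p.scale


# Example: calibrate on a few activation values, then round-trip one of them.
calibration_samples = [-1.0, 0.0, 0.5, 2.0]
params = calibrate(calibration_samples)
restored = dequantize(quantize(0.5, params), params)
```

Within the calibrated range, the round-trip error is bounded by half of `scale`, which is why a representative calibration set matters: a range that is too wide wastes precision, and one that is too narrow clips values.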

To keep downloading and validation fast, we use a pretrained ResNet-50 model with the Tiny ImageNet dataset.

The tutorial consists of the following steps:

Evaluate the original model
Transform the original FP32 model to INT8
Export the optimized and original models to ONNX and then to OpenVINO IR
Compare the performance of the resulting FP32 and INT8 models
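The final comparison step can be approximated with a simple timing harness. The sketch below is a hedged illustration rather than the tutorial's own benchmarking method: `median_latency_ms` and `speedup` are hypothetical helpers, and the two callables stand in for whatever function runs inference on the FP32 and INT8 models.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock latency of fn() in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Return how many times faster the optimized model is than the baseline."""
    if optimized_ms <= 0:
        raise ValueError("optimized_ms must be positive")
    return baseline_ms / optimized_ms


# Stand-in workloads; replace them with calls that run the FP32 and INT8 models.
fp32_ms = median_latency_ms(lambda: sum(i * i for i in range(20000)))
int8_ms = median_latency_ms(lambda: sum(i * i for i in range(10000)))
```

Using the median rather than the mean keeps a single slow run (for example, one interrupted by another process) from dominating the comparison. Both models should be measured on the same hardware with the same input shape.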

Installation Instructions

If you have not done so already, please follow the Installation Guide to install all required dependencies.
