Post-Training Quantization of PyTorch models with NNCF

This tutorial demonstrates how to use NNCF 8-bit quantization in post-training mode (without the fine-tuning pipeline) to optimize a PyTorch model for high-speed inference with the OpenVINO Toolkit. For more advanced NNCF usage, refer to these examples.

To keep downloading and validation fast, we use a ResNet-50 model that has already been pretrained on the Tiny ImageNet dataset.

The tutorial consists of the following steps:

  • Evaluate the original model
  • Transform the original FP32 model to INT8
  • Export optimized and original models to ONNX and then to OpenVINO IR
  • Compare performance of the obtained FP32 and INT8 models
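
To give a feel for what the FP32 to INT8 step does numerically, here is a minimal pure-Python sketch of generic affine (scale and zero-point) 8-bit quantization. It is an illustration only and not NNCF's implementation: the function names `compute_qparams`, `quantize_values`, and `dequantize_values` are hypothetical, and the unsigned 0..255 range and per-tensor min/max calibration are assumptions made for this example.

```python
from typing import List, Tuple


def compute_qparams(values: List[float], num_bits: int = 8) -> Tuple[float, int]:
    """Derive a scale and zero point from the observed value range.

    Assumption: unsigned integer range [0, 2**num_bits - 1], with the
    float range widened to include 0.0 so that zero is exactly representable.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any positive scale works
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize_values(values: List[float], scale: float, zero_point: int,
                    num_bits: int = 8) -> List[int]:
    """Map floats to integers: q = clamp(round(v / scale) + zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize_values(quantized: List[int], scale: float,
                      zero_point: int) -> List[float]:
    """Map integers back to approximate floats: v ~ (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


# Made-up example weights, standing in for one FP32 tensor.
weights = [-1.0, -0.42, 0.0, 0.13, 0.77, 1.0]
scale, zero_point = compute_qparams(weights)
q_weights = quantize_values(weights, scale, zero_point)
restored = dequantize_values(q_weights, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("scale:", scale, "zero_point:", zero_point)
print("quantized:", q_weights)
print("max round-trip error:", max_error)
```

The round-trip error stays within one quantization step (`scale`), which is why a small calibration set that captures realistic value ranges is usually enough for post-training quantization to preserve accuracy.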

Installation Instructions

If you have not done so already, please follow the Installation Guide to install all required dependencies.