This tutorial demonstrates how to apply INT8 quantization to the Wav2Vec2 speech recognition model using the Post-Training Optimization Tool API (POT API), which is part of the OpenVINO Toolkit. We will use a fine-tuned Wav2Vec2-Base-960h PyTorch model trained on the LibriSpeech ASR corpus. The tutorial is designed to be extendable to custom models and datasets. It consists of the following steps:
- Download and prepare the Wav2Vec2 model and LibriSpeech dataset
- Define data loading and accuracy validation functionality
- Prepare the model for quantization
- Run the optimization pipeline
- Compare the performance of the original and quantized models
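
For orientation, the basic idea behind INT8 quantization can be sketched in plain Python. This is only a minimal illustration of symmetric per-tensor quantization under simplifying assumptions, not the algorithm that POT applies to the model. The function names `quantize_int8` and `dequantize` and the sample values are hypothetical and chosen for this example.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Hypothetical helper for illustration; not part of the POT API.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Made-up sample values standing in for a small weight tensor.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per value is at most half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

The sketch shows where the accuracy loss comes from: small values such as `0.003` collapse to the nearest integer code, which is why the steps above include accuracy validation and a comparison of the original and quantized models.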