This example demonstrates how to use Neural Compressor's built-in dataloader and metric to enable quantization for models defined in slim.
- Prepare

  - Prepare the FP32 model

    ```shell
    wget http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz
    tar -xvf inception_v1_2016_08_28.tar.gz
    ```

  - Install dependencies

    ```shell
    pip install -r requirements.txt
    ```
- Config dataloader in conf.yaml

  The configuration below creates an ImageNet dataloader and resizes each image to 224x224 with bilinear resampling. It also creates a TopK metric for evaluation. An equivalent dataloader can also be built in Python; see the sketch after the config.
  ```yaml
  quantization:                # optional. tuning constraints on model-wise for advanced users to reduce the tuning space.
    calibration:
      sampling_size: 20, 50    # optional. default value is 100. used to set how many samples should be used in calibration.
      dataloader:
        batch_size: 10
        dataset:
          ImageRecord:
            root: /path/to/imagenet/  # NOTE: modify to calibration dataset location if needed
        transform:
          BilinearImagenet:
            height: 224
            width: 224
  ......
  evaluation:                  # optional. required if user doesn't provide eval_func in Quantization.
    accuracy:                  # optional. required if user doesn't provide eval_func in Quantization.
      metric:
        topk: 1                # built-in metrics are topk, map, f1; users can also register new metrics.
      dataloader:
        batch_size: 1
        last_batch: discard
        dataset:
          ImageRecord:
            root: /path/to/imagenet/  # NOTE: modify to evaluation dataset location if needed
        transform:
          BilinearImagenet:
            height: 224
            width: 224
  ```
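
  For reference, the same calibration dataloader can also be supplied from Python instead of YAML. Below is a minimal sketch, assuming the 1.x experimental API (`common.DataLoader` and the `calib_dataloader` attribute); `MyImagenetDataset` and the random sample data are illustrative placeholders, not part of this example.

  ```python
  import numpy as np
  from neural_compressor.experimental import Quantization, common

  class MyImagenetDataset:
      """Hypothetical dataset yielding (image, label) pairs."""
      def __init__(self, samples):
          self.samples = samples  # list of (preprocessed 224x224x3 image, label)

      def __getitem__(self, index):
          return self.samples[index]

      def __len__(self):
          return len(self.samples)

  # Illustrative random data standing in for real, preprocessed ImageNet images.
  samples = [(np.random.rand(224, 224, 3).astype('float32'), 0) for _ in range(20)]

  quantizer = Quantization('./conf.yaml')
  # Overrides the quantization.calibration.dataloader section of conf.yaml.
  quantizer.calib_dataloader = common.DataLoader(MyImagenetDataset(samples), batch_size=10)
  ```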
- Run quantization

  - Run Command

    The command below runs quantization and predicts with the quantized model:

    ```shell
    python test.py
    ```
- In order to quantize slim models, we need to get the graph from the slim .ckpt first:

  ```python
  from neural_compressor.experimental import Quantization, common

  quantizer = Quantization('./conf.yaml')
  # Neural Compressor extracts the inference graph from the slim checkpoint.
  quantizer.model = common.Model('./inception_v1.ckpt')
  # Do quantization
  quantized_model = quantizer()
  ```
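
  After tuning finishes, the quantized model can be written to disk for later use. A minimal sketch, assuming the `save` method of Neural Compressor's model wrapper; the output path is illustrative, not one used by this example.

  ```python
  # quantizer() may return None if no configuration meets the accuracy goal
  # (an assumption based on typical Neural Compressor behavior).
  if quantized_model is not None:
      quantized_model.save('./int8_inception_v1')  # illustrative output path
  ```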