This document provides step-by-step instructions for reproducing PyTorch ResNet50/ResNet18/ResNet101 tuning results with Intel® Neural Compressor.
Note
- PyTorch eager-mode quantization requires users to manually add QuantStub and DeQuantStub for quantizable ops, and to manually perform module fusion.
- Neural Compressor requires users to complete these two manual steps before triggering the auto-tuning process. For details, please refer to https://pytorch.org/docs/stable/quantization.html
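For illustration, here is a minimal sketch of these two manual steps on a toy module (not taken from this repository; the layer names are arbitrary):

```python
import torch
from torch import nn

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 boundary
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 boundary

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)

    def fuse_model(self):
        # step 2: fuse the Conv + BN + ReLU pattern into a single module
        torch.quantization.fuse_modules(self, [['conv', 'bn', 'relu']], inplace=True)
```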
```shell
pip install -r requirements.txt
```
Download the raw ImageNet images to a directory such as /path/to/imagenet. The directory should include the following subfolders:
```shell
ls /path/to/imagenet
train  val
```
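For reference, here is a minimal sketch of building PyTorch dataloaders from this layout (the transforms and batch size below are typical ImageNet settings, not values taken from this repository):

```python
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

normalize = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

train_loader = DataLoader(
    ImageFolder('/path/to/imagenet/train',
                T.Compose([T.RandomResizedCrop(224), T.RandomHorizontalFlip(),
                           T.ToTensor(), normalize])),
    batch_size=256, shuffle=True)

val_loader = DataLoader(
    ImageFolder('/path/to/imagenet/val',
                T.Compose([T.Resize(256), T.CenterCrop(224),
                           T.ToTensor(), normalize])),
    batch_size=256)
```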
This is a tutorial on how to enable a PyTorch classification model with Intel® Neural Compressor.
For quantization-aware training mode, Intel® Neural Compressor supports the following four usages:
- The user specifies the fp32 "model", a training function "q_func", an evaluation dataset "eval_dataloader", and a metric in the tuning.metric field of the model-specific yaml config file. This option does not require the user to implement an evaluation function.
- The user specifies the fp32 "model", a training function "q_func", and a custom "eval_func" that encapsulates the evaluation dataset and metric by itself. This option requires the user to implement the evaluation function.
- The user specifies the fp32 "model", "calibration_dataloader", "eval_dataloader", and a metric, optimizer, and criterion in the model-specific yaml config file. With this option, Neural Compressor constructs a built-in training function and a built-in evaluation function.
- The user specifies the fp32 "model", "calibration_dataloader", a custom "eval_func", and an optimizer and criterion in the model-specific yaml config file. With this option, Neural Compressor constructs only a built-in training function.
As the ResNet18/50/101 series are typical classification models, we use Top-K as the metric, which Intel® Neural Compressor supports out of the box. So here we integrate PyTorch ResNet with Intel® Neural Compressor via the first or third use case for simplicity, as sketched below.
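A condensed sketch of how these two usages differ at the API level (not runnable on its own: model, train_loader, val_loader, and training_func_for_nc come from the user's script, as shown in the full examples later in this document):

```python
from neural_compressor.experimental import Quantization, common

# First usage: user-supplied training function; the metric comes from the yaml
quantizer = Quantization('./conf.yaml')
quantizer.model = common.Model(model)
quantizer.q_func = training_func_for_nc
quantizer.eval_dataloader = val_loader
q_model = quantizer()

# Third usage: built-in training loop driven by the yaml's optimizer/criterion fields
quantizer = Quantization('./conf_buildin.yaml')
quantizer.model = common.Model(model)
quantizer.calib_dataloader = train_loader
quantizer.eval_dataloader = val_loader
q_model = quantizer()
```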
```yaml
# conf_buildin.yaml
model:
  name: imagenet_qat
  framework: pytorch

quantization:                          # optional. required for QAT and PTQ.
  approach: quant_aware_training
  train:
    end_epoch: 8
    optimizer:
      SGD:
        learning_rate: 0.0001
    criterion:
      CrossEntropyLoss:
        reduction: mean

evaluation:
  accuracy:
    metric:
      topk: 1

tuning:
  accuracy_criterion:
    relative: 0.01
  exit_policy:
    timeout: 0
  random_seed: 9527
```
Here we choose the built-in optimizer, criterion, and metric, and set the accuracy target to tolerate a 0.01 relative accuracy loss from the fp32 baseline. The default tuning strategy is the basic strategy. A timeout of 0 means unlimited tuning time until the accuracy target is met; note that the result is not necessarily the model with the best accuracy and performance.
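Conceptually, the relative criterion accepts a quantized model when its accuracy loss relative to the fp32 baseline stays within the tolerance. A sketch of that check (illustration only, not Neural Compressor internals):

```python
def meets_accuracy_criterion(baseline_acc, quantized_acc, relative_tol=0.01):
    # relative loss with respect to the fp32 baseline must stay within tolerance
    return (baseline_acc - quantized_acc) / baseline_acc <= relative_tol
```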
PyTorch quantization requires two manual steps:
- Add QuantStub and DeQuantStub for all quantizable ops.
- Fuse possible patterns, such as Conv + Relu and Conv + BN + Relu.
Torchvision provides quantizable versions of these models (with QuantStub/DeQuantStub already inserted and a fuse_model() method), so these steps are not needed for torchvision models. Please refer to the torchvision documentation.
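For example, torchvision's quantization namespace exposes a quantization-ready ResNet50 (a sketch; quantize=False loads fp32 pretrained weights into the quantizable model definition):

```python
from torchvision.models.quantization import resnet50

model = resnet50(pretrained=True, quantize=False)  # fp32 weights, quantization-ready graph
model.fuse_model()  # fuses Conv + BN + ReLU patterns in place
```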
For the related code, please refer to examples/pytorch/eager/image_recognition/imagenet/cpu/qat/main_buildin.py.
After the prepare step is done, we just need to update main_buildin.py as below:
```python
model.module.fuse_model()

from neural_compressor.experimental import Quantization, common

quantizer = Quantization(args.config)
quantizer.model = common.Model(model)
quantizer.calib_dataloader = train_loader   # built-in training function uses the optimizer/criterion from the yaml
quantizer.eval_dataloader = val_loader      # built-in topk metric is evaluated on this dataloader
q_model = quantizer()
q_model.save(args.tuned_checkpoint)
```
The quantizer() call returns the best quantized model found within the timeout constraint.
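To restore the tuned model later, the Neural Compressor example scripts use a load helper; a sketch (args.tuned_checkpoint and model are the same names as in the snippet above):

```python
from neural_compressor.utils.pytorch import load

# 'model' must be the prepared fp32 module (with fuse_model() already applied)
int8_model = load(args.tuned_checkpoint, model)
```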
In the examples directory, there is a template.yaml. We can remove most of the items and keep only the mandatory items for tuning.
```yaml
# conf.yaml
model:
  name: imagenet_qat
  framework: pytorch

quantization:
  approach: quant_aware_training

evaluation:
  accuracy:
    metric:
      topk: 1

tuning:
  accuracy_criterion:
    relative: 0.01
  exit_policy:
    timeout: 0
  random_seed: 9527
```
Here we choose the built-in topk metric and set the accuracy target to tolerate a 0.01 relative accuracy loss from the fp32 baseline. The default tuning strategy is the basic strategy. A timeout of 0 means unlimited tuning time until the accuracy target is met; note that the result is not necessarily the model with the best accuracy and performance.
PyTorch quantization requires two manual steps:
- Add QuantStub and DeQuantStub for all quantizable ops.
- Fuse possible patterns, such as Conv + Relu and Conv + BN + Relu.
Torchvision provides quantizable versions of these models (with QuantStub/DeQuantStub already inserted and a fuse_model() method), so these steps are not needed for torchvision models. Please refer to the torchvision documentation.
For the related code, please refer to examples/pytorch/eager/image_recognition/imagenet/cpu/qat/main.py.
After the prepare step is done, we just need to update main.py as below:
```python
import torch

def training_func_for_nc(model):
    # train_loader, criterion, validate, val_loader_earlystop, and args
    # are defined elsewhere in main.py
    epochs = 8
    optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)
    prev_loss = 100
    loss_increase_times = 0
    patience = 2

    for nepoch in range(epochs):
        model.train()
        cnt = 0
        for image, target in train_loader:
            print('.', end='')
            cnt += 1
            output = model(image)
            loss = criterion(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if cnt % 10 == 1 or cnt == len(train_loader):
                print('[{}/{}, {}/{}] Loss : {:.8}'.format(
                    nepoch + 1, epochs, cnt, len(train_loader), loss.item()))
                # early stopping on the validation loss
                _, curr_loss = validate(val_loader_earlystop, model, criterion, args)
                print("The current val loss: ", curr_loss)
                if curr_loss > prev_loss:
                    loss_increase_times += 1
                    print('No improvement times: ', loss_increase_times)
                if loss_increase_times >= patience:
                    print("Early stopping")
                    return
                prev_loss = curr_loss
        if nepoch > 3:
            # Freeze quantizer parameters
            model.apply(torch.quantization.disable_observer)
        if nepoch > 2:
            # Freeze batch norm mean and variance estimates
            model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)
    return

model.module.fuse_model()

from neural_compressor.experimental import Quantization, common

quantizer = Quantization("./conf.yaml")
quantizer.model = common.Model(model)
quantizer.q_func = training_func_for_nc   # user-defined training function
quantizer.eval_dataloader = val_loader
q_model = quantizer()
```
The quantizer() call returns the best quantized model found within the timeout constraint.
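The returned model can be persisted the same way as in the built-in example above:

```python
q_model.save(args.tuned_checkpoint)
```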
With built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnet50 --pretrained --config ./conf_buildin.yaml /path/to/imagenet
```
Without built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnet50 --pretrained --config /path/to/config_file /path/to/imagenet
```
For the ResNet50 model, we get 0.7614 int8 accuracy vs. 0.7613 fp32 accuracy.
With built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnet18 --pretrained --config ./conf_buildin.yaml /path/to/imagenet
```
Without built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnet18 --pretrained --config /path/to/config_file /path/to/imagenet
```
With built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnext101_32x8d --pretrained --config ./conf_buildin.yaml /path/to/imagenet
```
Without built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnext101_32x8d --pretrained --config /path/to/config_file /path/to/imagenet
```
Without built-in training function:
```shell
cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a mobilenet_v2 --pretrained --config /path/to/config_file /path/to/imagenet
```