Step-by-Step

This document provides step-by-step instructions for reproducing PyTorch ResNet50/ResNet18/ResNet101 tuning results with Intel® Neural Compressor.

Note

  • PyTorch eager-mode quantization requires users to manually add QuantStub and DeQuantStub around quantizable ops, and to manually fuse fusable patterns such as Conv + BN + ReLU (see the sketch after this list).
  • Neural Compressor requires users to complete these two manual steps before triggering the auto-tuning process. For details, please refer to https://pytorch.org/docs/stable/quantization.html
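
Before looking at the full example, here is a minimal sketch of the two manual steps on a toy module (ToyModel below is illustrative only, not part of the example code):

import torch
from torch import nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # converts fp32 input to int8
        self.conv = nn.Conv2d(3, 8, 3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # converts int8 output back to fp32

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.bn(self.conv(x)))
        return self.dequant(x)

    def fuse_model(self):
        # manual fusion of the Conv + BN + ReLU pattern before QAT
        torch.quantization.fuse_modules(self, [['conv', 'bn', 'relu']], inplace=True)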

Prerequisite

1. Installation

pip install -r requirements.txt

2. Prepare Dataset

Download the raw ImageNet images to a directory such as /path/to/imagenet. The directory should contain the following subfolders:

ls /path/to/imagenet
train  val
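
The example's main.py builds ImageNet dataloaders from these two folders. For reference only, here is a minimal sketch using torchvision's ImageFolder (batch size and transforms are illustrative assumptions, not taken from the example):

import torch
import torchvision.datasets as datasets
import torchvision.transforms as transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
train_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder('/path/to/imagenet/train', transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize])),
    batch_size=256, shuffle=True)
val_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder('/path/to/imagenet/val', transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize])),
    batch_size=256, shuffle=False)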

Examples Of Enabling Neural Compressor Auto Tuning On PyTorch ResNet

This is a tutorial of how to enable a PyTorch classification model with Intel® Neural Compressor.

User Code Analysis

For quantization-aware training, Intel® Neural Compressor supports four usages, listed below:

  1. The user provides an fp32 "model", a training function "q_func", an evaluation dataset "eval_dataloader", and a metric in the tuning.metric field of the model-specific yaml config file. This option does not require the user to implement an evaluation function.
  2. The user provides an fp32 "model", a training function "q_func", and a custom "eval_func" that itself encapsulates the evaluation dataset and metric. This option requires the user to implement the evaluation function.
  3. The user provides an fp32 "model", a "calibration_dataloader", an "eval_dataloader", and the metric, optimizer, and criterion in the model-specific yaml config file. Neural Compressor constructs built-in training and evaluation functions for this option.
  4. The user provides an fp32 "model", a "calibration_dataloader", a custom "eval_func", and the optimizer and criterion in the model-specific yaml config file. Neural Compressor only constructs a built-in training function for this option.

As ResNet18/50/101 are typical classification models, we use Top-K as the metric, which is supported out of the box by Intel® Neural Compressor. For simplicity, we therefore integrate PyTorch ResNet with Intel® Neural Compressor via the first and third usages.
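
For reference, the second and fourth usages swap the evaluation dataloader for a custom evaluation function. A minimal sketch of what such an eval_func could look like (top1_accuracy is a hypothetical helper, not part of the example; model and val_loader are assumed to exist):

from neural_compressor.experimental import Quantization, common

def eval_func(model):
    # must return a single scalar (e.g. top-1 accuracy) that the tuner can compare
    return top1_accuracy(model, val_loader)  # top1_accuracy is hypothetical

quantizer = Quantization('./conf.yaml')
quantizer.model = common.Model(model)
quantizer.eval_func = eval_func  # replaces eval_dataloader plus the yaml metric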

With built-in training function

Write Yaml Config File

#conf_buildin.yaml

model:
  name: imagenet_qat
  framework: pytorch                 

quantization:                                        # optional. required for QAT and PTQ.
  approach: quant_aware_training       
  train:
    end_epoch: 8
    optimizer:
      SGD:
        learning_rate: 0.0001
    criterion:
      CrossEntropyLoss:
        reduction: mean

evaluation:                              
  accuracy:                                
    metric:
      topk: 1                                

tuning:
  accuracy_criterion:
    relative:  0.01                            
  exit_policy:
    timeout: 0                                   
  random_seed: 9527

Here we choose the built-in optimizer, criterion, and metric, and set the accuracy target to tolerate a 0.01 relative accuracy loss against the fp32 baseline. The default tuning strategy is the basic strategy. A timeout of 0 means unlimited tuning time until the accuracy target is met; note that the resulting model is not necessarily the one with the best accuracy and performance.
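
If unlimited tuning time is undesirable, the timeout can instead be bounded; a sketch assuming the same yaml schema (value in seconds):

tuning:
  exit_policy:
    timeout: 3600    # stop after one hour and return the best model found so far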

Prepare

PyTorch quantization requires two manual steps:

  1. Add QuantStub and DeQuantStub for all quantizable ops.
  2. Fuse possible patterns, such as Conv + Relu and Conv + BN + Relu.

Torchvision provides quantization-ready model definitions, so these two steps are already done for all torchvision models. Please refer to torchvision for details.
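
For instance, a sketch of loading a quantization-ready ResNet50 from torchvision (API names per older torchvision releases; check your version):

from torchvision.models.quantization import resnet50

model = resnet50(pretrained=True, quantize=False)  # fp32 weights with QuantStub/DeQuantStub built in
model.fuse_model()                                 # fuses Conv + BN + ReLU patterns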

For the related code, please refer to examples/pytorch/eager/image_recognition/imagenet/cpu/qat/main_buildin.py.

Code Update

After the prepare step is done, we just need to update main.py as below.

model.module.fuse_model()                      # step 2: fuse patterns before quantization
from neural_compressor.experimental import Quantization, common
quantizer = Quantization(args.config)
quantizer.model = common.Model(model)
quantizer.calib_dataloader = train_loader      # consumed by the built-in training function
quantizer.eval_dataloader = val_loader         # consumed by the built-in evaluation function
q_model = quantizer()                          # runs QAT plus auto-tuning
q_model.save(args.tuned_checkpoint)

The quantizer() call returns the best quantized model found within the timeout constraint.
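
To restore the tuned int8 model later, a sketch assuming the neural_compressor.utils.pytorch.load helper (check your Neural Compressor version for availability):

from neural_compressor.utils.pytorch import load

int8_model = load(args.tuned_checkpoint, model)  # the fp32 model supplies the architecture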

Without built-in training function

Write Yaml Config File

In the examples directory, there is a template.yaml. We can remove most of the items and keep only the mandatory items for tuning.

#conf.yaml

model:
  name: imagenet_qat 
  framework: pytorch

quantization:
  approach: quant_aware_training

evaluation:
  accuracy:
    metric:
      topk: 1

tuning:
    accuracy_criterion:
      relative: 0.01
    exit_policy:
      timeout: 0
    random_seed: 9527

Here we choose the built-in topk metric and set the accuracy target to tolerate a 0.01 relative accuracy loss against the fp32 baseline. The default tuning strategy is the basic strategy. A timeout of 0 means unlimited tuning time until the accuracy target is met; note that the resulting model is not necessarily the one with the best accuracy and performance.

Prepare

PyTorch quantization requires two manual steps:

  1. Add QuantStub and DeQuantStub for all quantizable ops.
  2. Fuse possible patterns, such as Conv + Relu and Conv + BN + Relu.

Torchvision provides quantization-ready model definitions, so these two steps are already done for all torchvision models. Please refer to torchvision for details.

For the related code, please refer to examples/pytorch/eager/image_recognition/imagenet/cpu/qat/main.py.

Code Update

After the prepare step is done, we just need to update main.py as below.

def training_func_for_nc(model):
    epochs = 8
    optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)
    criterion = torch.nn.CrossEntropyLoss()   # define the loss used below
    # simple early stopping on the validation loss
    prev_loss = float('inf')
    loss_increase_times = 0
    patience = 2

    for nepoch in range(epochs):
        model.train()
        cnt = 0
        for image, target in train_loader:
            print('.', end='')
            cnt += 1
            output = model(image)
            loss = criterion(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # periodically report progress and validate for early stopping;
            # validate() and val_loader_earlystop are defined in main.py
            if cnt % 10 == 1 or cnt == len(train_loader):
                print('[{}/{}, {}/{}] Loss : {:.8}'.format(
                    nepoch + 1, epochs, cnt, len(train_loader), loss.item()))
                _, curr_loss = validate(val_loader_earlystop, model, criterion, args)
                print("The current val loss: ", curr_loss)
                if curr_loss > prev_loss:
                    loss_increase_times += 1
                    print('No improvement times: ', loss_increase_times)
                if loss_increase_times >= patience:
                    print("Early stopping")
                    return
                prev_loss = curr_loss

        if nepoch > 3:
            # freeze quantizer parameters
            model.apply(torch.quantization.disable_observer)
        if nepoch > 2:
            # freeze batch norm mean and variance estimates
            model.apply(torch.nn.intrinsic.qat.freeze_bn_stats)
    return

model.module.fuse_model()                      # step 2: fuse patterns before quantization
from neural_compressor.experimental import Quantization, common
quantizer = Quantization("./conf.yaml")
quantizer.model = common.Model(model)
quantizer.q_func = training_func_for_nc        # user-defined training function
quantizer.eval_dataloader = val_loader
q_model = quantizer()

The quantizer() call returns the best quantized model found within the timeout constraint.
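
As in the built-in path, the tuned model can then be persisted:

q_model.save(args.tuned_checkpoint)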

Run

1. ResNet50

with built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnet50 --pretrained --config ./conf_buildin.yaml /path/to/imagenet

without built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnet50 --pretrained --config /path/to/config_file /path/to/imagenet

For the ResNet50 model, we can get 0.7614 int8 accuracy vs. 0.7613 fp32 accuracy.

2. ResNet18

with built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnet18 --pretrained --config ./conf_buildin.yaml /path/to/imagenet

without built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnet18 --pretrained --config /path/to/config_file /path/to/imagenet

3. ResNext101_32x8d

with built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main_buildin.py -t -a resnext101_32x8d --pretrained --config ./conf_buildin.yaml /path/to/imagenet

without built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a resnext101_32x8d --pretrained --config /path/to/config_file /path/to/imagenet

4. MobileNetV2

without built-in training function

cd examples/pytorch/eager/image_recognition/imagenet/cpu/qat
python main.py -t -a mobilenet_v2 --pretrained --config /path/to/config_file /path/to/imagenet