Full Validated Models

The tables below list the models enabled by the Intel® Low Precision Optimization Tool, comparing INT8 tuning accuracy and realtime latency against the FP32 baseline.

TensorFlow 2.x models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| tensorflow | 2.4.0 | resnet50v1.0 | 74.21% | 74.27% | -0.08% | 3.44x |
| tensorflow | 2.4.0 | resnet50v1.5 | 76.92% | 76.46% | 0.60% | 3.37x |
| tensorflow | 2.4.0 | resnet101 | 77.18% | 76.45% | 0.95% | 2.53x |
| tensorflow | 2.4.0 | inception_v1 | 70.41% | 69.74% | 0.96% | 1.89x |
| tensorflow | 2.4.0 | inception_v2 | 74.36% | 73.97% | 0.53% | 1.95x |
| tensorflow | 2.4.0 | inception_v3 | 77.28% | 76.75% | 0.69% | 2.37x |
| tensorflow | 2.4.0 | inception_v4 | 80.39% | 80.27% | 0.15% | 2.60x |
| tensorflow | 2.4.0 | inception_resnet_v2 | 80.38% | 80.40% | -0.02% | 1.98x |
| tensorflow | 2.4.0 | mobilenetv1 | 73.29% | 70.96% | 3.28% | 2.93x |
| tensorflow | 2.4.0 | mobilenetv2 | 71.98% | 71.76% | 0.31% | 1.78x |
| tensorflow | 2.4.0 | ssd_resnet50_v1 | 37.98% | 38.00% | -0.05% | 2.99x |
| tensorflow | 2.4.0 | mask_rcnn_inception_v2 | 28.62% | 28.73% | -0.38% | 2.96x |
| tensorflow | 2.4.0 | wide_deep_large_ds | 77.61% | 77.67% | -0.08% | 1.50x |
| tensorflow | 2.4.0 | vgg16 | 72.11% | 70.89% | 1.72% | 3.76x |
| tensorflow | 2.4.0 | vgg19 | 72.36% | 71.01% | 1.90% | 3.85x |
| tensorflow | 2.4.0 | resnetv2_50 | 70.39% | 69.64% | 1.08% | 1.40x |
| tensorflow | 2.4.0 | resnetv2_101 | 72.58% | 71.87% | 0.99% | 1.51x |
| tensorflow | 2.4.0 | resnetv2_152 | 72.92% | 72.37% | 0.76% | 1.48x |
| tensorflow | 2.4.0 | densenet121 | 72.79% | 72.89% | -0.14% | 1.58x |
| tensorflow | 2.4.0 | densenet161 | 76.41% | 76.29% | 0.16% | 1.79x |
| tensorflow | 2.4.0 | densenet169 | 74.55% | 74.65% | -0.13% | 1.49x |
| tensorflow | 2.4.0 | efficientnet_b0 | 78.40% | 76.75% | 2.15% | 1.13x |
| tensorflow | 2.4.0 | deeplab | 81.96% | 82.20% | -0.29% | 1.55x |

TensorFlow 1.x models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| tensorflow | 1.15up2 | resnet_v1_50_slim | 76.06% | 75.18% | 1.17% | 2.94x |
| tensorflow | 1.15up2 | resnet_v1_101_slim | 77.19% | 76.40% | 1.03% | 3.39x |
| tensorflow | 1.15up2 | resnet_v1_152_slim | 77.58% | 76.81% | 1.00% | 3.74x |
| tensorflow | 1.15up2 | inception_v1_slim | 70.44% | 69.77% | 0.96% | 1.95x |
| tensorflow | 1.15up2 | inception_v2_slim | 74.32% | 73.98% | 0.46% | 2.00x |
| tensorflow | 1.15up2 | inception_v3_slim | 78.30% | 77.99% | 0.40% | 2.57x |
| tensorflow | 1.15up2 | inception_v4_slim | 80.30% | 80.19% | 0.14% | 2.88x |
| tensorflow | 1.15up2 | vgg16_slim | 72.16% | 70.89% | 1.79% | 3.81x |
| tensorflow | 1.15up2 | vgg19_slim | 72.29% | 71.01% | 1.80% | 3.88x |
| tensorflow | 1.15up2 | resnetv2_50_slim | 70.35% | 69.72% | 0.90% | 1.41x |
| tensorflow | 1.15up2 | resnetv2_101_slim | 72.49% | 71.91% | 0.81% | 1.54x |
| tensorflow | 1.15up2 | resnetv2_152_slim | 72.90% | 72.40% | 0.69% | 1.60x |
| tensorflow | 1.15up2 | bert_large_squad | 92.35 | 92.98 | -0.68% | 2.81x |
| tensorflow | 1.15up2 | bert_base_mrpc | 85.78% | 86.52% | -0.86% | 1.50x |

PyTorch models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| pytorch | 1.5.0+cpu | resnet18 | 69.60% | 69.76% | -0.22% | 2.00x |
| pytorch | 1.5.0+cpu | resnet50 | 75.96% | 76.13% | -0.23% | 2.46x |
| pytorch | 1.5.0+cpu | resnext101_32x8d | 79.12% | 79.31% | -0.24% | 2.63x |
| pytorch | 1.5.0+cpu | inception_v3 | 69.42% | 69.54% | -0.17% | 1.96x |
| pytorch | 1.5.0+cpu | peleenet | 71.59% | 72.08% | -0.68% | 1.43x |
| pytorch | 1.5.0+cpu | yolo_v3 | 24.42% | 24.54% | -0.51% | 1.74x |
| pytorch | 1.5.0+cpu | se_resnext50_32x4d | 79.04% | 79.08% | -0.05% | 1.87x |
| pytorch | 1.5.0+cpu | mobilenet_v2 | 70.63% | 71.86% | -1.70% | 1.75x |
| pytorch | 1.5.0+cpu | 3dunet | 85.31% | 85.30% | 0.01% | 1.84x |
| pytorch | 1.5.0+cpu | distilbert_base_mrpc | 81.34% | 80.99% | 0.43% | 1.78x |
| pytorch | 1.5.0+cpu | albert_base_mrpc | 88.34% | 88.50% | -0.18% | 1.48x |
| pytorch | 1.5.0+cpu | funnel_mrpc | 91.79% | 92.26% | -0.51% | 1.40x |
| pytorch | 1.5.0+cpu | mbart_wnli | 56.34% | 56.34% | 0.00% | 2.06x |
| pytorch | 1.5.0+cpu | t5_wmt_en_ro | 24.36 | 24.52 | -0.66% | 1.60x |
| pytorch | 1.5.0+cpu | marianmt_wmt_en_ro | 22.33 | 22.23 | 0.46% | 2.11x |
| pytorch | 1.5.0+cpu | pegasus_billsum | 51.1 | 51.21 | -0.23% | 2.00x |
| pytorch | 1.5.0+cpu | rnnt | 91.55 | 92.55 | -1.08% | 3.10x |
| pytorch | 1.5.0+cpu | reformer_crime_and_punishment | 6.55 | 6.5 | 0.79% | 1.07x |
| pytorch | 1.5.0+cpu | xlm-roberta-base_mrpc | 87.93% | 88.62% | -0.78% | 1.34x |
| pytorch | 1.5.0+cpu | flaubert_mrpc | 80.69% | 80.19% | 0.62% | 1.43x |
| pytorch | 1.5.0+cpu | barthez_mrpc | 83.23% | 83.81% | -0.68% | 1.81x |
| pytorch | 1.5.0+cpu | longformer_mrpc | 90.65% | 91.46% | -0.88% | 1.30x |
| pytorch | 1.5.0+cpu | layoutlm_mrpc | 81.22% | 78.01% | 4.12% | 1.86x |
| pytorch | 1.5.0+cpu | deberta_mrpc | 91.07% | 90.91% | 0.17% | 1.46x |
| pytorch | 1.5.0+cpu | dlrm_fx | 80.19% | 80.27% | -0.10% | 1.25x |
| pytorch | 1.5.0+cpu | resnet18_fx | 69.61% | 69.76% | -0.22% | 2.17x |
| pytorch | 1.5.0+cpu | xlm_roberta_mrpc | 88.47% | 88.24% | 0.27% | 1.76x |
| pytorch | 1.5.0+cpu | xlnet_base_mrpc | 89.62% | 89.47% | 0.17% | 1.27x |
| pytorch | 1.5.0+cpu | transfo_xl_mrpc | 81.74% | 81.20% | 0.66% | 1.38x |
| pytorch | 1.5.0+cpu | ctrl_mrpc | 81.76% | 82.00% | -0.29% | 2.49x |
| pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | 88.90% | 88.73% | 0.19% | 2.10x |
| pytorch | 1.6.0a0+24aac32 | bert_base_cola | 59.06% | 58.84% | 0.37% | 2.23x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | 88.40% | 89.27% | -0.97% | 2.13x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | 91.51% | 91.86% | -0.37% | 2.32x |
| pytorch | 1.6.0a0+24aac32 | bert_base_rte | 69.31% | 69.68% | -0.52% | 2.03x |
| pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | 87.45% | 88.33% | -0.99% | 2.65x |
| pytorch | 1.6.0a0+24aac32 | bert_large_squad | 92.85 | 93.05 | -0.21% | 1.92x |
| pytorch | 1.6.0a0+24aac32 | bert_large_qnli | 91.20% | 91.82% | -0.68% | 2.59x |
| pytorch | 1.6.0a0+24aac32 | bert_large_rte | 71.84% | 72.56% | -0.99% | 1.34x |
| pytorch | 1.6.0a0+24aac32 | bert_large_cola | 62.74% | 62.57% | 0.27% | 2.67x |
| pytorch | 1.6.0a0+24aac32 | gpt_wikitext | 60.06 | 60.2 | -0.23% | 1.15x |
| pytorch | 1.6.0a0+24aac32 | roberta_base_mrpc | 85.08% | 85.51% | -0.51% | 2.12x |
| pytorch | 1.6.0a0+24aac32 | camembert_base_mrpc | 83.57% | 84.22% | -0.77% | 2.18x |

Quantization-aware training models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| pytorch | 1.5.0+cpu | resnet18_qat | 69.76% | 69.76% | 0.01% | 2.05x |
| pytorch | 1.5.0+cpu | resnet50_qat | 76.37% | 76.13% | 0.32% | 2.56x |

MXNet models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| mxnet | 1.7.0 | resnet50v1 | 76.03% | 76.33% | -0.39% | 3.23x |
| mxnet | 1.7.0 | inceptionv3 | 77.80% | 77.64% | 0.21% | 2.73x |
| mxnet | 1.7.0 | mobilenet1.0 | 71.71% | 72.22% | -0.71% | 2.51x |
| mxnet | 1.7.0 | mobilenetv2_1.0 | 70.77% | 70.87% | -0.14% | 2.63x |
| mxnet | 1.7.0 | resnet18_v1 | 70.00% | 70.14% | -0.21% | 3.12x |
| mxnet | 1.7.0 | squeezenet1.0 | 56.89% | 56.96% | -0.13% | 2.61x |
| mxnet | 1.7.0 | ssd-resnet50_v1 | 80.21% | 80.23% | -0.03% | 4.76x |
| mxnet | 1.7.0 | ssd-mobilenet1.0 | 74.94% | 75.54% | -0.79% | 3.69x |
| mxnet | 1.7.0 | resnet152_v1 | 78.31% | 78.54% | -0.29% | 3.25x |

ONNX models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] (performance speed up) |
|---|---|---|---|---|---|---|
| onnxrt | 1.6.0 | bert_base_mrpc | 85.29% | 86.03% | -0.85% | 2.07x |
| onnxrt | 1.6.0 | vgg16 | 69.43% | 69.44% | -0.01% | 1.20x |
| onnxrt | 1.6.0 | ssd_mobilenet_v2 | 24.02% | 24.68% | -2.67% | 1.09x |
| onnxrt | 1.6.0 | distilbert_base_mrpc | 85.05% | 84.56% | 0.58% | 2.21x |
| onnxrt | 1.6.0 | mobilebert_mrpc | 86.03% | 86.28% | -0.29% | 1.21x |
| onnxrt | 1.6.0 | roberta_base_mrpc | 88.73% | 89.46% | -0.82% | 2.01x |
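
How the ratio columns are derived

The two ratio columns follow the formulas given in the table headers: the accuracy ratio is (INT8-FP32)/FP32 and the latency ratio is FP32/INT8. The sketch below is only an illustration of that arithmetic, not code from the tool itself. The function names `acc_ratio` and `latency_speedup` are made up for this example, and the latency values are hypothetical because the tables publish only the resulting ratio, not the raw latencies.

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, returned as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def latency_speedup(fp32_latency: float, int8_latency: float) -> float:
    """Speed up factor, FP32 latency / INT8 latency (same unit for both inputs)."""
    return fp32_latency / int8_latency


# Accuracy figures taken from the tensorflow 2.4.0 resnet50v1.0 row.
print(f"{acc_ratio(74.21, 74.27):.2%}")        # -0.08%

# Hypothetical latencies in milliseconds (assumed values, not measured data).
print(f"{latency_speedup(34.4, 10.0):.2f}x")   # 3.44x
```

A negative accuracy ratio means the INT8 model scored below the FP32 baseline, and a latency ratio above 1.00x means the INT8 model ran faster.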