The tables below list the models enabled by the Intel® Low Precision Optimization Tool, with the INT8 accuracy, the FP32 baseline accuracy, and the measured speedup for each.

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
tensorflow | 2.4.0 | resnet50v1.0 | 74.21% | 74.27% | -0.08% | 3.44x |
tensorflow | 2.4.0 | resnet50v1.5 | 76.92% | 76.46% | 0.60% | 3.37x |
tensorflow | 2.4.0 | resnet101 | 77.18% | 76.45% | 0.95% | 2.53x |
tensorflow | 2.4.0 | inception_v1 | 70.41% | 69.74% | 0.96% | 1.89x |
tensorflow | 2.4.0 | inception_v2 | 74.36% | 73.97% | 0.53% | 1.95x |
tensorflow | 2.4.0 | inception_v3 | 77.28% | 76.75% | 0.69% | 2.37x |
tensorflow | 2.4.0 | inception_v4 | 80.39% | 80.27% | 0.15% | 2.60x |
tensorflow | 2.4.0 | inception_resnet_v2 | 80.38% | 80.40% | -0.02% | 1.98x |
tensorflow | 2.4.0 | mobilenetv1 | 73.29% | 70.96% | 3.28% | 2.93x |
tensorflow | 2.4.0 | mobilenetv2 | 71.98% | 71.76% | 0.31% | 1.78x |
tensorflow | 2.4.0 | ssd_resnet50_v1 | 37.98% | 38.00% | -0.05% | 2.99x |
tensorflow | 2.4.0 | mask_rcnn_inception_v2 | 28.62% | 28.73% | -0.38% | 2.96x |
tensorflow | 2.4.0 | wide_deep_large_ds | 77.61% | 77.67% | -0.08% | 1.50x |
tensorflow | 2.4.0 | vgg16 | 72.11% | 70.89% | 1.72% | 3.76x |
tensorflow | 2.4.0 | vgg19 | 72.36% | 71.01% | 1.90% | 3.85x |
tensorflow | 2.4.0 | resnetv2_50 | 70.39% | 69.64% | 1.08% | 1.40x |
tensorflow | 2.4.0 | resnetv2_101 | 72.58% | 71.87% | 0.99% | 1.51x |
tensorflow | 2.4.0 | resnetv2_152 | 72.92% | 72.37% | 0.76% | 1.48x |
tensorflow | 2.4.0 | densenet121 | 72.79% | 72.89% | -0.14% | 1.58x |
tensorflow | 2.4.0 | densenet161 | 76.41% | 76.29% | 0.16% | 1.79x |
tensorflow | 2.4.0 | densenet169 | 74.55% | 74.65% | -0.13% | 1.49x |
tensorflow | 2.4.0 | efficientnet_b0 | 78.40% | 76.75% | 2.15% | 1.13x |
tensorflow | 2.4.0 | deeplab | 81.96% | 82.20% | -0.29% | 1.55x |

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
tensorflow | 1.15up2 | resnet_v1_50_slim | 76.06% | 75.18% | 1.17% | 2.94x |
tensorflow | 1.15up2 | resnet_v1_101_slim | 77.19% | 76.40% | 1.03% | 3.39x |
tensorflow | 1.15up2 | resnet_v1_152_slim | 77.58% | 76.81% | 1.00% | 3.74x |
tensorflow | 1.15up2 | inception_v1_slim | 70.44% | 69.77% | 0.96% | 1.95x |
tensorflow | 1.15up2 | inception_v2_slim | 74.32% | 73.98% | 0.46% | 2.00x |
tensorflow | 1.15up2 | inception_v3_slim | 78.30% | 77.99% | 0.40% | 2.57x |
tensorflow | 1.15up2 | inception_v4_slim | 80.30% | 80.19% | 0.14% | 2.88x |
tensorflow | 1.15up2 | vgg16_slim | 72.16% | 70.89% | 1.79% | 3.81x |
tensorflow | 1.15up2 | vgg19_slim | 72.29% | 71.01% | 1.80% | 3.88x |
tensorflow | 1.15up2 | resnetv2_50_slim | 70.35% | 69.72% | 0.90% | 1.41x |
tensorflow | 1.15up2 | resnetv2_101_slim | 72.49% | 71.91% | 0.81% | 1.54x |
tensorflow | 1.15up2 | resnetv2_152_slim | 72.90% | 72.40% | 0.69% | 1.60x |
tensorflow | 1.15up2 | bert_large_squad | 92.35 | 92.98 | -0.68% | 2.81x |
tensorflow | 1.15up2 | bert_base_mrpc | 85.78% | 86.52% | -0.86% | 1.50x |

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
pytorch | 1.5.0+cpu | resnet18 | 69.60% | 69.76% | -0.22% | 2.00x |
pytorch | 1.5.0+cpu | resnet50 | 75.96% | 76.13% | -0.23% | 2.46x |
pytorch | 1.5.0+cpu | resnext101_32x8d | 79.12% | 79.31% | -0.24% | 2.63x |
pytorch | 1.5.0+cpu | inception_v3 | 69.42% | 69.54% | -0.17% | 1.96x |
pytorch | 1.5.0+cpu | peleenet | 71.59% | 72.08% | -0.68% | 1.43x |
pytorch | 1.5.0+cpu | yolo_v3 | 24.42% | 24.54% | -0.51% | 1.74x |
pytorch | 1.5.0+cpu | se_resnext50_32x4d | 79.04% | 79.08% | -0.05% | 1.87x |
pytorch | 1.5.0+cpu | mobilenet_v2 | 70.63% | 71.86% | -1.70% | 1.75x |
pytorch | 1.5.0+cpu | 3dunet | 85.31% | 85.30% | 0.01% | 1.84x |
pytorch | 1.5.0+cpu | distilbert_base_mrpc | 81.34% | 80.99% | 0.43% | 1.78x |
pytorch | 1.5.0+cpu | albert_base_mrpc | 88.34% | 88.50% | -0.18% | 1.48x |
pytorch | 1.5.0+cpu | funnel_mrpc | 91.79% | 92.26% | -0.51% | 1.40x |
pytorch | 1.5.0+cpu | mbart_wnli | 56.34% | 56.34% | 0.00% | 2.06x |
pytorch | 1.5.0+cpu | t5_wmt_en_ro | 24.36 | 24.52 | -0.66% | 1.60x |
pytorch | 1.5.0+cpu | marianmt_wmt_en_ro | 22.33 | 22.23 | 0.46% | 2.11x |
pytorch | 1.5.0+cpu | pegasus_billsum | 51.1 | 51.21 | -0.23% | 2.00x |
pytorch | 1.5.0+cpu | rnnt | 91.55 | 92.55 | -1.08% | 3.10x |
pytorch | 1.5.0+cpu | reformer_crime_and_punishment | 6.55 | 6.5 | 0.79% | 1.07x |
pytorch | 1.5.0+cpu | xlm-roberta-base_mrpc | 87.93% | 88.62% | -0.78% | 1.34x |
pytorch | 1.5.0+cpu | flaubert_mrpc | 80.69% | 80.19% | 0.62% | 1.43x |
pytorch | 1.5.0+cpu | barthez_mrpc | 83.23% | 83.81% | -0.68% | 1.81x |
pytorch | 1.5.0+cpu | longformer_mrpc | 90.65% | 91.46% | -0.88% | 1.30x |
pytorch | 1.5.0+cpu | layoutlm_mrpc | 81.22% | 78.01% | 4.12% | 1.86x |
pytorch | 1.5.0+cpu | deberta_mrpc | 91.07% | 90.91% | 0.17% | 1.46x |
pytorch | 1.5.0+cpu | dlrm_fx | 80.19% | 80.27% | -0.10% | 1.25x |
pytorch | 1.5.0+cpu | resnet18_fx | 69.61% | 69.76% | -0.22% | 2.17x |
pytorch | 1.5.0+cpu | xlm_roberta_mrpc | 88.47% | 88.24% | 0.27% | 1.76x |
pytorch | 1.5.0+cpu | xlnet_base_mrpc | 89.62% | 89.47% | 0.17% | 1.27x |
pytorch | 1.5.0+cpu | transfo_xl_mrpc | 81.74% | 81.20% | 0.66% | 1.38x |
pytorch | 1.5.0+cpu | ctrl_mrpc | 81.76% | 82.00% | -0.29% | 2.49x |
pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | 88.90% | 88.73% | 0.19% | 2.10x |
pytorch | 1.6.0a0+24aac32 | bert_base_cola | 59.06% | 58.84% | 0.37% | 2.23x |
pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | 88.40% | 89.27% | -0.97% | 2.13x |
pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | 91.51% | 91.86% | -0.37% | 2.32x |
pytorch | 1.6.0a0+24aac32 | bert_base_rte | 69.31% | 69.68% | -0.52% | 2.03x |
pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | 87.45% | 88.33% | -0.99% | 2.65x |
pytorch | 1.6.0a0+24aac32 | bert_large_squad | 92.85 | 93.05 | -0.21% | 1.92x |
pytorch | 1.6.0a0+24aac32 | bert_large_qnli | 91.20% | 91.82% | -0.68% | 2.59x |
pytorch | 1.6.0a0+24aac32 | bert_large_rte | 71.84% | 72.56% | -0.99% | 1.34x |
pytorch | 1.6.0a0+24aac32 | bert_large_cola | 62.74% | 62.57% | 0.27% | 2.67x |
pytorch | 1.6.0a0+24aac32 | gpt_wikitext | 60.06 | 60.2 | -0.23% | 1.15x |
pytorch | 1.6.0a0+24aac32 | roberta_base_mrpc | 85.08% | 85.51% | -0.51% | 2.12x |
pytorch | 1.6.0a0+24aac32 | camembert_base_mrpc | 83.57% | 84.22% | -0.77% | 2.18x |

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
pytorch | 1.5.0+cpu | resnet18_qat | 69.76% | 69.76% | 0.01% | 2.05x |
pytorch | 1.5.0+cpu | resnet50_qat | 76.37% | 76.13% | 0.32% | 2.56x |

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
mxnet | 1.7.0 | resnet50v1 | 76.03% | 76.33% | -0.39% | 3.23x |
mxnet | 1.7.0 | inceptionv3 | 77.80% | 77.64% | 0.21% | 2.73x |
mxnet | 1.7.0 | mobilenet1.0 | 71.71% | 72.22% | -0.71% | 2.51x |
mxnet | 1.7.0 | mobilenetv2_1.0 | 70.77% | 70.87% | -0.14% | 2.63x |
mxnet | 1.7.0 | resnet18_v1 | 70.00% | 70.14% | -0.21% | 3.12x |
mxnet | 1.7.0 | squeezenet1.0 | 56.89% | 56.96% | -0.13% | 2.61x |
mxnet | 1.7.0 | ssd-resnet50_v1 | 80.21% | 80.23% | -0.03% | 4.76x |
mxnet | 1.7.0 | ssd-mobilenet1.0 | 74.94% | 75.54% | -0.79% | 3.69x |
mxnet | 1.7.0 | resnet152_v1 | 78.31% | 78.54% | -0.29% | 3.25x |

Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | Performance Speedup (Realtime Latency Ratio [FP32/INT8]) |
---|---|---|---|---|---|---|
onnxrt | 1.6.0 | bert_base_mrpc | 85.29% | 86.03% | -0.85% | 2.07x |
onnxrt | 1.6.0 | vgg16 | 69.43% | 69.44% | -0.01% | 1.20x |
onnxrt | 1.6.0 | ssd_mobilenet_v2 | 24.02% | 24.68% | -2.67% | 1.09x |
onnxrt | 1.6.0 | distilbert_base_mrpc | 85.05% | 84.56% | 0.58% | 2.21x |
onnxrt | 1.6.0 | mobilebert_mrpc | 86.03% | 86.28% | -0.29% | 1.21x |
onnxrt | 1.6.0 | roberta_base_mrpc | 88.73% | 89.46% | -0.82% | 2.01x |
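
The two ratio columns in the tables above follow the formulas given in their headers: the accuracy ratio is `(INT8 - FP32) / FP32` and the latency ratio is `FP32 / INT8`. The sketch below shows that arithmetic in Python. The function names `acc_ratio` and `latency_ratio` are hypothetical helpers written for illustration and are not part of the tool's API, and the latency values are made-up numbers, since the tables report only the resulting ratio.

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, returned as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def latency_ratio(fp32_latency: float, int8_latency: float) -> float:
    """Speedup of INT8 over FP32, FP32 latency / INT8 latency."""
    return fp32_latency / int8_latency


# Accuracy values taken from the resnet50v1.5 row (tensorflow 2.4.0).
print(f"{acc_ratio(76.92, 76.46):.2%}")  # prints 0.60%

# Hypothetical latencies in milliseconds, chosen only to show the arithmetic.
print(f"{latency_ratio(10.0, 4.0):.2f}x")  # prints 2.50x
```

A negative accuracy ratio means the INT8 model scored below the FP32 baseline, and a latency ratio above 1.00x means the INT8 model ran faster. Recomputing a ratio from the rounded accuracies in a row can differ from the published value in the last digit, presumably because the published ratios were derived from unrounded measurements.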