Commit 40ab5a3

macsz, sharathns93, and XinyuYe-Intel authored
Enable Transformer LT search space for Dynamic Neural Architecture Search Toolkit (#197)
Signed-off-by: Maciej Szankin <maciej.szankin@intel.com>
Co-authored-by: Nittur Sridhar, Sharath <sharath.nittur.sridhar@intel.com>
Co-authored-by: Xinyu Ye <xinyu.ye@intel.com>
1 parent d6f417b commit 40ab5a3

22 files changed: +3588 −163 lines changed

.azure-pipelines/scripts/codeScan/pylint/pylint.sh

+2 −2

@@ -10,13 +10,13 @@ pip install -r /neural-compressor/requirements.txt
 pip install torch==1.12.0

 python -m pylint -f json --disable=R,C,W,E1129 --enable=line-too-long --max-line-length=120 --extension-pkg-whitelist=numpy --ignored-classes=TensorProto,NodeProto \
-    --ignored-modules=tensorflow,torch,torch.quantization,torch.tensor,torchvision,mxnet,onnx,onnxruntime,intel_extension_for_pytorch /neural-compressor/neural_compressor \
+    --ignored-modules=tensorflow,torch,torch.quantization,torch.tensor,torchvision,fairseq,mxnet,onnx,onnxruntime,intel_extension_for_pytorch /neural-compressor/neural_compressor \
     > $log_dir/pylint.json

 exit_code=$?

 $BOLD_YELLOW && echo " ----------------- Current pylint cmd start --------------------------" && $RESET
-echo "python -m pylint -f json --disable=R,C,W,E1129 --enable=line-too-long --max-line-length=120 --extension-pkg-whitelist=numpy --ignored-classes=TensorProto,NodeProto --ignored-modules=tensorflow,torch,torch.quantization,torch.tensor,torchvision,mxnet,onnx,onnxruntime,intel_extension_for_pytorch /neural-compressor/neural_compressor > $log_dir/pylint.json"
+echo "python -m pylint -f json --disable=R,C,W,E1129 --enable=line-too-long --max-line-length=120 --extension-pkg-whitelist=numpy --ignored-classes=TensorProto,NodeProto --ignored-modules=tensorflow,torch,torch.quantization,torch.tensor,torchvision,fairseq,mxnet,onnx,onnxruntime,intel_extension_for_pytorch /neural-compressor/neural_compressor > $log_dir/pylint.json"
 $BOLD_YELLOW && echo " ----------------- Current pylint cmd end --------------------------" && $RESET

 $BOLD_YELLOW && echo " ----------------- Current log file output start --------------------------" && $RESET

.gitignore

+2 −1

@@ -3,6 +3,7 @@
 .idea
 /venv/
 */__pycache__
+.ipynb_checkpoints/
 *.snapshot
 *.csv
 *.pb
@@ -17,4 +18,4 @@ build/
 _build
 lpot_workspace/
 .torch/
-node_modules
+node_modules

docs/source/NAS.md

+3 −1

@@ -81,7 +81,7 @@ class NASBase(object):

     def search(self, res_save_path=None):
         # NAS search process.
-        ...
+        ...

     def estimate(self, model): # pragma: no cover
         # Estimate performance of the model. Depends on specific NAS algorithm.
@@ -175,3 +175,5 @@ Following examples are supported in Intel® Neural Compressor:

 - DyNAS MobileNetV3 supernet Example:
   - [DyNAS MobileNetV3 supernet Example](../examples/notebook/dynas/MobileNetV3_Supernet_NAS.ipynb): DyNAS with MobileNetV3 supernet on ImageNet dataset.
+- DyNAS Transformer LT supernet Example:
+  - [DyNAS Transformer LT supernet Example](../examples/notebook/dynas/Transformer_LT_Supernet_NAS.ipynb): DyNAS with Transformer LT supernet on WMT En-De dataset.

examples/notebook/dynas/MobileNetV3_Supernet_NAS.ipynb

+34 −9
@@ -13,7 +13,7 @@
     "\n",
     "#### Super-Networks\n",
     "\n",
-    "The computational overhead of evaluating DNN architectures during the neural architecture search process can be very costly due to the training and validation cycles. To address the training overhead, novel weight-sharing approaches known as one-shot or super-networks have offered a way to mitigate the training overhead by reducing training times from thousands to a few GPU days. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique individual architectures. This enables sub-network model extraction and validation without a separate training cycle. This tutorial offers pre-trained Once-for-All (OFA) super-networks [1] for the image classification task on ImageNet-ilsvrc2012.\n",
+    "The computational overhead of evaluating DNN architectures during the neural architecture search process can be very costly due to the training and validation cycles. To address the training overhead, novel weight-sharing approaches known as one-shot or super-networks have offered a way to mitigate the training overhead by reducing training times from thousands to a few GPU days. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique individual architectures. This enables sub-network model extraction and validation without a separate training cycle. This tutorial offers pre-trained Once-for-All (OFA) super-networks [1] for the image classification task on ImageNet-ilsvrc2012, as well as Transformer Language Translation super-networks (based on [6]) for the language translation task.\n",
     "\n",
     "#### Methodology\n",
     "\n",
@@ -38,7 +38,25 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "!pip install neural_compressor autograd==1.4 fvcore==0.1.5.post20220119 numpy==1.19.2 ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2"
+    "!pip -q install neural_compressor autograd==1.4 fvcore==0.1.5.post20220119 numpy ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Alternatively, if you have a local copy of https://github.com/intel/neural-compressor, you can uncomment and run the code below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# import sys\n",
+    "# sys.path.insert(0,'<path to neural compressor>')\n",
+    "# !pip install -q autograd==1.4 fvcore==0.1.5.post20220119 numpy ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2 sacremoses==0.0.53 torchprofile==0.0.4 fairseq==0.12.2"
    ]
   },
   {
@@ -84,12 +102,16 @@
    "metadata": {},
    "source": [
     "### Define Architecture\n",
-    "We currently leverage pre-trained Once-for-All (OFA) super-networks [4] for the image classification task on ImageNet-ilsvrc2012. In the case where the super-network PyTorch model download fails, you can manually copy the pre-trained models from https://github.com/mit-han-lab/once-for-all and place them in the `.torch/ofa_nets` path. \n",
+    "We currently support pre-trained super-networks:\n",
+    "\n",
+    "1. Once-for-All (OFA) super-networks [4] for the image classification task on ImageNet-ilsvrc2012. In the case where the super-network PyTorch model download fails, you can manually copy the pre-trained models from https://github.com/mit-han-lab/once-for-all and place them in the `.torch/ofa_nets` path.\n",
+    "2. Hardware-Aware-Transformers (HAT) super-network [6] for the language translation task on WMT14 En-De. To run this super-network you have to manually download the preprocessed dataset from https://github.com/mit-han-lab/hardware-aware-transformers/blob/master/configs/wmt14.en-de/get_preprocessed.sh and the pretrained model from https://www.dropbox.com/s/pkdddxvvpw9a4vq/HAT_wmt14ende_super_space0.pt?dl=0\n",
     "\n",
     "Super-network options (choose 1): \n",
     "- `ofa_resnet50` - based on the ResNet50 architecture [4]. Search space of ~$10^{15}$ architectures.\n",
     "- `ofa_mbv3_d234_e346_k357_w1.0` - based on the MobileNetV3 architecture [5], width multiplier 1.0. Search space of ~$10^{19}$ architectures.\n",
-    "- `ofa_mbv3_d234_e346_k357_w1.2` - based on the MobileNetV3 architecture [5], width multiplier 1.2. Search space of ~$10^{19}$ architectures. "
+    "- `ofa_mbv3_d234_e346_k357_w1.2` - based on the MobileNetV3 architecture [5], width multiplier 1.2. Search space of ~$10^{19}$ architectures. \n",
+    "- `transformer_lt_wmt_en_de` - based on the Transformer architecture [7]."
    ]
   },
   {
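A minimal sketch of how the new `transformer_lt_wmt_en_de` option can be selected, assuming the `NASConfig`/`NAS` interface documented in docs/source/NAS.md; the constructor arguments and attribute values below are illustrative and not part of this diff:

from neural_compressor.conf.config import NASConfig

# DyNAS configuration: the super-network search space is chosen by name.
# The Transformer LT space added by this commit sits alongside the OFA ones.
config = NASConfig(approach='dynas', search_algorithm='nsga2')
config.dynas.supernet = 'transformer_lt_wmt_en_de'  # or 'ofa_resnet50', 'ofa_mbv3_d234_e346_k357_w1.0', ...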
@@ -113,7 +135,7 @@
     "* `['acc', 'lat']` \n",
     "\n",
     "Description:\n",
-    "* `'acc'` - ImageNet Top-1 Accuracy (%)\n",
+    "* `'acc'` - ImageNet Top-1 Accuracy (%) (for OFA super-networks) and BLEU (for Transformer LT)\n",
     "* `'macs'` - Multiply-and-accumulates as measured from FVCore. \n",
     "* `'lat'` - Latency (inference time) measurement (ms)"
    ]
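Continuing the sketch above (illustrative, not part of this diff), the metric pair is set the same way for either search space; only the meaning of `'acc'` changes with the super-network:

# 'acc' is ImageNet Top-1 accuracy for the OFA super-networks and BLEU for Transformer LT;
# 'lat' is the measured inference latency in milliseconds.
config.dynas.metrics = ['acc', 'lat']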
@@ -137,7 +159,8 @@
     "* `config.dynas.num_evals` - Validation measurement count, a higher count comes with greater computational cost but a higher chance of finding optimal sub-networks\n",
     "* `config.dynas.results_csv_path` - Location of the search (validation measurement) results. This file is also used to provide training data to the metric predictors. \n",
     "* `config.dynas.batch_size` - Batch size used during latency measurements.\n",
-    "* `config.dynas.dataset_path` - Path to the imagenet-ilsvrc2012 dataset. This can be obtained at: https://www.image-net.org/download.php"
+    "* `config.dynas.dataset_path` - For OFA it's a path to the imagenet-ilsvrc2012 dataset. This can be obtained at: https://www.image-net.org/download.php; for Transformer LT it's a path to the preprocessed WMT En-De directory (`(...)/data/binary/wmt16_en_de`)\n",
+    "* `config.dynas.supernet_ckpt_path` - Transformer LT only. Path to the downloaded pretrained super-network (`HAT_wmt14ende_super_space0.pt` file)."
    ]
   },
   {
@@ -272,8 +295,10 @@
     "[1] Cai, H., Gan, C., & Han, S. (2020). Once for All: Train One Network and Specialize it for Efficient Deployment. ArXiv, abs/1908.09791. \n",
     "[2] K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, \"A fast and elitist multiobjective genetic algorithm: NSGA-II,\" in IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182-197, April 2002, doi: 10.1109/4235.996017. \n",
     "[3] Cummings, D., Sarah, A., Sridhar, S.N., Szankin, M., Muñoz, J.P., & Sundaresan, S. (2022). A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities. ArXiv, abs/2205.10358. \n",
-    "[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. \n",
-    "[5] Howard, A.G., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., & Adam, H. (2019). Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 1314-1324. "
+    "[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. \n",
+    "[5] Howard, A.G., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., & Adam, H. (2019). Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 1314-1324. \n",
+    "[6] Wang, H., Wu, Z., Liu, Z., Cai, H., Zhu, L., Gan, C., & Han, S. (2020). HAT: Hardware-Aware Transformers for Efficient Natural Language Processing. ArXiv, abs/2005.14187. \n",
+    "[7] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30."
    ]
   },
   {
@@ -300,7 +325,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.10"
+   "version": "3.7.11"
   }
  },
 "nbformat": 4,
