|
13 | 13 | "\n",
|
14 | 14 | "#### Super-Networks\n",
|
15 | 15 | "\n",
|
16 |
| - "The computational overhead of evaluating DNN architectures during the neural architecture search process can be very costly due to the training and validation cycles. To address the training overhead, novel weight-sharing approaches known as one-shot or super-networks have offered a way to mitigate the training overhead by reducing training times from thousands to a few GPU days. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique individual architectures. This enables sub-network model extraction and validation without a separate training cycle. This tutorial offers pre-trained Once-for-All (OFA) super-networks [1] for the image classification task on ImageNet-ilsvrc2012.\n", |
| 16 | + "The computational overhead of evaluating DNN architectures during the neural architecture search process can be very costly due to the training and validation cycles. To address the training overhead, novel weight-sharing approaches known as one-shot or super-networks have offered a way to mitigate the training overhead by reducing training times from thousands to a few GPU days. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique individual architectures. This enables sub-network model extraction and validation without a separate training cycle. This tutorial offers pre-trained Once-for-All (OFA) super-networks [1] for the image classification on ImageNet-ilsvrc2012 as well as Transformer Language Translation (based on [6]) for the language translation tasks.\n", |
17 | 17 | "\n",
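As a minimal sketch of the weight-sharing idea, a sub-network can be sampled and extracted from a pre-trained OFA super-network and validated without any additional training. This assumes the `ofa` package installed later in this notebook; the network id is one of the options listed in the Define Architecture section.

```python
import torch
from ofa.model_zoo import ofa_net

# Load a pre-trained OFA super-network (weights are downloaded on first use).
supernet = ofa_net('ofa_mbv3_d234_e346_k357_w1.0', pretrained=True)

# Sample one architecture configuration out of the ~10^19 weight-shared sub-networks ...
supernet.sample_active_subnet()

# ... and extract it as a standalone model that reuses the shared, already-trained weights.
subnet = supernet.get_active_subnet(preserve_weight=True)

# The extracted sub-network is immediately ready for validation; no separate training cycle.
subnet.eval()
with torch.no_grad():
    out = subnet(torch.randn(1, 3, 224, 224))
```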
|
18 | 18 | "#### Methodology\n",
|
19 | 19 | "\n",
|
|
38 | 38 | "metadata": {},
|
39 | 39 | "outputs": [],
|
40 | 40 | "source": [
|
41 |
| - "!pip install neural_compressor autograd==1.4 fvcore==0.1.5.post20220119 numpy==1.19.2 ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2" |
| 41 | + "!pip -q install neural_compressor autograd==1.4 fvcore==0.1.5.post20220119 numpy ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2" |
| 42 | + ] |
| 43 | + }, |
| 44 | + { |
| 45 | + "cell_type": "markdown", |
| 46 | + "metadata": {}, |
| 47 | + "source": [ |
| 48 | + "Alternatievely, if you have a local copy of https://github.com/intel/neural-compressor, you can uncomment and run the code below:" |
| 49 | + ] |
| 50 | + }, |
| 51 | + { |
| 52 | + "cell_type": "code", |
| 53 | + "execution_count": null, |
| 54 | + "metadata": {}, |
| 55 | + "outputs": [], |
| 56 | + "source": [ |
| 57 | + "# import sys\n", |
| 58 | + "# sys.path.insert(0,'<path to neural compressor>')\n", |
| 59 | + "# !pip install -q autograd==1.4 fvcore==0.1.5.post20220119 numpy ofa==0.1.0.post202203231606 pandas==1.1.5 pymoo==0.5.0 pyyaml==5.4.1 scikit-learn==0.24.2 scipy==1.5.4 torch==1.10.1 torchvision==0.11.2 sacremoses==0.0.53 torchprofile==0.0.4 fairseq==0.12.2" |
42 | 60 | ]
|
43 | 61 | },
|
44 | 62 | {
|
|
84 | 102 | "metadata": {},
|
85 | 103 | "source": [
|
86 | 104 | "### Define Architecture\n",
|
87 |
| - "We currently leverage pre-trained Once-for-All (OFA) super-networks [4] for the image classification task on ImageNet-ilsvrc2012. In the case where the super-network PyTorch model download fails, you can manually copy the pre-trained models from https://github.com/mit-han-lab/once-for-all and place them in the `.torch/ofa_nets` path. \n", |
| 105 | + "We currently support pre-trained super-networks:\n", |
| 106 | + "\n", |
| 107 | + "1. Once-for-All (OFA) super-networks [4] for the image classification task on ImageNet-ilsvrc2012. In the case where the super-network PyTorch model download fails, you can manually copy the pre-trained models from https://github.com/mit-han-lab/once-for-all and place them in the `.torch/ofa_nets` path.\n", |
| 108 | + "2. Hardware-Aware-Transformers (HAT) supernetwork [6] for language translation task on WMT14 En-De. To run this supernetwork you have to manually download preprocessed dataset from https://github.com/mit-han-lab/hardware-aware-transformers/blob/master/configs/wmt14.en-de/get_preprocessed.sh and pretrained model from https://www.dropbox.com/s/pkdddxvvpw9a4vq/HAT_wmt14ende_super_space0.pt?dl=0\n", |
88 | 109 | "\n",
|
89 | 110 | "Super-network options (choose 1): \n",
|
90 | 111 | "- `ofa_resnet50` - based on the ResNet50 architecture [4]. Search space of ~$10^{15}$ architectures.\n",
|
91 | 112 | "- `ofa_mbv3_d234_e346_k357_w1.0` - based on the MobileNetV3 architecture [5], width multiplier 1.0. Search space of ~$10^{19}$ architectures.\n",
|
92 |
| - "- `ofa_mbv3_d234_e346_k357_w1.2` - based on the MobileNetV3 architecture [5], width multiplier 1.2. Search space of ~$10^{19}$ architectures. " |
| 113 | + "- `ofa_mbv3_d234_e346_k357_w1.2` - based on the MobileNetV3 architecture [5], width multiplier 1.2. Search space of ~$10^{19}$ architectures. \n", |
| 114 | + "- `transformer_lt_wmt_en_de` - based on the Transformer architecture [7]." |
93 | 115 | ]
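As a sketch of how one of the options above is passed to the search (assuming the `NASConfig` entry point from `neural_compressor` used elsewhere in this tutorial; the exact import path may differ between releases):

```python
from neural_compressor.conf.config import NASConfig

# DyNAS approach with the NSGA-II search algorithm [2].
config = NASConfig(approach='dynas', search_algorithm='nsga2')

# Select one of the super-network options listed above.
config.dynas.supernet = 'ofa_mbv3_d234_e346_k357_w1.0'
```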
|
94 | 116 | },
|
95 | 117 | {
|
|
113 | 135 | "* `['acc', 'lat']` \n",
|
114 | 136 | "\n",
|
115 | 137 | "Description:\n",
|
116 |
| - "* `'acc'` - ImageNet Top-1 Accuracy (%)\n", |
| 138 | + "* `'acc'` - ImageNet Top-1 Accuracy (%) (for OFA supetnetworks) and Bleu (for Transformer LT)\n", |
117 | 139 | "* `'macs'` - Multiply-and-accumulates as measured from FVCore. \n",
|
118 | 140 | "* `'lat'` - Latency (inference time) measurement (ms)"
|
119 | 141 | ]
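For example, to search for the accuracy/MACs trade-off (a short sketch continuing the `config` object created above):

```python
# Multi-objective search: maximize Top-1 accuracy (or BLEU) while minimizing MACs.
config.dynas.metrics = ['acc', 'macs']
# Alternatively, trade accuracy off against measured latency:
# config.dynas.metrics = ['acc', 'lat']
```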
|
|
137 | 159 | "* `config.dynas.num_evals` - Validation measurement count, a higher count comes with greater computational cost but a higher chance of finding optimal sub-networks\n",
|
138 | 160 | "* `config.dynas.results_csv_path` - Location of the search (validation measurement) results. This file is also used to provide training data to the metric predictors. \n",
|
139 | 161 | "* `config.dynas.batch_size` - Batch size used during latency measurements.\n",
|
140 |
| - "* `config.dynas.dataset_path` - Path to the imagenet-ilsvrc2012 dataset. This can be obtained at: https://www.image-net.org/download.php" |
| 162 | + "* `config.dynas.dataset_path` - For OFA it's a path to the imagenet-ilsvrc2012 dataset. This can be obtained at: https://www.image-net.org/download.php; For Transformer LT it's a path to preprocessed WMT EnDe directory (`(...)/data/binary/wmt16_en_de`)\n", |
| 163 | + "* `config.dynas.supernet_ckpt_path` - Transformer LT only. Path to downloaded pretrained super-network (`HAT_wmt14ende_super_space0.pt` file)." |
141 | 164 | ]
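Putting the parameters together, a minimal end-to-end sketch might look as follows. The dataset path and CSV name are example placeholders, and `NAS` is assumed to be the search entry point exposed by `neural_compressor` in the version pinned above.

```python
from neural_compressor.experimental.nas import NAS

config.dynas.num_evals = 250                      # validation measurement count
config.dynas.results_csv_path = 'dynas_results.csv'
config.dynas.batch_size = 64                      # batch size for latency measurements
config.dynas.dataset_path = '/datasets/imagenet-ilsvrc2012'  # example path (OFA)
# config.dynas.supernet_ckpt_path = 'HAT_wmt14ende_super_space0.pt'  # Transformer LT only

# Launch the multi-objective sub-network search.
agent = NAS(config)
results = agent.search()
```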
|
142 | 165 | },
|
143 | 166 | {
|
|
272 | 295 | "[1] Cai, H., Gan, C., & Han, S. (2020). Once for All: Train One Network and Specialize it for Efficient Deployment. ArXiv, abs/1908.09791. \n",
|
273 | 296 | "[2] K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, \"A fast and elitist multiobjective genetic algorithm: NSGA-II,\" in IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182-197, April 2002, doi: 10.1109/4235.996017. \n",
|
274 | 297 | "[3] Cummings, D., Sarah, A., Sridhar, S.N., Szankin, M., Muñoz, J.P., & Sundaresan, S. (2022). A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities. ArXiv, abs/2205.10358. \n",
|
275 |
| - "[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. \n", |
276 |
| - "[5] Howard, A.G., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., & Adam, H. (2019). Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 1314-1324. " |
| 298 | + "[4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778. \n", |
| 299 | + "[5] Howard, A.G., Sandler, M., Chu, G., Chen, L., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., & Adam, H. (2019). Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 1314-1324. \n", |
| 300 | + "[6] Wang, H., Wu, Z., Liu, Z., Cai, H., Zhu, L., Gan, C. and Han, S., 2020. Hat: Hardware-aware transformers for efficient natural language processing. arXiv preprint arXiv:2005.14187. \n", |
| 301 | + "[7] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I., 2017. Attention is all you need. Advances in neural information processing systems, 30." |
277 | 302 | ]
|
278 | 303 | },
|
279 | 304 | {
|
|
300 | 325 | "name": "python",
|
301 | 326 | "nbconvert_exporter": "python",
|
302 | 327 | "pygments_lexer": "ipython3",
|
303 |
| - "version": "3.8.10" |
| 328 | + "version": "3.7.11" |
304 | 329 | }
|
305 | 330 | },
|
306 | 331 | "nbformat": 4,
|
|