This repository was archived by the owner on Feb 22, 2024. It is now read-only.

This pipeline shows how to run inference with the BERT Base model on Azure Machine Learning infrastructure, using make and docker compose.

Project Structure

├── azureml @ v1.0.1
├── Makefile
└── docker-compose.yml


AZURE_CONFIG_FILE ?= $$(pwd)/config.json
FP32_TRAINED_MODEL ?= $$(pwd)/../training/azureml/notebooks/fp32_model_output

	mkdir -p ./azureml/notebooks/fp32_model_output && cp -r ${FP32_TRAINED_MODEL} ./azureml/notebooks/
	docker compose up nlp-azure --build

	docker compose down
	rm -rf ./azureml/notebooks/fp32_model_output


        http_proxy: ${http_proxy}
        https_proxy: ${https_proxy}
        no_proxy: ${no_proxy}
      dockerfile: ./azureml/Dockerfile
    command: sh -c "jupyter nbconvert --to python 1.0-intel-azureml-inference.ipynb && python3"
      - http_proxy=${http_proxy}
      - https_proxy=${https_proxy}
      - no_proxy=${no_proxy}
    image: ${FINAL_IMAGE_NAME}:inference-ubuntu-20.04
    network_mode: "host"
    privileged: true
      - ./azureml/notebooks:/root/notebooks
      - ./azureml/src:/root/src
      - /${AZURE_CONFIG_FILE}:/root/notebooks/config.json
    working_dir: /root/notebooks
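
The compose fragment above forwards http_proxy, https_proxy, and no_proxy from the host shell into both the image build and the running container. Docker Compose substitutes an empty string for any of these that is unset. A minimal sketch for checking what would be forwarded before a run follows; the function name report_proxy_env is hypothetical and is not part of this repository.

```shell
#!/bin/sh
# Report which of the proxy variables referenced by docker-compose.yml
# are set in the current shell. Nothing is modified.
# NOTE: report_proxy_env is an illustrative helper, not part of the repo.
report_proxy_env() {
    for name in http_proxy https_proxy no_proxy; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is unset or empty"
        fi
    done
}
```

Calling report_proxy_env in the shell that will run make prints one line per variable, which makes a missing proxy setting visible before the image build starts.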

Azure Machine Learning

An end-to-end AI workflow that uses Azure ML cloud infrastructure to run inference with the BERT Base model. More information here. The pipeline runs the 1.0-intel-azureml-inference.ipynb notebook of the Azure ML project.

Quick Start

  • Make sure that the environment setup prerequisites are satisfied per the document here.

  • Pull and configure the dependent repo submodule: git submodule update --init --recursive

  • Install Pipeline Repository Dependencies.

  • Use the quickstart link to set up your Azure ML resources.

    • If required, create virtual networks and a NAT gateway by following this link.
  • Download the config.json file from your Azure ML Studio Workspace.

  • This pipeline requires the pre-trained FP32 model. Please run the training pipeline before running inference to get the model.

  • Other Variables:

| Variable Name | Default | Notes |
| --- | --- | --- |
| AZURE_CONFIG_FILE | $$(pwd)/config.json | Azure Workspace Configuration file |
| FINAL_IMAGE_NAME | nlp-azure | Final Docker Image Name |
| FP32_TRAINED_MODEL | $$(pwd)/../training/azureml/notebooks/fp32_model_output | FP32 model obtained from Training |
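
The two file inputs listed above (the workspace config.json and the FP32 model directory produced by training) can be checked before starting a build. The sketch below is one way to do that. The function name check_inputs is hypothetical, and the key check assumes the usual layout of a workspace config.json downloaded from Azure ML Studio (subscription_id, resource_group, workspace_name).

```shell
#!/bin/sh
# check_inputs CONFIG_FILE MODEL_DIR
# Returns 0 when both pipeline inputs look usable, 1 otherwise.
# NOTE: check_inputs is an illustrative helper, not part of the repo.
check_inputs() {
    config_file=$1
    model_dir=$2
    status=0

    if [ ! -f "$config_file" ]; then
        echo "missing config file: $config_file" >&2
        status=1
    else
        # Assumed keys of a workspace config.json from Azure ML Studio.
        for key in subscription_id resource_group workspace_name; do
            if ! grep -q "\"$key\"" "$config_file"; then
                echo "config file lacks key: $key" >&2
                status=1
            fi
        done
    fi

    if [ ! -d "$model_dir" ]; then
        echo "missing FP32 model directory: $model_dir" >&2
        status=1
    fi

    return "$status"
}
```

With the defaults from the table, a call would look like check_inputs "$(pwd)/config.json" "$(pwd)/../training/azureml/notebooks/fp32_model_output". The $$(pwd) in the table is Makefile escaping; in a plain shell it is written $(pwd).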

Build and Run

Build and run with defaults:

make nlp-azure
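
Because the Makefile assigns its variables with ?=, a value given on the make command line takes precedence over the default. The sketch below only prints such a command line and does not run make or docker. The function name nlp_azure_command and both paths are hypothetical and used for illustration only.

```shell
#!/bin/sh
# nlp_azure_command CONFIG_FILE MODEL_DIR
# Prints a make invocation that overrides both path defaults.
# NOTE: illustrative helper; paths containing spaces would need quoting.
nlp_azure_command() {
    printf 'make nlp-azure AZURE_CONFIG_FILE=%s FP32_TRAINED_MODEL=%s\n' "$1" "$2"
}

# Hypothetical absolute paths, for illustration only.
nlp_azure_command /data/azure/config.json /data/models/fp32_model_output
```

An absolute path for AZURE_CONFIG_FILE is the safer choice, since the compose file mounts it as /${AZURE_CONFIG_FILE}. The Makefile also copies the model directory into ./azureml/notebooks/, and the example log uploads from ./fp32_model_output/, so a directory with a different base name may not be picked up. That last point is an inference from the fragments shown here, not something the source states.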

Build and Run Example

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 32B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load metadata for
#3 DONE 0.7s

#4 [1/3] FROM
#4 DONE 0.0s

#5 [2/3] RUN apt-get update &&     apt-get install --no-install-recommends curl=7.68.0-1ubuntu2.13 -y &&     apt-get install --no-install-recommends python3-pip=20.0.2-5ubuntu1.6 -y &&     rm -r /var/lib/apt/lists/*

#6 [3/3] RUN pip install --no-cache-dir azureml-sdk==1.45.0 && pip install --no-cache-dir notebook==6.4.12

#7 exporting to image
#7 exporting layers done
#7 writing image sha256:b4b0d17ff3f251644447a83a133d0d41a7f42129b05739ba4d843ecced862eeb done
#7 naming to done
#7 DONE 0.0s
Attaching to inference-nlp-azure-1
inference-nlp-azure-1  | [NbConvertApp] Converting notebook 1.0-intel-azureml-inference.ipynb to python
inference-nlp-azure-1  | [NbConvertApp] Writing 9806 bytes to
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint hyperdrive = azureml.train.hyperdrive:HyperDriveRun._from_run_dto with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint automl = with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.PipelineRun = with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.ReusedStepRun = with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.StepRun = with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.scriptrun = azureml.core.script_run:ScriptRun._from_run_dto with exception (cryptography 37.0.4 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('cryptography<39,>=38.0.0'), {'pyopenssl', 'PyOpenSSL'}).
inference-nlp-azure-1  | Loaded existing workspace configuration
inference-nlp-azure-1  | Validating arguments.
inference-nlp-azure-1  | Arguments validated.
inference-nlp-azure-1  | Uploading file to /inc/ptq_config
inference-nlp-azure-1  | Uploading an estimated of 1 files
inference-nlp-azure-1  | Uploading ../src/inference_container/config/ptq.yaml
inference-nlp-azure-1  | Uploaded ../src/inference_container/config/ptq.yaml, 1 files out of an estimated total of 1
inference-nlp-azure-1  | Uploaded 1 files
inference-nlp-azure-1  | Creating new dataset
inference-nlp-azure-1  | Validating arguments.
inference-nlp-azure-1  | Arguments validated.
inference-nlp-azure-1  | Uploading file to /trained_fp32_hf_model
inference-nlp-azure-1  | Uploading an estimated of 11 files
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/training_args.bin
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/training_args.bin, 1 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/config.json
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/config.json, 2 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/training_args.bin
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/training_args.bin, 3 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/trainer_state.json
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/trainer_state.json, 4 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/, 5 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/config.json
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/config.json, 6 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/rng_state_0.pth
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/rng_state_0.pth, 7 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/rng_state_1.pth
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/rng_state_1.pth, 8 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/pytorch_model.bin
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/pytorch_model.bin, 9 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/pytorch_model.bin
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/pytorch_model.bin, 10 files out of an estimated total of 11
inference-nlp-azure-1  | Uploading ./fp32_model_output/outputs/trained_model/checkpoint-500/
inference-nlp-azure-1  | Uploaded ./fp32_model_output/outputs/trained_model/checkpoint-500/, 11 files out of an estimated total of 11
inference-nlp-azure-1  | Uploaded 11 files
inference-nlp-azure-1  | Creating new dataset
inference-nlp-azure-1  | Found existing cluster, use it.
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Running
inference-nlp-azure-1  | RunId: INC_PTQ_1666128985_788b95f3
inference-nlp-azure-1  | Web View:
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Streaming user_logs/std_log.txt
inference-nlp-azure-1  | ===============================
inference-nlp-azure-1  | 
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading builder script:   0%|          | 0.00/7.78k [00:00<?, ?B/s]
inference-nlp-azure-1  | Downloading builder script: 28.8kB [00:00, 12.3MB/s]                   
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading metadata:   0%|          | 0.00/4.47k [00:00<?, ?B/s]
inference-nlp-azure-1  | Downloading metadata: 28.7kB [00:00, 14.7MB/s]                   
inference-nlp-azure-1  | Downloading and preparing dataset glue/mrpc (download: 1.43 MiB, generated: 1.43 MiB, post-processed: Unknown size, total: 2.85 MiB) to /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad...
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data files:   0%|          | 0/3 [00:00<?, ?it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data: 0.00B [00:00, ?B/s]
inference-nlp-azure-1  | Downloading data: 6.22kB [00:00, 4.12MB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data files:  33%|███▎      | 1/3 [00:00<00:00,  2.24it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data: 0.00B [00:00, ?B/s]
inference-nlp-azure-1  | Downloading data: 1.05MB [00:00, 18.6MB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data files:  67%|██████▋   | 2/3 [00:00<00:00,  2.30it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data: 0.00B [00:00, ?B/s]
inference-nlp-azure-1  | Downloading data: 441kB [00:00, 13.3MB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading data files: 100%|██████████| 3/3 [00:01<00:00,  2.50it/s]
inference-nlp-azure-1  | Downloading data files: 100%|██████████| 3/3 [00:01<00:00,  2.43it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Generating train split:   0%|          | 0/3668 [00:00<?, ? examples/s]
inference-nlp-azure-1  | Generating train split:  38%|███▊      | 1390/3668 [00:00<00:00, 13893.19 examples/s]
inference-nlp-azure-1  | Generating train split:  78%|███████▊  | 2867/3668 [00:00<00:00, 14406.34 examples/s]
inference-nlp-azure-1  |                                                                                      
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Generating validation split:   0%|          | 0/408 [00:00<?, ? examples/s]
inference-nlp-azure-1  | Generating validation split:  96%|█████████▌| 391/408 [00:00<00:00, 3869.76 examples/s]
inference-nlp-azure-1  |                                                                                        
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Generating test split:   0%|          | 0/1725 [00:00<?, ? examples/s]
inference-nlp-azure-1  |                                                                       
inference-nlp-azure-1  | Dataset glue downloaded and prepared to /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad. Subsequent calls will reuse this data.
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/2 [00:00<?, ?it/s]
inference-nlp-azure-1  | 100%|██████████| 2/2 [00:00<00:00, 708.80it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/4 [00:00<?, ?ba/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading tokenizer_config.json:   0%|          | 0.00/28.0 [00:00<?, ?B/s]
inference-nlp-azure-1  | Downloading tokenizer_config.json: 100%|██████████| 28.0/28.0 [00:00<00:00, 23.8kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading config.json:   0%|          | 0.00/570 [00:00<?, ?B/s]
inference-nlp-azure-1  | Downloading config.json: 100%|██████████| 570/570 [00:00<00:00, 457kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading vocab.txt:   0%|          | 0.00/226k [00:00<?, ?B/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading vocab.txt:  12%|█▏        | 28.0k/226k [00:00<00:01, 195kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading vocab.txt:  69%|██████▉   | 157k/226k [00:00<00:00, 605kB/s]
inference-nlp-azure-1  | Downloading vocab.txt: 100%|██████████| 226k/226k [00:00<00:00, 775kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading tokenizer.json:   0%|          | 0.00/455k [00:00<?, ?B/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading tokenizer.json:   9%|▉         | 40.0k/455k [00:00<00:01, 277kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading tokenizer.json:  24%|██▎       | 108k/455k [00:00<00:00, 391kB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading tokenizer.json:  91%|█████████ | 412k/455k [00:00<00:00, 1.17MB/s]
inference-nlp-azure-1  | Downloading tokenizer.json: 100%|██████████| 455k/455k [00:00<00:00, 1.04MB/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  |  25%|██▌       | 1/4 [00:05<00:15,  5.31s/ba]
inference-nlp-azure-1  |  50%|█████     | 2/4 [00:08<00:07,  3.99s/ba]
inference-nlp-azure-1  |  75%|███████▌  | 3/4 [00:11<00:03,  3.54s/ba]
inference-nlp-azure-1  | 100%|██████████| 4/4 [00:14<00:00,  3.31s/ba]
inference-nlp-azure-1  | 100%|██████████| 4/4 [00:14<00:00,  3.58s/ba]
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/2 [00:00<?, ?ba/s]
inference-nlp-azure-1  |  50%|█████     | 1/2 [00:03<00:03,  3.01s/ba]
inference-nlp-azure-1  | 100%|██████████| 2/2 [00:05<00:00,  2.98s/ba]
inference-nlp-azure-1  | 100%|██████████| 2/2 [00:05<00:00,  2.98s/ba]
inference-nlp-azure-1  | 2022-10-18 21:39:50 [INFO] Created a worker pool for first use
inference-nlp-azure-1  | 2022-10-18 21:39:50 [WARNING] Reusing dataset glue (/root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/2 [00:00<?, ?it/s]
inference-nlp-azure-1  | 100%|██████████| 2/2 [00:00<00:00, 662.66it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/4 [00:00<?, ?ba/s]loading configuration file from cache at /root/.cache/huggingface/transformers/3c61d016573b14f7f008c02c4e51a366c67ab274726fe2910691e2a761acf43e.37395cee442ab11005bcd270f3c34464dc1704b715b5d7d52b1a461abe3b9e4e
inference-nlp-azure-1  | Model config BertConfig {
inference-nlp-azure-1  |   "_name_or_path": "bert-base-uncased",
inference-nlp-azure-1  |   "architectures": [
inference-nlp-azure-1  |     "BertForMaskedLM"
inference-nlp-azure-1  |   ],
inference-nlp-azure-1  |   "attention_probs_dropout_prob": 0.1,
inference-nlp-azure-1  |   "classifier_dropout": null,
inference-nlp-azure-1  |   "gradient_checkpointing": false,
inference-nlp-azure-1  |   "hidden_act": "gelu",
inference-nlp-azure-1  |   "hidden_dropout_prob": 0.1,
inference-nlp-azure-1  |   "hidden_size": 768,
inference-nlp-azure-1  |   "initializer_range": 0.02,
inference-nlp-azure-1  |   "intermediate_size": 3072,
inference-nlp-azure-1  |   "layer_norm_eps": 1e-12,
inference-nlp-azure-1  |   "max_position_embeddings": 512,
inference-nlp-azure-1  |   "model_type": "bert",
inference-nlp-azure-1  |   "num_attention_heads": 12,
inference-nlp-azure-1  |   "num_hidden_layers": 12,
inference-nlp-azure-1  |   "pad_token_id": 0,
inference-nlp-azure-1  |   "position_embedding_type": "absolute",
inference-nlp-azure-1  |   "transformers_version": "4.21.1",
inference-nlp-azure-1  |   "type_vocab_size": 2,
inference-nlp-azure-1  |   "use_cache": true,
inference-nlp-azure-1  |   "vocab_size": 30522
inference-nlp-azure-1  | }
inference-nlp-azure-1  | 100%|██████████| 4/4 [00:11<00:00,  2.93s/ba]
inference-nlp-azure-1  | 100%|██████████| 4/4 [00:11<00:00,  2.93s/ba]
inference-nlp-azure-1  | 2022-10-18 21:40:03 [INFO] Pass query framework capability elapsed time: 554.0 ms
inference-nlp-azure-1  | 2022-10-18 21:40:03 [INFO] Get FP32 model baseline.
inference-nlp-azure-1  | The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/intel_extension_for_pytorch/ UserWarning: Conv BatchNorm folding failed during the optimize process.
inference-nlp-azure-1  |   warnings.warn("Conv BatchNorm folding failed during the optimize process.")
inference-nlp-azure-1  | ***** Running Evaluation *****
inference-nlp-azure-1  |   Num examples = 1725
inference-nlp-azure-1  |   Batch size = 8
inference-nlp-azure-1  | 
inference-nlp-azure-1  |   0%|          | 0/216 [00:00<?, ?it/s]
inference-nlp-azure-1  |   1%|          | 2/216 [00:00<01:35,  2.25it/s]
inference-nlp-azure-1  |   1%|▏         | 3/216 [00:01<02:16,  1.56it/s]
inference-nlp-azure-1  |   2%|▏         | 4/216 [00:02<02:30,  1.41it/s]
inference-nlp-azure-1  |   2%|▏         | 5/216 [00:03<02:39,  1.32it/s]
inference-nlp-azure-1  |   3%|▎         | 6/216 [00:04<02:44,  1.28it/s]
inference-nlp-azure-1  |   3%|▎         | 7/216 [00:05<02:45,  1.26it/s]
inference-nlp-azure-1  |   4%|▎         | 8/216 [00:05<02:45,  1.25it/s]
inference-nlp-azure-1  |   4%|▍         | 9/216 [00:06<02:44,  1.26it/s]
inference-nlp-azure-1  |   5%|▍         | 10/216 [00:07<02:44,  1.25it/s]
inference-nlp-azure-1  |   5%|▌         | 11/216 [00:08<02:44,  1.25it/s]
inference-nlp-azure-1  |   6%|▌         | 12/216 [00:09<02:47,  1.22it/s]
inference-nlp-azure-1  |   6%|▌         | 13/216 [00:10<02:46,  1.22it/s]
inference-nlp-azure-1  |   6%|▋         | 14/216 [00:10<02:45,  1.22it/s]
inference-nlp-azure-1  |   7%|▋         | 15/216 [00:11<02:43,  1.23it/s]
inference-nlp-azure-1  |   7%|▋         | 16/216 [00:12<02:43,  1.22it/s]
inference-nlp-azure-1  |   8%|▊         | 17/216 [00:13<02:42,  1.23it/s]
inference-nlp-azure-1  |   8%|▊         | 18/216 [00:14<02:40,  1.23it/s]
inference-nlp-azure-1  |   9%|▉         | 19/216 [00:14<02:41,  1.22it/s]
inference-nlp-azure-1  |   9%|▉         | 20/216 [00:15<02:39,  1.23it/s]
inference-nlp-azure-1  |  10%|▉         | 21/216 [00:16<02:37,  1.24it/s]
inference-nlp-azure-1  |  10%|█         | 22/216 [00:17<02:37,  1.23it/s]
inference-nlp-azure-1  |  11%|█         | 23/216 [00:18<02:37,  1.23it/s]
inference-nlp-azure-1  |  11%|█         | 24/216 [00:18<02:35,  1.23it/s]
inference-nlp-azure-1  |  12%|█▏        | 25/216 [00:19<02:35,  1.23it/s]
inference-nlp-azure-1  |  12%|█▏        | 26/216 [00:20<02:34,  1.23it/s]
inference-nlp-azure-1  |  12%|█▎        | 27/216 [00:21<02:34,  1.22it/s]
inference-nlp-azure-1  |  13%|█▎        | 28/216 [00:22<02:32,  1.23it/s]
inference-nlp-azure-1  |  13%|█▎        | 29/216 [00:23<02:31,  1.23it/s]
inference-nlp-azure-1  |  14%|█▍        | 30/216 [00:23<02:32,  1.22it/s]
inference-nlp-azure-1  |  14%|█▍        | 31/216 [00:24<02:37,  1.17it/s]
inference-nlp-azure-1  |  15%|█▍        | 32/216 [00:25<02:33,  1.20it/s]
inference-nlp-azure-1  |  15%|█▌        | 33/216 [00:26<02:30,  1.21it/s]
inference-nlp-azure-1  |  16%|█▌        | 34/216 [00:27<02:27,  1.23it/s]
inference-nlp-azure-1  |  16%|█▌        | 35/216 [00:27<02:26,  1.24it/s]
inference-nlp-azure-1  |  17%|█▋        | 36/216 [00:28<02:25,  1.24it/s]
inference-nlp-azure-1  |  17%|█▋        | 37/216 [00:29<02:24,  1.24it/s]
inference-nlp-azure-1  |  18%|█▊        | 38/216 [00:30<02:27,  1.21it/s]
inference-nlp-azure-1  |  18%|█▊        | 39/216 [00:31<02:24,  1.23it/s]
inference-nlp-azure-1  |  19%|█▊        | 40/216 [00:32<02:23,  1.23it/s]
inference-nlp-azure-1  |  19%|█▉        | 41/216 [00:32<02:22,  1.23it/s]
inference-nlp-azure-1  |  19%|█▉        | 42/216 [00:33<02:21,  1.23it/s]
inference-nlp-azure-1  |  20%|█▉        | 43/216 [00:34<02:19,  1.24it/s]
inference-nlp-azure-1  |  20%|██        | 44/216 [00:35<02:18,  1.24it/s]
inference-nlp-azure-1  |  21%|██        | 45/216 [00:36<02:17,  1.24it/s]
inference-nlp-azure-1  |  21%|██▏       | 46/216 [00:36<02:18,  1.23it/s]
inference-nlp-azure-1  |  22%|██▏       | 47/216 [00:37<02:20,  1.21it/s]
inference-nlp-azure-1  |  22%|██▏       | 48/216 [00:38<02:19,  1.21it/s]
inference-nlp-azure-1  |  23%|██▎       | 49/216 [00:39<02:19,  1.19it/s]
inference-nlp-azure-1  |  23%|██▎       | 50/216 [00:40<02:16,  1.21it/s]
inference-nlp-azure-1  |  24%|██▎       | 51/216 [00:41<02:16,  1.21it/s]
inference-nlp-azure-1  |  24%|██▍       | 52/216 [00:41<02:14,  1.22it/s]
inference-nlp-azure-1  |  25%|██▍       | 53/216 [00:42<02:12,  1.23it/s]
inference-nlp-azure-1  |  25%|██▌       | 54/216 [00:43<02:11,  1.23it/s]
inference-nlp-azure-1  |  25%|██▌       | 55/216 [00:44<02:10,  1.24it/s]
inference-nlp-azure-1  |  26%|██▌       | 56/216 [00:45<02:08,  1.24it/s]
inference-nlp-azure-1  |  26%|██▋       | 57/216 [00:45<02:09,  1.23it/s]
inference-nlp-azure-1  |  27%|██▋       | 58/216 [00:46<02:08,  1.23it/s]
inference-nlp-azure-1  |  27%|██▋       | 59/216 [00:47<02:06,  1.24it/s]
inference-nlp-azure-1  |  28%|██▊       | 60/216 [00:48<02:07,  1.22it/s]
inference-nlp-azure-1  |  28%|██▊       | 61/216 [00:49<02:06,  1.23it/s]
inference-nlp-azure-1  |  29%|██▊       | 62/216 [00:49<02:04,  1.24it/s]
inference-nlp-azure-1  |  29%|██▉       | 63/216 [00:50<02:04,  1.23it/s]
inference-nlp-azure-1  |  30%|██▉       | 64/216 [00:51<02:03,  1.23it/s]
inference-nlp-azure-1  |  30%|███       | 65/216 [00:52<02:02,  1.23it/s]
inference-nlp-azure-1  |  31%|███       | 66/216 [00:53<02:00,  1.24it/s]
inference-nlp-azure-1  |  31%|███       | 67/216 [00:53<01:59,  1.25it/s]
inference-nlp-azure-1  |  31%|███▏      | 68/216 [00:54<02:02,  1.21it/s]
inference-nlp-azure-1  |  32%|███▏      | 69/216 [00:55<02:00,  1.22it/s]
inference-nlp-azure-1  |  32%|███▏      | 70/216 [00:56<01:59,  1.23it/s]
inference-nlp-azure-1  |  33%|███▎      | 71/216 [00:57<01:57,  1.24it/s]
inference-nlp-azure-1  |  33%|███▎      | 72/216 [00:58<01:56,  1.24it/s]
inference-nlp-azure-1  |  34%|███▍      | 73/216 [00:58<01:55,  1.23it/s]
inference-nlp-azure-1  |  34%|███▍      | 74/216 [00:59<01:54,  1.24it/s]
inference-nlp-azure-1  |  35%|███▍      | 75/216 [01:00<01:54,  1.24it/s]
inference-nlp-azure-1  |  35%|███▌      | 76/216 [01:01<01:54,  1.23it/s]
inference-nlp-azure-1  |  36%|███▌      | 77/216 [01:02<01:53,  1.22it/s]
inference-nlp-azure-1  |  36%|███▌      | 78/216 [01:02<01:52,  1.22it/s]
inference-nlp-azure-1  |  37%|███▋      | 79/216 [01:03<01:51,  1.23it/s]
inference-nlp-azure-1  |  37%|███▋      | 80/216 [01:04<01:50,  1.23it/s]
inference-nlp-azure-1  |  38%|███▊      | 81/216 [01:05<01:48,  1.24it/s]
inference-nlp-azure-1  |  38%|███▊      | 82/216 [01:06<01:49,  1.22it/s]
inference-nlp-azure-1  |  38%|███▊      | 83/216 [01:07<01:50,  1.20it/s]
inference-nlp-azure-1  |  39%|███▉      | 84/216 [01:07<01:48,  1.22it/s]
inference-nlp-azure-1  |  39%|███▉      | 85/216 [01:08<01:47,  1.22it/s]
inference-nlp-azure-1  |  40%|███▉      | 86/216 [01:09<01:51,  1.16it/s]
inference-nlp-azure-1  |  40%|████      | 87/216 [01:10<01:49,  1.18it/s]
inference-nlp-azure-1  |  41%|████      | 88/216 [01:11<01:46,  1.20it/s]
inference-nlp-azure-1  |  41%|████      | 89/216 [01:12<01:44,  1.21it/s]
inference-nlp-azure-1  |  42%|████▏     | 90/216 [01:12<01:43,  1.21it/s]
inference-nlp-azure-1  |  42%|████▏     | 91/216 [01:13<01:41,  1.23it/s]
inference-nlp-azure-1  |  43%|████▎     | 92/216 [01:14<01:41,  1.22it/s]
inference-nlp-azure-1  |  43%|████▎     | 93/216 [01:15<01:42,  1.20it/s]
inference-nlp-azure-1  |  44%|████▎     | 94/216 [01:16<01:40,  1.21it/s]
inference-nlp-azure-1  |  44%|████▍     | 95/216 [01:17<01:39,  1.22it/s]
inference-nlp-azure-1  |  44%|████▍     | 96/216 [01:17<01:38,  1.22it/s]
inference-nlp-azure-1  |  45%|████▍     | 97/216 [01:18<01:36,  1.23it/s]
inference-nlp-azure-1  |  45%|████▌     | 98/216 [01:19<01:35,  1.24it/s]
inference-nlp-azure-1  |  46%|████▌     | 99/216 [01:20<01:33,  1.25it/s]
inference-nlp-azure-1  |  46%|████▋     | 100/216 [01:21<01:32,  1.25it/s]
inference-nlp-azure-1  |  47%|████▋     | 101/216 [01:21<01:32,  1.24it/s]
inference-nlp-azure-1  |  47%|████▋     | 102/216 [01:22<01:32,  1.23it/s]
inference-nlp-azure-1  |  48%|████▊     | 103/216 [01:23<01:31,  1.23it/s]
inference-nlp-azure-1  |  48%|████▊     | 104/216 [01:24<01:35,  1.17it/s]
inference-nlp-azure-1  |  49%|████▊     | 105/216 [01:25<01:33,  1.19it/s]
inference-nlp-azure-1  |  49%|████▉     | 106/216 [01:26<01:31,  1.20it/s]
inference-nlp-azure-1  |  50%|████▉     | 107/216 [01:26<01:29,  1.22it/s]
inference-nlp-azure-1  |  50%|█████     | 108/216 [01:27<01:28,  1.21it/s]
inference-nlp-azure-1  |  50%|█████     | 109/216 [01:28<01:27,  1.23it/s]
inference-nlp-azure-1  |  51%|█████     | 110/216 [01:29<01:26,  1.23it/s]
inference-nlp-azure-1  |  51%|█████▏    | 111/216 [01:30<01:25,  1.23it/s]
inference-nlp-azure-1  |  52%|█████▏    | 112/216 [01:30<01:23,  1.24it/s]
inference-nlp-azure-1  |  52%|█████▏    | 113/216 [01:31<01:21,  1.26it/s]
inference-nlp-azure-1  |  53%|█████▎    | 114/216 [01:32<01:21,  1.25it/s]
inference-nlp-azure-1  |  53%|█████▎    | 115/216 [01:33<01:20,  1.26it/s]
inference-nlp-azure-1  |  54%|█████▎    | 116/216 [01:34<01:20,  1.25it/s]
inference-nlp-azure-1  |  54%|█████▍    | 117/216 [01:34<01:19,  1.25it/s]
inference-nlp-azure-1  |  55%|█████▍    | 118/216 [01:35<01:18,  1.25it/s]
inference-nlp-azure-1  |  55%|█████▌    | 119/216 [01:36<01:18,  1.23it/s]
inference-nlp-azure-1  |  56%|█████▌    | 120/216 [01:37<01:18,  1.23it/s]
inference-nlp-azure-1  |  56%|█████▌    | 121/216 [01:38<01:18,  1.21it/s]
inference-nlp-azure-1  |  56%|█████▋    | 122/216 [01:38<01:16,  1.22it/s]
inference-nlp-azure-1  |  57%|█████▋    | 123/216 [01:39<01:18,  1.19it/s]
inference-nlp-azure-1  |  57%|█████▋    | 124/216 [01:40<01:16,  1.21it/s]
inference-nlp-azure-1  |  58%|█████▊    | 125/216 [01:41<01:15,  1.21it/s]
inference-nlp-azure-1  |  58%|█████▊    | 126/216 [01:42<01:15,  1.20it/s]
inference-nlp-azure-1  |  59%|█████▉    | 127/216 [01:43<01:14,  1.19it/s]
inference-nlp-azure-1  |  59%|█████▉    | 128/216 [01:44<01:13,  1.19it/s]
inference-nlp-azure-1  |  60%|█████▉    | 129/216 [01:44<01:12,  1.21it/s]
inference-nlp-azure-1  |  60%|██████    | 130/216 [01:45<01:11,  1.20it/s]
inference-nlp-azure-1  |  61%|██████    | 131/216 [01:46<01:11,  1.19it/s]
inference-nlp-azure-1  |  61%|██████    | 132/216 [01:47<01:09,  1.22it/s]
inference-nlp-azure-1  |  62%|██████▏   | 133/216 [01:48<01:08,  1.21it/s]
inference-nlp-azure-1  |  62%|██████▏   | 134/216 [01:48<01:07,  1.22it/s]
inference-nlp-azure-1  |  62%|██████▎   | 135/216 [01:49<01:05,  1.24it/s]
inference-nlp-azure-1  |  63%|██████▎   | 136/216 [01:50<01:04,  1.24it/s]
inference-nlp-azure-1  |  63%|██████▎   | 137/216 [01:51<01:03,  1.25it/s]
inference-nlp-azure-1  |  64%|██████▍   | 138/216 [01:52<01:02,  1.25it/s]
inference-nlp-azure-1  |  64%|██████▍   | 139/216 [01:52<01:01,  1.25it/s]
inference-nlp-azure-1  |  65%|██████▍   | 140/216 [01:53<01:00,  1.26it/s]
inference-nlp-azure-1  |  65%|██████▌   | 141/216 [01:54<01:03,  1.19it/s]
inference-nlp-azure-1  |  66%|██████▌   | 142/216 [01:55<01:01,  1.20it/s]
inference-nlp-azure-1  |  66%|██████▌   | 143/216 [01:56<01:00,  1.20it/s]
inference-nlp-azure-1  |  67%|██████▋   | 144/216 [01:57<00:59,  1.21it/s]
inference-nlp-azure-1  |  67%|██████▋   | 145/216 [01:57<00:58,  1.21it/s]
inference-nlp-azure-1  |  68%|██████▊   | 146/216 [01:58<00:57,  1.21it/s]
inference-nlp-azure-1  |  68%|██████▊   | 147/216 [01:59<00:56,  1.22it/s]
inference-nlp-azure-1  |  69%|██████▊   | 148/216 [02:00<00:55,  1.22it/s]
inference-nlp-azure-1  |  69%|██████▉   | 149/216 [02:01<00:54,  1.22it/s]
inference-nlp-azure-1  |  69%|██████▉   | 150/216 [02:02<00:53,  1.23it/s]
inference-nlp-azure-1  |  70%|██████▉   | 151/216 [02:02<00:52,  1.23it/s]
inference-nlp-azure-1  |  70%|███████   | 152/216 [02:03<00:52,  1.23it/s]
inference-nlp-azure-1  |  71%|███████   | 153/216 [02:04<00:50,  1.24it/s]
inference-nlp-azure-1  |  71%|███████▏  | 154/216 [02:05<00:50,  1.24it/s]
inference-nlp-azure-1  |  72%|███████▏  | 155/216 [02:06<00:49,  1.22it/s]
inference-nlp-azure-1  |  72%|███████▏  | 156/216 [02:06<00:48,  1.23it/s]
inference-nlp-azure-1  |  73%|███████▎  | 157/216 [02:07<00:48,  1.22it/s]
inference-nlp-azure-1  |  73%|███████▎  | 158/216 [02:08<00:47,  1.22it/s]
inference-nlp-azure-1  |  74%|███████▎  | 159/216 [02:09<00:47,  1.20it/s]
inference-nlp-azure-1  |  74%|███████▍  | 160/216 [02:10<00:46,  1.20it/s]
inference-nlp-azure-1  |  75%|███████▍  | 161/216 [02:11<00:45,  1.22it/s]
inference-nlp-azure-1  |  75%|███████▌  | 162/216 [02:11<00:43,  1.23it/s]
inference-nlp-azure-1  |  75%|███████▌  | 163/216 [02:12<00:43,  1.22it/s]
inference-nlp-azure-1  |  76%|███████▌  | 164/216 [02:13<00:42,  1.21it/s]
inference-nlp-azure-1  |  76%|███████▋  | 165/216 [02:14<00:41,  1.22it/s]
inference-nlp-azure-1  |  77%|███████▋  | 166/216 [02:15<00:40,  1.22it/s]
inference-nlp-azure-1  |  77%|███████▋  | 167/216 [02:15<00:40,  1.22it/s]
inference-nlp-azure-1  |  78%|███████▊  | 168/216 [02:16<00:39,  1.21it/s]
inference-nlp-azure-1  |  78%|███████▊  | 169/216 [02:17<00:38,  1.21it/s]
inference-nlp-azure-1  |  79%|███████▊  | 170/216 [02:18<00:37,  1.22it/s]
inference-nlp-azure-1  |  79%|███████▉  | 171/216 [02:19<00:37,  1.20it/s]
inference-nlp-azure-1  |  80%|███████▉  | 172/216 [02:20<00:36,  1.21it/s]
inference-nlp-azure-1  |  80%|████████  | 173/216 [02:20<00:35,  1.21it/s]
inference-nlp-azure-1  |  81%|████████  | 174/216 [02:21<00:34,  1.23it/s]
inference-nlp-azure-1  |  81%|████████  | 175/216 [02:22<00:33,  1.24it/s]
inference-nlp-azure-1  |  81%|████████▏ | 176/216 [02:23<00:32,  1.24it/s]
inference-nlp-azure-1  |  82%|████████▏ | 177/216 [02:24<00:32,  1.19it/s]
inference-nlp-azure-1  |  82%|████████▏ | 178/216 [02:25<00:31,  1.20it/s]
inference-nlp-azure-1  |  83%|████████▎ | 179/216 [02:25<00:30,  1.22it/s]
inference-nlp-azure-1  |  83%|████████▎ | 180/216 [02:26<00:29,  1.24it/s]
inference-nlp-azure-1  |  84%|████████▍ | 181/216 [02:27<00:28,  1.24it/s]
inference-nlp-azure-1  |  84%|████████▍ | 182/216 [02:28<00:27,  1.24it/s]
inference-nlp-azure-1  |  85%|████████▍ | 183/216 [02:29<00:26,  1.24it/s]
inference-nlp-azure-1  |  85%|████████▌ | 184/216 [02:29<00:25,  1.24it/s]
inference-nlp-azure-1  |  86%|████████▌ | 185/216 [02:30<00:24,  1.25it/s]
inference-nlp-azure-1  |  86%|████████▌ | 186/216 [02:31<00:24,  1.24it/s]
inference-nlp-azure-1  |  87%|████████▋ | 187/216 [02:32<00:23,  1.25it/s]
inference-nlp-azure-1  |  87%|████████▋ | 188/216 [02:32<00:22,  1.25it/s]
inference-nlp-azure-1  |  88%|████████▊ | 189/216 [02:33<00:21,  1.24it/s]
inference-nlp-azure-1  |  88%|████████▊ | 190/216 [02:34<00:21,  1.24it/s]
inference-nlp-azure-1  |  88%|████████▊ | 191/216 [02:35<00:20,  1.23it/s]
inference-nlp-azure-1  |  89%|████████▉ | 192/216 [02:36<00:19,  1.24it/s]
inference-nlp-azure-1  |  89%|████████▉ | 193/216 [02:37<00:18,  1.23it/s]
inference-nlp-azure-1  |  90%|████████▉ | 194/216 [02:37<00:17,  1.24it/s]
inference-nlp-azure-1  |  90%|█████████ | 195/216 [02:38<00:16,  1.24it/s]
inference-nlp-azure-1  |  91%|█████████ | 196/216 [02:39<00:16,  1.21it/s]
inference-nlp-azure-1  |  91%|█████████ | 197/216 [02:40<00:15,  1.21it/s]
inference-nlp-azure-1  |  92%|█████████▏| 198/216 [02:41<00:14,  1.20it/s]
inference-nlp-azure-1  |  92%|█████████▏| 199/216 [02:42<00:13,  1.22it/s]
inference-nlp-azure-1  |  93%|█████████▎| 200/216 [02:42<00:13,  1.22it/s]
inference-nlp-azure-1  |  93%|█████████▎| 201/216 [02:43<00:12,  1.23it/s]
inference-nlp-azure-1  |  94%|█████████▎| 202/216 [02:44<00:11,  1.23it/s]
inference-nlp-azure-1  |  94%|█████████▍| 203/216 [02:45<00:10,  1.21it/s]
inference-nlp-azure-1  |  94%|█████████▍| 204/216 [02:46<00:09,  1.20it/s]
inference-nlp-azure-1  |  95%|█████████▍| 205/216 [02:46<00:09,  1.21it/s]
inference-nlp-azure-1  |  95%|█████████▌| 206/216 [02:47<00:08,  1.22it/s]
inference-nlp-azure-1  |  96%|█████████▌| 207/216 [02:48<00:07,  1.23it/s]
inference-nlp-azure-1  |  96%|█████████▋| 208/216 [02:49<00:06,  1.24it/s]
inference-nlp-azure-1  |  97%|█████████▋| 209/216 [02:50<00:05,  1.24it/s]
inference-nlp-azure-1  |  97%|█████████▋| 210/216 [02:50<00:04,  1.24it/s]
inference-nlp-azure-1  |  98%|█████████▊| 211/216 [02:51<00:03,  1.25it/s]
inference-nlp-azure-1  |  98%|█████████▊| 212/216 [02:52<00:03,  1.25it/s]
inference-nlp-azure-1  |  99%|█████████▊| 213/216 [02:53<00:02,  1.25it/s]
inference-nlp-azure-1  |  99%|█████████▉| 214/216 [02:54<00:01,  1.23it/s]
inference-nlp-azure-1  | 100%|█████████▉| 215/216 [02:55<00:00,  1.21it/s]
inference-nlp-azure-1  | 100%|██████████| 216/216 [02:55<00:00,  1.36it/s]
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Downloading builder script:   0%|          | 0.00/1.65k [00:00<?, ?B/s]
inference-nlp-azure-1  | Downloading builder script: 4.21kB [00:00, 3.90MB/s]                   
inference-nlp-azure-1  | 
inference-nlp-azure-1  | 100%|██████████| 216/216 [02:55<00:00,  1.23it/s]
inference-nlp-azure-1  | 2022-10-18 21:43:00 [INFO] Save tuning history to /mnt/azureml/cr/j/e4712a572fab403692800d480981321b/exe/wd/nc_workspace/2022-10-18_21-39-23/./history.snapshot.
inference-nlp-azure-1  | 2022-10-18 21:43:00 [INFO] FP32 baseline is: [Accuracy: 0.8394, Duration (seconds): 177.0837]
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/torch/ao/quantization/ UserWarning: QConfigDynamic is going to be deprecated in PyTorch 1.12, please use QConfig instead
inference-nlp-azure-1  |   warnings.warn("QConfigDynamic is going to be deprecated in PyTorch 1.12, please use QConfig instead")
inference-nlp-azure-1  | 2022-10-18 21:43:00 [INFO] Fx trace of the entire model failed, We will conduct auto quantization
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/torch/ao/quantization/ UserWarning: Please use quant_min and quant_max to specify the range for observers.                     reduce_range will be deprecated in a future release of PyTorch.
inference-nlp-azure-1  |   warnings.warn(
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/torch/nn/quantized/_reference/modules/ UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
inference-nlp-azure-1  |   torch.tensor(weight_qparams["scale"], dtype=torch.float, device=device))
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/torch/nn/quantized/_reference/modules/ UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
inference-nlp-azure-1  |   torch.tensor(weight_qparams["zero_point"], dtype=zero_point_dtype, device=device))
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |*********Mixed Precision Statistics********|
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] +---------------------+-------+------+------+
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |       Op Type       | Total | INT8 | FP32 |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] +---------------------+-------+------+------+
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |      Embedding      |   3   |  3   |  0   |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |      LayerNorm      |   25  |  0   |  25  |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] | quantize_per_tensor |   74  |  74  |  0   |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |        Linear       |   74  |  74  |  0   |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |      dequantize     |   74  |  74  |  0   |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |     input_tensor    |   24  |  24  |  0   |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] |       Dropout       |   24  |  0   |  24  |
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] +---------------------+-------+------+------+
inference-nlp-azure-1  | 2022-10-18 21:43:30 [INFO] Pass quantize model elapsed time: 30514.29 ms
inference-nlp-azure-1  | The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: sentence2, idx, sentence1. If sentence2, idx, sentence1 are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
inference-nlp-azure-1  | /opt/miniconda/lib/python3.8/site-packages/intel_extension_for_pytorch/ UserWarning: Conv BatchNorm folding failed during the optimize process.
inference-nlp-azure-1  |   warnings.warn("Conv BatchNorm folding failed during the optimize process.")


inference-nlp-azure-1  | tokenizer config file saved in ./outputs/tokenizer_config.json
inference-nlp-azure-1  | Special tokens file saved in ./outputs/special_tokens_map.json
inference-nlp-azure-1  | Configuration saved in ./outputs/config.json
inference-nlp-azure-1  | Convertion complete!
inference-nlp-azure-1  | Cleaning up all outstanding Run operations, waiting 300.0 seconds
inference-nlp-azure-1  | 1 items cleaning up...
inference-nlp-azure-1  | Cleanup took 0.050061702728271484 seconds
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Execution Summary
inference-nlp-azure-1  | =================
inference-nlp-azure-1  | RunId: INC_PTQ_1666128985_788b95f3
inference-nlp-azure-1  | Web View:
inference-nlp-azure-1  | 
inference-nlp-azure-1  | Registering model inc_ptq_bert_model_mrpc
inference-nlp-azure-1  | Found existing cluster, use it.
inference-nlp-azure-1  | Service hf-aks-1
inference-nlp-azure-1  | Tips: You can try get_logs(): or local deployment: to debug if deployment takes longer than 10 minutes.
inference-nlp-azure-1  | Running
inference-nlp-azure-1  | 2022-10-19 02:44:14+00:00 Creating Container Registry if not exists.
inference-nlp-azure-1  | 2022-10-19 02:44:14+00:00 Registering the environment.
inference-nlp-azure-1  | 2022-10-19 02:44:15+00:00 Use the existing image.
inference-nlp-azure-1  | 2022-10-19 02:44:17+00:00 Creating resources in AKS.
inference-nlp-azure-1  | 2022-10-19 02:44:18+00:00 Submitting deployment to compute.
inference-nlp-azure-1  | 2022-10-19 02:44:18+00:00 Checking the status of deployment hf-aks-1..
inference-nlp-azure-1  | 2022-10-19 02:45:01+00:00 Checking the status of inference endpoint hf-aks-1.
inference-nlp-azure-1  | Succeeded
inference-nlp-azure-1  | AKS service creation operation finished, operation "Succeeded"
inference-nlp-azure-1  | Healthy
inference-nlp-azure-1  | {'result': '0', 'sentence1': 'Shares of Genentech, a much larger company with several products on the market, rose more than 2 percent.', 'sentence2': 'Shares of Xoma fell 16 percent in early trade, while shares of Genentech, a much larger company with several products on the market, were up 2 percent.', 'logits': 'tensor([[ 2.3388, -2.3361]], grad_fn=<AddmmBackward0>)', 'probability': 'tensor([0.9908, 0.0092], grad_fn=<SoftmaxBackward0>)', 'input_data': "{'input_ids': tensor([[  101,  6661,  1997,  4962, 10111,  2818,  1010,  1037,  2172,  3469,\n          2194,  2007,  2195,  3688,  2006,  1996,  3006,  1010,  3123,  2062,\n          2084,  1016,  3867,  1012,   102,  6661,  1997,  1060,  9626,  3062,\n          2385,  3867,  1999,  2220,  3119,  1010,  2096,  6661,  1997,  4962,\n         10111,  2818,  1010,  1037,  2172,  3469,  2194,  2007,  2195,  3688,\n          2006,  1996,  3006,  1010,  2020,  2039,  1016,  3867,  1012,   102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,\n         0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\n         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}", 'model_path': '/var/azureml-app/azureml-models/inc_ptq_bert_model_mrpc/2/outputs'}
inference-nlp-azure-1  | Classification result: 0
inference-nlp-azure-1 exited with code 0
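
The final response above reports both `logits` and `probability` for the test sentence pair. As a quick sanity check, the sketch below recomputes the probabilities from the logged logits with a plain softmax. It assumes the endpoint applies a standard softmax over the two class logits, which the `SoftmaxBackward0` tag in the log suggests; the `softmax` helper here is illustrative and not part of the pipeline code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the 'logits' field of the endpoint response in the log.
logits = [2.3388, -2.3361]

probs = softmax(logits)
# Index of the highest probability, which the log reports as 'result'.
predicted = max(range(len(probs)), key=probs.__getitem__)

print([round(p, 4) for p in probs], predicted)
# prints: [0.9908, 0.0092] 0
```

The recomputed values agree with the logged `probability` tensor, and the index of the largest probability matches the logged `Classification result: 0`.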