This repository was archived by the owner on Aug 28, 2023. It is now read-only.

[82639] Add HF Hub model download for autogenerated notebooks #52

Open · wants to merge 11 commits into base: master
Fixes after review

- Change model original framework to PyTorch
- Add model source to intro cell table
- Expand transformers.onnx CLI tool description to align it with other tools
- Delete Inference Engine mentions from autogenerated cells
apaniukov committed Apr 1, 2022
commit 3d4e623b0e156f38eac9782a57d4f4b17a872de9
1 change: 1 addition & 0 deletions automation/bom/image_BOM.txt
@@ -508,6 +508,7 @@ wb/main/jupyter_notebooks/cell_templates/tokenize_dataset_docs_cell.jinja
wb/main/jupyter_notebooks/cell_templates/tokenizer_parameters_code_cell.jinja
wb/main/jupyter_notebooks/cell_templates/transformers_onnx_converter_code_cell.jinja
wb/main/jupyter_notebooks/cell_templates/transformers_onnx_converter_docs_cell.jinja
wb/main/jupyter_notebooks/cell_templates/transformers_onnx_converter_result_docs_cell.jinja
wb/main/jupyter_notebooks/cell_templates/validate_ir_model_code_cell.jinja
wb/main/jupyter_notebooks/cell_templates/validate_ir_model_docs_cell.jinja
wb/main/jupyter_notebooks/cli_tools_options.py
10 changes: 10 additions & 0 deletions wb/main/enumerates.py
@@ -285,6 +285,16 @@ class ModelSourceEnum(enum.Enum):
ir = 'ir'
huggingface = 'huggingface'

def get_name(self) -> str:
if self.value == "omz":
return "OMZ"
if self.value == "original":
return "Original"
if self.value == "ir":
return "IR"
if self.value == "huggingface":
return "Hugging Face Hub"


class TargetOSEnum(enum.Enum):
ubuntu18 = 'ubuntu18'
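A minimal usage sketch of the new helper (the import path follows this repository's `wb/main/enumerates.py`; the expected strings are the ones returned in the diff above):

```python
# Minimal usage sketch for ModelSourceEnum.get_name().
from wb.main.enumerates import ModelSourceEnum

# get_name() maps the stored enum value to the human-readable label that the
# notebook templates render, e.g. in the intro cell table's "Source" column.
assert ModelSourceEnum.huggingface.get_name() == 'Hugging Face Hub'
assert ModelSourceEnum.original.get_name() == 'Original'
```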
5 changes: 5 additions & 0 deletions wb/main/jupyter_notebooks/cell_template_contexts.py
@@ -31,6 +31,7 @@ class IntroCellTemplateContext(TypedDict):
project_model_task_type: str
project_model_framework: str
project_model_precisions: str
project_model_source: str
has_tokenizer_section: bool
has_accuracy_checker_section: bool
has_int8_calibration_section: bool
@@ -79,6 +80,10 @@ class ProfilingCodeCellTemplateContext(PythonToolCodeCellTemplateContext):
has_tokenizer_section: bool


class ProfilingDocsCellTemplateContext(PythonToolCodeCellTemplateContext):
is_nlp: bool


class TokenizerParametersTemplateContext(TypedDict):
tokenizer_path: Optional[str]
dataset_path: str
Original file line number Diff line number Diff line change
@@ -4,9 +4,9 @@

The purpose of this tutorial is to guide you through the stages of working with a model to optimize it and prepare for production using OpenVINO toolkit. The model used in the tutorial was imported from the DL Workbench project and has the following characteristics:

| Model Name | Domain | Task Type | Framework | Precisions |
| :---: | :---: | :---: | :---: | :---: |
| {{ project_model_name }} | {{ project_model_domain }} | {{ project_model_task_type | replace('_', ' ') | title }} | {{ SupportedFrameworksEnum.get_name(project_model_framework) }} | {{ project_model_precisions }} |
| Model Name | Domain | Task Type | Framework | Precisions | Source |
| :---: | :---: | :---: | :---: | :---: | :---: |
| {{ project_model_name }} | {{ project_model_domain }} | {{ project_model_task_type | replace('_', ' ') | title }} | {{ SupportedFrameworksEnum.get_name(project_model_framework) }} | {{ project_model_precisions }} | {{ project_model_source }} |

{% if project_model_task_type == TaskEnum.object_detection.value -%}
This model is trained to solve Object Detection task. The goal of Object Detection is to recognize instances of object classes (for example: people, cars, animals) and describe the locations of each detected object in the image using a bounding box.
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@
`--silent` | Prevent any output messages except those that correspond to log level equals ERROR, that can be set with the following option: --log_level. By default, log level is already ERROR.
`--freeze_placeholder_with_value` | Replaces input layer with constant node with provided value, for example: "node_name->True". It will be DEPRECATED in future releases. Use --input option to specify a value for freezing.
`--generate_deprecated_IR_V7` | Force to generate deprecated IR V7 with layers from old IR specification.
`--static_shape` | Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-calculating sub-graphs to `Constant`). Changing model input shape using the Inference Engine API in runtime may fail for such an IR.
`--static_shape` | Enables IR generation for fixed input shape (folding `ShapeOf` operations and shape-calculating sub-graphs to `Constant`). Changing model input shape using the OpenVINO API in runtime may fail for such an IR.
`--keep_shape_ops` | The option is ignored. Expected behavior is enabled by default.
`--disable_weights_compression` | Disable compression and store weights with original precision.
`--progress` | Enable model conversion progress display.
Original file line number Diff line number Diff line change
@@ -22,7 +22,12 @@ Open Model Zoo
User provided model
{% endif %}

- Model Framework: {{ SupportedFrameworksEnum.get_name(project_model_framework) }}
- Model Framework:
{% if project_model_source == ModelSourceEnum.huggingface.value %}
{{ SupportedFrameworksEnum.get_name("pytorch") }}
{% else %}
{{ SupportedFrameworksEnum.get_name(project_model_framework) }}
{% endif %}

- Steps to obtain IR:
{% if project_model_source == ModelSourceEnum.ir.value %}
@@ -34,7 +39,7 @@ No conversion to IR required. Download the model with the Model Downloader and p
Download the model with Model Downloader and then convert it to IR format with Model Converter.
{% endif %}
{% elif project_model_source == ModelSourceEnum.huggingface.value %}
Your original model is PyTorch format. Use `transformers.onnx` CLI tool to convert it to ONNX, than convert the model to the IR format with Model Optimizer.
Your original model is in the PyTorch format. Use the `transformers.onnx` CLI tool to convert it to ONNX, then convert the model to the IR format with Model Optimizer.
{% elif project_model_source == ModelSourceEnum.original.value %}
Your original model is in one of the supported frameworks. Convert model to IR format with Model Optimizer.
{% endif %}
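For the Open Model Zoo path described above, a minimal sketch of the download-then-convert steps could look like this (it assumes the `omz_downloader` and `omz_converter` entry points from the `openvino-dev` package are on `PATH`; the model name and directories are placeholders):

```python
# Minimal sketch of the Open Model Zoo path: download a model, then convert it to IR.
# Assumes omz_downloader/omz_converter from the openvino-dev package are installed.
import subprocess

model_name = "resnet-50-tf"   # placeholder model name
download_dir = "omz_models"   # placeholder download directory

# Step 1: download the model files with Model Downloader.
subprocess.run(
    ["omz_downloader", "--name", model_name, "--output_dir", download_dir],
    check=True,
)

# Step 2: convert the downloaded model to OpenVINO IR with Model Converter.
subprocess.run(
    ["omz_converter", "--name", model_name, "--download_dir", download_dir],
    check=True,
)
```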
Original file line number Diff line number Diff line change
@@ -2,11 +2,16 @@

### Motivation

Model performance is the amount of information that your model can process per unit of time. In Computer Vision model performance defines how fast your model can process a number of images and generate the desired output. Usually it is measured in Frames Per Second (FPS).
Model performance is the amount of information that your model can process per unit of time.
{% if is_nlp %}In NLP, model performance defines how fast your model can process a number of text samples and generate the desired output. Usually it is measured in Samples Per Second (SPS).
{% else %}
In Computer Vision, model performance defines how fast your model can process a number of images and generate the desired output. Usually it is measured in Frames Per Second (FPS).
{% endif %}
OpenVINO uses the term Inference to denote the stage of a single network execution.
Inference is the stage in which a trained model is used to infer/predict the testing samples and comprises of a similar forward pass as training to predict the values.

In OpenVINO toolkit inference is performed by Benchmark Tool.
{% if is_nlp %} Note that Benchmark Tool was initially developed for the Computer Vision (CV) use case and reports inference results in Frames Per Second (FPS in CV = SPS in NLP). {% endif %}
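To make the FPS/SPS relationship concrete, a minimal sketch of launching Benchmark Tool on an IR model could look like this (it assumes the `benchmark_app` entry point from the OpenVINO developer tools is on `PATH` and an IR file named `model.xml`; for an NLP model the reported FPS value is read as SPS):

```python
# Minimal sketch: measure throughput of an IR model with Benchmark Tool.
# For NLP models, the FPS number that benchmark_app prints is interpreted as
# samples per second (SPS).
import subprocess

subprocess.run(
    ["benchmark_app",
     "-m", "model.xml",  # path to the IR .xml file (placeholder name)
     "-d", "CPU",        # target device
     "-t", "15"],        # benchmark duration in seconds
    check=True,
)
```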

### OpenVINO Tool: Benchmark Tool

wb/main/jupyter_notebooks/cell_templates/transformers_onnx_converter_docs_cell.jinja
@@ -1,9 +1,32 @@
### Get ONNX model from Hugging Face Hub

To obtain an ONNX model use [transformers.onnx](https://huggingface.co/docs/transformers/main/en/serialization#onnx) CLI tool from the `transformers` library.
#### Motivation

It will execute this steps:
1. Download the model files and tokenizer files from the Hugging Face Hub
1. Generate the dummy input with the tokenizer and pass it to the model to trace the model execution graph
1. Use the execution graph to generate ONNX model
1. Check the that the result model outputs is close to the original one
Most of the models on the Hugging Face Hub are stored in the PyTorch format.
To get an Intermediate Representation (IR), the preferred model format for working with OpenVINO, the model should first be converted to ONNX.
One can do this with the `transformers.onnx` CLI tool from the Transformers library, which is external to OpenVINO.

#### Main usage

The `transformers.onnx` tool takes the name of a model repository on the Hugging Face Hub and the task that the model should solve.
Then it downloads all necessary files, converts the model to the ONNX format, and checks the resulting model.

#### Description

`transformers.onnx` will execute the following steps:

1. Download the model files and tokenizer files from the Hugging Face Hub.
1. Generate the dummy input with the tokenizer and pass it to the model to trace the model execution graph.
1. Use the execution graph to generate the ONNX model.
1. Check that the resulting model output is close to the original model output.

To learn more about this CLI tool, read the [documentation](https://huggingface.co/docs/transformers/main/en/serialization#onnx).
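For example, a typical export could look like the following sketch (the repository name and task are placeholders; `--model`, `--feature`, and the positional output directory are the documented `transformers.onnx` arguments):

```python
# Minimal sketch: export a Hugging Face Hub model to ONNX with transformers.onnx.
# The repository name and task below are placeholders, not values from this PR.
import subprocess

subprocess.run(
    ["python", "-m", "transformers.onnx",
     "--model=distilbert-base-uncased",    # Hub repository to download and export
     "--feature=sequence-classification",  # task the exported model should solve
     "onnx/"],                             # output directory for the ONNX model
    check=True,
)
```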

#### Used Command-Line Arguments

<details>
<summary>View transformers.onnx command-line arguments</summary>

{{ CLIToolEnum.transformers_onnx.format_to_markdown_table() | safe }}

</details>
wb/main/jupyter_notebooks/cell_templates/transformers_onnx_converter_result_docs_cell.jinja
@@ -0,0 +1,2 @@
As a result, we have converted the PyTorch model to the ONNX format with the `transformers.onnx` tool.
You can find the model in the `onnx` directory.
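As a quick sanity check, the exported ONNX file can be read directly with the OpenVINO Python API before converting it to IR (a minimal sketch; the `onnx/model.onnx` file name is an assumption based on the output directory used above):

```python
# Minimal sanity-check sketch: read the exported ONNX model with OpenVINO.
# The file name is an assumption; adjust it to the actual exporter output.
from openvino.runtime import Core

core = Core()
onnx_model = core.read_model(model="onnx/model.onnx")
print(f"ONNX model {onnx_model.friendly_name} was read successfully.")
```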
wb/main/jupyter_notebooks/cell_templates/validate_ir_model_code_cell.jinja
@@ -3,9 +3,9 @@

from openvino.runtime import Core

# Create an Inference Engine instance
# Create an OpenVINO Core instance
core = Core()

# Read the network from IR files
model = core.read_model(model=model_xml_file_path, weights=model_bin_file_path)
print(f'Model {model.friendly_name} was successfully loaded to Inference Engine.')
print(f'Model {model.friendly_name} was successfully loaded to OpenVINO.')
wb/main/jupyter_notebooks/cell_templates/validate_ir_model_docs_cell.jinja
@@ -1 +1 @@
Let's check that your model is a valid OpenVINO IR file. To do that, we use OpenVINO Inference Engine Python* API. Refer to the [documentation](https://docs.openvino.ai/latest/openvino_inference_engine_ie_bridges_python_docs_api_overview.html) for more details.
Let's check that your model is a valid OpenVINO IR file. To do that, we use OpenVINO Python* API. Refer to the [documentation](https://docs.openvino.ai/latest/api/ie_python_api/api.html) for more details.
3 changes: 3 additions & 0 deletions wb/main/jupyter_notebooks/cli_tools_options.py
@@ -151,6 +151,9 @@ class CLIToolEnum(enum.Enum):
pot = CLITool(path='pot',
displayed_options={'-c', '--output-dir', '--direct-dump'})

transformers_onnx = CLITool(path='python -m transformers.onnx',
displayed_options=set())

def format_to_markdown_table(self) -> str:
return CLIToolHelpToMarkdownTableFormatter.format(cli_tool=self)

7 changes: 7 additions & 0 deletions wb/main/jupyter_notebooks/jupyter_notebook_cell.py
@@ -67,6 +67,7 @@ class NotebookCellIds(enum.Enum):
tokenize_dataset_code = 'tokenize_dataset_code'
tokenizer_parameters_code = 'tokenizer_parameters_code'
transformers_onnx_converter_docs = 'transformers_onnx_converter_docs'
transformers_onnx_converter_result_docs = 'transformers_onnx_converter_result_docs'
transformers_onnx_converter_code = 'transformers_onnx_converter_code'


@@ -152,6 +153,12 @@ class NotebookCells:
template_filename='transformers_onnx_converter_docs_cell.jinja'
)

transformers_onnx_converter_result_docs = NotebookCellConfig(
cell_id=NotebookCellIds.transformers_onnx_converter_result_docs,
cell_type=NotebookCellTypes.markdown,
template_filename='transformers_onnx_converter_result_docs_cell.jinja'
)

transformers_onnx_converter_code = NotebookCellConfig(
cell_id=NotebookCellIds.transformers_onnx_converter_code,
cell_type=NotebookCellTypes.code,
1 change: 1 addition & 0 deletions wb/main/jupyter_notebooks/notebook_template_creator.py
@@ -103,6 +103,7 @@ def _obtain_model_section_cells(self) -> List[NotebookCellConfig]:
NotebookCells.obtain_model_docs,
NotebookCells.transformers_onnx_converter_docs,
NotebookCells.transformers_onnx_converter_code,
NotebookCells.transformers_onnx_converter_result_docs,
NotebookCells.model_optimizer_docs,
NotebookCells.model_optimizer_code,
NotebookCells.model_optimizer_result_docs,
16 changes: 13 additions & 3 deletions wb/main/models/jupyter_notebook_model.py
@@ -26,14 +26,14 @@
from config.constants import JUPYTER_NOTEBOOKS_FOLDER, ESSENTIAL_DATA_FOLDER
from wb.main.console_tool_wrapper.model_optimizer.tool import ModelOptimizerTool
from wb.main.enumerates import ModelPrecisionEnum, JobTypesEnum, StatusEnum, OptimizationTypesEnum, ModelDomainEnum, \
ModelShapeTypeEnum
ModelShapeTypeEnum, SupportedFrameworksEnum, ModelSourceEnum
from wb.main.jupyter_notebooks.cell_template_contexts import IntroCellTemplateContext, \
SetIRModelPathsCodeCellTemplateContext, ProfilingCodeCellTemplateContext, AccuracyDocsCellTemplateContext, \
AccuracyCodeCellTemplateContext, Int8OptimizationCodeCellTemplateContext, Int8OptimizationDocsCellTemplateContext, \
ObtainModelDocsCellTemplateContext, ModelDownloaderCodeCellTemplateContext, \
CheckModelFormatDocsCellTemplateContext, ModelConverterCodeCellTemplateContext, \
ModelOptimizerCodeCellTemplateContext, InstallRequirementsCodeCellTemplateContext, \
TokenizerParametersTemplateContext, TransformersONNXCodeCellTemplateContext
TokenizerParametersTemplateContext, TransformersONNXCodeCellTemplateContext, ProfilingDocsCellTemplateContext
from wb.main.jupyter_notebooks.cli_tools_options import CLIToolEnum
from wb.main.jupyter_notebooks.config_file_dumpers import AccuracyConfigFileDumper, Int8OptimizationConfigFileDumper
from wb.main.jupyter_notebooks.jupyter_notebook_cell import NotebookCellIds
@@ -175,8 +175,12 @@ def _intro_cell_template_context(self) -> IntroCellTemplateContext:
topology: 'TopologiesModel' = original_project.topology
topology_json: dict = topology.json()
model_task_type = topology_json.get('accuracyConfiguration', {}).get('taskType')
project_model_framework = topology.original_model_framework.value
model_precisions = topology.get_precisions()
model_source = topology.source.get_name() if topology.source else None
if topology.source is ModelSourceEnum.huggingface:
project_model_framework = SupportedFrameworksEnum.pytorch.value
else:
project_model_framework = topology.original_model_framework.value
mo_params = topology_json.get('analysis', {}).get('moParams', {})
topology_analysis_precision = ModelPrecisionEnum.fp16.value
if mo_params:
@@ -191,6 +195,7 @@ def _intro_cell_template_context(self) -> IntroCellTemplateContext:
project_model_task_type=model_task_type,
project_model_framework=project_model_framework,
project_model_precisions=model_precisions,
project_model_source=model_source,
has_tokenizer_section=self._has_tokenizer_section,
has_accuracy_checker_section=self._has_accuracy_checker_section,
has_int8_calibration_section=self._has_int8_calibration_section,
@@ -329,6 +334,10 @@ def _profiling_code_cell_template_context(self) -> ProfilingCodeCellTemplateCont
has_tokenizer_section=self._has_tokenizer_section or self.project.topology.domain is ModelDomainEnum.CV,
)

@property
def _profiling_docs_cell_template_context(self) -> ProfilingDocsCellTemplateContext:
return ProfilingDocsCellTemplateContext(is_nlp=self.project.topology.domain is ModelDomainEnum.NLP)

def _get_input_file_mapping_for_profiling(self, batch: int, streams: int) -> str:
input_names = [input_['name'] for input_ in self.project.topology.meta.layout_configuration]
number_of_samples = min(batch * streams, self.project.dataset.number_images)
@@ -462,6 +471,7 @@ def _transformers_onnx_template_context(self) -> TransformersONNXCodeCellTemplat
NotebookCellIds.set_optimized_ir_model_paths_docs: _set_optimized_ir_model_paths_docs_cell_template_context,
NotebookCellIds.set_optimized_ir_model_paths_code: _set_optimized_ir_model_paths_code_cell_template_context,
NotebookCellIds.profiling_code: _profiling_code_cell_template_context,
NotebookCellIds.profiling_docs: _profiling_docs_cell_template_context,
NotebookCellIds.accuracy_docs: _accuracy_docs_cell_template_context,
NotebookCellIds.check_accuracy_config_code: _accuracy_docs_cell_template_context,
NotebookCellIds.accuracy_code: _accuracy_code_cell_template_context,