as-suvorov
diff --git a/models/intel/age-gender-recognition-retail-0013/description/age-gender-recognition-retail-0013.md b/models/intel/age-gender-recognition-retail-0013/description/age-gender-recognition-retail-0013.md
+3 -3
diff --git a/models/intel/asl-recognition-0004/description/asl-recognition-0004.md b/models/intel/asl-recognition-0004/description/asl-recognition-0004.md
+1 -1
diff --git a/models/intel/emotions-recognition-retail-0003/description/emotions-recognition-retail-0003.md b/models/intel/emotions-recognition-retail-0003/description/emotions-recognition-retail-0003.md
+1 -1
diff --git a/models/intel/emotions-recognition-retail-0003/emotions-recognition-retail-0003.prototxt b/models/intel/emotions-recognition-retail-0003/emotions-recognition-retail-0003.prototxt
+1 -1
diff --git a/models/intel/face-detection-0100/description/face-detection-0100.md b/models/intel/face-detection-0100/description/face-detection-0100.md
+1 -1
diff --git a/models/intel/face-detection-0102/description/face-detection-0102.md b/models/intel/face-detection-0102/description/face-detection-0102.md
+1 -1
diff --git a/models/intel/face-detection-0104/description/face-detection-0104.md b/models/intel/face-detection-0104/description/face-detection-0104.md
+1 -1
diff --git a/models/intel/face-detection-0105/description/face-detection-0105.md b/models/intel/face-detection-0105/description/face-detection-0105.md
+1 -1
diff --git a/models/intel/face-detection-0106/description/face-detection-0106.md b/models/intel/face-detection-0106/description/face-detection-0106.md
+9 -9
diff --git a/models/intel/face-detection-adas-0001/description/face-detection-adas-0001.md b/models/intel/face-detection-adas-0001/description/face-detection-adas-0001.md
+1 -1
diff --git a/models/intel/face-detection-adas-binary-0001/description/face-detection-adas-binary-0001.md b/models/intel/face-detection-adas-binary-0001/description/face-detection-adas-binary-0001.md
+1 -1
diff --git a/models/intel/face-detection-retail-0004/description/face-detection-retail-0004.md b/models/intel/face-detection-retail-0004/description/face-detection-retail-0004.md
+1 -1
diff --git a/models/intel/face-detection-retail-0004/face-detection-retail-0004.prototxt b/models/intel/face-detection-retail-0004/face-detection-retail-0004.prototxt
+1 -1
diff --git a/models/intel/face-detection-retail-0005/description/face-detection-retail-0005.md b/models/intel/face-detection-retail-0005/description/face-detection-retail-0005.md
+1 -1
diff --git a/models/intel/face-reidentification-retail-0095/description/face-reidentification-retail-0095.md b/models/intel/face-reidentification-retail-0095/description/face-reidentification-retail-0095.md
+8 -8
diff --git a/models/intel/faster-rcnn-resnet101-coco-sparse-60-0001/description/faster-rcnn-resnet101-coco-sparse-60-0001.md b/models/intel/faster-rcnn-resnet101-coco-sparse-60-0001/description/faster-rcnn-resnet101-coco-sparse-60-0001.md
+16 -16
diff --git a/models/intel/handwritten-japanese-recognition-0001/description/handwritten-japanese-recognition-0001.md b/models/intel/handwritten-japanese-recognition-0001/description/handwritten-japanese-recognition-0001.md
+5 -5
diff --git a/models/intel/human-pose-estimation-0001/description/human-pose-estimation-0001.md b/models/intel/human-pose-estimation-0001/description/human-pose-estimation-0001.md
+7 -7
diff --git a/models/intel/human-pose-estimation-0001/human-pose-estimation-0001.prototxt b/models/intel/human-pose-estimation-0001/human-pose-estimation-0001.prototxt
+1 -1
diff --git a/models/intel/icnet-camvid-ava-0001/description/icnet-camvid-ava-0001.md b/models/intel/icnet-camvid-ava-0001/description/icnet-camvid-ava-0001.md
+4 -4
diff --git a/models/intel/icnet-camvid-ava-sparse-30-0001/description/icnet-camvid-ava-sparse-30-0001.md b/models/intel/icnet-camvid-ava-sparse-30-0001/description/icnet-camvid-ava-sparse-30-0001.md
+4 -4
diff --git a/models/intel/icnet-camvid-ava-sparse-60-0001/description/icnet-camvid-ava-sparse-60-0001.md b/models/intel/icnet-camvid-ava-sparse-60-0001/description/icnet-camvid-ava-sparse-60-0001.md
+4 -4
diff --git a/models/intel/image-retrieval-0001/description/image-retrieval-0001.md b/models/intel/image-retrieval-0001/description/image-retrieval-0001.md
+1 -1
@@ -40,12 +40,12 @@ applicable for children since their faces were not in the training set.
 
 ## Inputs
 
-Name: `input` , shape: [1x3x62x62] - An input image in [1xCxHxW] format. Expected color order is BGR.
+Name: `input`, shape: [1x3x62x62] - An input image in [1xCxHxW] format. Expected color order is BGR.
 
 ## Outputs
 
-1. name: "age_conv3", shape: [1, 1, 1, 1] - Estimated age divided by 100.
-2. name: "prob", shape: [1, 2, 1, 1] - Softmax output across 2 type classes [female, male]
+1. Name: `age_conv3`, shape: [1, 1, 1, 1] - Estimated age divided by 100.
+2. Name: `prob`, shape: [1, 2, 1, 1] - Softmax output across 2 type classes [female, male]
 
 ## Legal Information
 [*] Other names and brands may be claimed as the property of others.
@@ -27,7 +27,7 @@ on the input clip.
 
 ## Inputs
 
-Name: `input` , shape: [1x3x16x224x224]. An input image sequence in the format [BxCxTxHxW], where:
+Name: `input`, shape: [1x3x16x224x224]. An input image sequence in the format [BxCxTxHxW], where:
  - B - batch size
  - C - number of channels
  - T - duration of input clip
 
@@ -38,7 +38,7 @@ only the images containing five aforementioned emotions is chosen. The total amo
 
 ## Inputs
 
-Name: `input` , shape: [1x3x64x64] - An input image in [1xCxHxW] format. Expected color order is BGR.
+Name: `input`, shape: [1x3x64x64] - An input image in [1xCxHxW] format. Expected color order is BGR.
 
 ## Outputs
 
 
@@ -1,7 +1,7 @@
 name: "0003_EmoNet_ResNet10"
 layer {
   name: "data"
-  type: `input`
+  type: "Input"
   top: "data"
   input_param {
     shape {
 
@@ -28,7 +28,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x256x256] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x256x256] - An input image in the format [BxCxHxW],
 where:
 
 - B - batch size
 
@@ -28,7 +28,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x384x384] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x384x384] - An input image in the format [BxCxHxW],
 where:
 
 - B - batch size
 
@@ -28,7 +28,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x448x448] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x448x448] - An input image in the format [BxCxHxW],
 where:
 
 - B - batch size
 
@@ -27,7 +27,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x416x416] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x416x416] - An input image in the format [BxCxHxW],
 where:
 
    - B - batch size
 
@@ -27,26 +27,26 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-1. Name: `input` , shape: [1x3x640x640] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x640x640] - An input image in the format [BxCxHxW],
 where:
 
-   - B - batch size
-   - C - number of channels
-   - H - image height
-   - W - image width
+- B - batch size
+- C - number of channels
+- H - image height
+- W - image width
 
 Expected color order: BGR.
 
 ## Outputs
 
-1. The "boxes" is a blob with shape: [N, 5], where N is the number of detected
-   bounding boxes. For each detection, the description has the format:
+1. The `boxes` is a blob with the shape [N, 5], where N is the number of detected
+   bounding boxes. For each detection, the description has the format
    [`x_min`, `y_min`, `x_max`, `y_max`, `conf`],
    where:
     - (`x_min`, `y_min`) - coordinates of the top left bounding box corner
-    - (`x_max`, `y_max`) - coordinates of the bottom right bounding box corner.
+    - (`x_max`, `y_max`) - coordinates of the bottom right bounding box corner
     - `conf` - confidence for the predicted class
-2. The "labels" is a blob with shape: [N], where N is the number of detected
+2. The `labels` is a blob with the shape [N], where N is the number of detected
    bounding boxes. It contains `label` per each detected box.
 
 ## Legal Information
 
@@ -32,7 +32,7 @@ curve. Numbers are on
 
 ## Inputs
 
-Name: `input` , shape: [1x3x384x672] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x384x672] - An input image in the format [BxCxHxW],
    where:
     - B - batch size
     - C - number of channels
 
@@ -34,7 +34,7 @@ curve. Numbers are on
 
 ## Inputs
 
-Name: `input` , shape: [1x3x384x672] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x384x672] - An input image in the format [BxCxHxW],
    where:
     - B - batch size
     - C - number of channels
 
@@ -29,7 +29,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x300x300] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x300x300] - An input image in the format [BxCxHxW],
 where:
 
    - B - batch size
 
@@ -1,7 +1,7 @@
 name: "cnn_fd_004_sq_light_ssd"
 layer {
   name: "data"
-  type: `input`
+  type: "Input"
   top: "data"
   input_param {
     shape {
 
@@ -28,7 +28,7 @@ curve. All numbers were evaluated by taking into account only faces bigger than
 
 ## Inputs
 
-Name: `input` , shape: [1x3x300x300] - An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x300x300] - An input image in the format [BxCxHxW],
 where:
 
    - B - batch size
 
@@ -36,14 +36,14 @@ To align the face, use a landmarks regression model: using regressed points and
 
 ## Inputs
 
-1. Name: "data" , shape: [1x3x128x128] - An input image in the format [BxCxHxW],
-   where:
-    - B - batch size
-    - C - number of channels
-    - H - image height
-    - W - image width
-
-   Expected color order is BGR.
+Name: "data" , shape: [1x3x128x128] - An input image in the format [BxCxHxW],
+where:
+- B - batch size
+- C - number of channels
+- H - image height
+- W - image width
+
+Expected color order is BGR.
 
 ## Outputs
 The net outputs a blob with the shape [1, 256, 1, 1], containing a row-vector of 256 floating point values. Outputs on different images are comparable in cosine distance.
 
@@ -2,11 +2,11 @@
 
 ## Use Case and High-Level Description
 
-This is a re-trained version of [Faster R-CNN](https://arxiv.org/abs/1506.01497) object detection network trained with COCO\* training dataset.
+This is a retrained version of the [Faster R-CNN](https://arxiv.org/abs/1506.01497) object detection network trained with the COCO\* training dataset.
 The actual implementation is based on [Detectron](https://github.com/facebookresearch/detectron2),
 with additional [network weight pruning](https://arxiv.org/abs/1710.01878) applied to sparsify convolution layers (60% of network parameters are set to zeros).
 
-The model input is a blob that consists of a single image of "1x3x800x1280" in BGR order. The pixel values are integers in the [0, 255] range.
+The model input is a blob that consists of a single image of `1x3x800x1280` in the BGR order. The pixel values are integers in the [0, 255] range.
 
 ## Specification
 
@@ -17,28 +17,28 @@ The model input is a blob that consists of a single image of "1x3x800x1280" in B
 | MParams                      | 52.79        |
 | Source framework             | TensorFlow\* |
 
-Average Precision metric described in: ["COCO: Common Objects in Context"](http://cocodataset.org/#detection-eval). The primary challenge metric is used. Tested on COCO validation dataset.
+See Average Precision metric description at [COCO: Common Objects in Context](http://cocodataset.org/#detection-eval). The primary challenge metric is used. Tested on the COCO validation dataset.
 
 ## Performance
 
 ## Inputs
 
-Name: `input` , shape: [1x3x800x1280] - An input image in the format [BxCxHxW],
-  where:
-    - B - batch size
-    - C - number of channels
-    - H - image height
-    - W - image width.
-  Expected color order is BGR.
+Name: `input`, shape: [1x3x800x1280] - An input image in the format [BxCxHxW],
+where:
+  - B - batch size
+  - C - number of channels
+  - H - image height
+  - W - image width
+Expected color order is BGR.
 
 ## Outputs
 
-1. The net outputs a blob with the shape: [300, 7], where each row is consisted of [`image_id`, `class_id`, `confidence`, `x0`, `y0`, `x1`, `y1`], respectively.
-    - `image_id` - image ID in the batch
-    - `class_id` - predicted class ID
-    - `confidence` - [0, 1] detection score, the higher the value, the more confident the deteciton is on
-    - (`x0`, `y0`) - normalized coordinates of the top left bounding box corner, in range of [0, 1]
-    - (`x1`, `y1`) - normalized coordinates of the bootm right bounding box corner, in range of [0, 1].
+The net outputs a blob with the shape [300, 7], where each row consists of [`image_id`, `class_id`, `confidence`, `x0`, `y0`, `x1`, `y1`] respectively:
+- `image_id` - image ID in the batch
+- `class_id` - predicted class ID
+- `confidence` - [0, 1] detection score; the higher the value, the more confident the detection is 
+- (`x0`, `y0`) - normalized coordinates of the top left bounding box corner, in the [0, 1] range
+- (`x1`, `y1`) - normalized coordinates of the bottom right bounding box corner, in the [0, 1] range
 
 ## Legal Information
 [\*] Other names and brands may be claimed as the property of others.
 
@@ -2,8 +2,9 @@
 
 ## Use Case and High-Level Description
 
-This is a network for handwritten japanese text recognition scenario. It consists of VGG16-like backbone, reshape layer and a fully connected layer.
-The network is able to recognize japanese text (characters in datasets [Kondate](http://web.tuat.ac.jp/~nakagawa/database/en/kondate_about.html) and [Nakayosi](http://web.tuat.ac.jp/~nakagawa/database/en/about_nakayosi.html)).
+This is a network for handwritten Japanese text recognition scenario. It consists of a VGG16-like backbone, 
+reshape layer and a fully connected layer.
+The network is able to recognize Japanese text. For details on characters in datasets, see [Kondate](http://web.tuat.ac.jp/~nakagawa/database/en/kondate_about.html) and [Nakayosi](http://web.tuat.ac.jp/~nakagawa/database/en/about_nakayosi.html).
 
 ## Example
 
@@ -32,9 +33,10 @@ where:
   - H - image height
   - W - image width
 
-Note that the source image should be converted to grayscale, resized to spefic height (such as 96) while keeping aspect ratio, normalized to [-1, 1] and right bottom padded
+Note that the source image should be converted to grayscale, resized to specific height (such as 96) while keeping aspect ratio, normalized to [-1, 1], and right-bottom padded.
 
 ## Outputs
+
 The net outputs a blob with the shape [186, 1, 1161] in the format [WxBxL],
 where:
   - W - output sequence length
@@ -43,7 +45,5 @@ where:
 
 The network output can be decoded by CTC Greedy Decoder.
 
-
-
 ## Legal Information
 [*] Other names and brands may be claimed as the property of others.
@@ -27,13 +27,13 @@ Tested on a COCO validation subset from the original paper [Realtime Multi-Perso
 
 ## Inputs
 
-1. Name: `input` , shape: [1x3x256x456]. An input image in the [BxCxHxW] format ,
-  where:
-    - B - batch size
-    - C - number of channels
-    - H - image height
-    - W - image width.
-  Expected color order is BGR.
+Name: `input`, shape: [1x3x256x456]. An input image in the [BxCxHxW] format ,
+where:
+  - B - batch size
+  - C - number of channels
+  - H - image height
+  - W - image width
+Expected color order is BGR.
 
 ## Outputs
 
 
@@ -1,6 +1,6 @@
 layer {
   name: "data"
-  type: `input`
+  type: "Input"
   top: "data"
   input_param {shape: {dim: 1 dim: 3 dim: 256 dim: 456}}
 }
 
@@ -2,9 +2,9 @@
 
 ## Use Case and High-Level Description
 
-A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. For more details about the original floating point model, check out the [paper](https://arxiv.org/abs/1704.08545).
+A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. For details about the original floating-point model, check out [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545).
 
-The model input is a blob that consists of a single image of "1x3x720x960" in BGR order. The pixel values are integers in the [0, 255] range.
+The model input is a blob that consists of a single image of `1x3x720x960` in the BGR order. The pixel values are integers in the [0, 255] range.
 
 The model output for `icnet-camvid-ava-0001` is the predicted class index of each input pixel belonging to one of the 12 classes of the CamVid dataset.
 
@@ -18,7 +18,7 @@ The model output for `icnet-camvid-ava-0001` is the predicted class index of eac
 
 ## Accuracy
 
-The quality metrics were calculated on the CamVid\* validation dataset. The 'unlabeled' class had been ignored during metrics calculation.
+The quality metrics were calculated on the CamVid\* validation dataset. The `unlabeled` class had been ignored during metrics calculation.
 
 | Metric                    | Value         |
 |---------------------------|---------------|
@@ -51,7 +51,7 @@ Semantic segmentation class prediction map, shape - `1,720,960`, output data for
 - `H` - horizontal coordinate of the input pixel
 - `W` - vertical coordinate of the input pixel
 
-containing the class prediction result of each pixel.
+Output contains the class prediction result of each pixel.
 
 ## Legal Information
 [*] Other names and brands may be claimed as the property of others.
@@ -2,9 +2,9 @@
 
 ## Use Case and High-Level Description
 
-A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. The trained model has 30% sparsity (ratio of 0's within all the convolution kernel weights). For more details about the original floating point model, check out the [paper](https://arxiv.org/abs/1704.08545).
+A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. The trained model has 30% sparsity (ratio of zeros within all the convolution kernel weights). For details about the original floating-point model, check out the [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545).
 
-The model input is a blob that consists of a single image of "1x3x720x960" in BGR order. The pixel values are integers in the [0, 255] range.
+The model input is a blob that consists of a single image of `1x3x720x960` in the BGR order. The pixel values are integers in the [0, 255] range.
 
 The model output for `icnet-camvid-ava-sparse-30-0001` is the predicted class index of each input pixel belonging to one of the 12 classes of the CamVid dataset.
 
@@ -18,7 +18,7 @@ The model output for `icnet-camvid-ava-sparse-30-0001` is the predicted class in
 
 ## Accuracy
 
-The quality metrics were calculated on the CamVid\* validation dataset. The 'unlabeled' class had been ignored during metrics calculation.
+The quality metrics were calculated on the CamVid\* validation dataset. The `unlabeled` class had been ignored during metrics calculation.
 
 | Metric                    | Value         |
 |---------------------------|---------------|
@@ -51,7 +51,7 @@ Semantic segmentation class prediction map, shape - `1,720,960`, output data for
 - `H` - horizontal coordinate of the input pixel
 - `W` - vertical coordinate of the input pixel
 
-containing the class prediction result of each pixel.
+Output contains the class prediction result of each pixel.
 
 ## Legal Information
 [*] Other names and brands may be claimed as the property of others.
@@ -2,9 +2,9 @@
 
 ## Use Case and High-Level Description
 
-A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. The trained model has 60% sparsity (ratio of 0's within all the convolution kernel weights). For more details about the original floating point model, check out the [paper](https://arxiv.org/abs/1704.08545).
+A trained model of ICNet for fast semantic segmentation, trained on the CamVid\* dataset from scratch using the TensorFlow\* framework. The trained model has 60% sparsity (ratio of zeros within all the convolution kernel weights). For details about the original floating-point model, check out the [ICNet for Real-Time Semantic Segmentation on High-Resolution Images](https://arxiv.org/abs/1704.08545).
 
-The model input is a blob that consists of a single image of "1x3x720x960" in BGR order. The pixel values are integers in the [0, 255] range.
+The model input is a blob that consists of a single image of `1x3x720x960` in the BGR order. The pixel values are integers in the [0, 255] range.
 
 The model output for `icnet-camvid-ava-sparse-60-0001` is the predicted class index of each input pixel belonging to one of the 12 classes of the CamVid dataset.
 
@@ -18,7 +18,7 @@ The model output for `icnet-camvid-ava-sparse-60-0001` is the predicted class in
 
 ## Accuracy
 
-The quality metrics were calculated on the CamVid\* validation dataset. The 'unlabeled' class had been ignored during metrics calculation.
+The quality metrics were calculated on the CamVid\* validation dataset. The `unlabeled` class had been ignored during metrics calculation.
 
 | Metric                    | Value         |
 |---------------------------|---------------|
@@ -51,7 +51,7 @@ Semantic segmentation class prediction map, shape - `1,720,960`, output data for
 - `H` - horizontal coordinate of the input pixel
 - `W` - vertical coordinate of the input pixel
 
-containing the class prediction result of each pixel.
+Output contains the class prediction result of each pixel.
 
 ## Legal Information
 [*] Other names and brands may be claimed as the property of others.
@@ -21,7 +21,7 @@ Image retrieval model based on [MobileNetV2](https://arxiv.org/abs/1801.04381) a
 
 ## Inputs
 
-Name: `input` , shape: [1x3x224x224] — An input image in the format [BxCxHxW],
+Name: `input`, shape: [1x3x224x224] — An input image in the format [BxCxHxW],
 where:
 
    - B - batch size
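
The output layouts documented in the hunks above can be read back with a few lines of code. The following is a minimal sketch, not part of this change set: it assumes the outputs have already been fetched as nested Python lists, and the helper names `decode_age_gender` and `decode_detections`, as well as the 0.5 confidence threshold, are hypothetical choices made for illustration.

```python
def decode_age_gender(age_conv3, prob):
    """Decode the age-gender-recognition-retail-0013 outputs.

    age_conv3 -- nested list with shape [1, 1, 1, 1], estimated age divided by 100
    prob      -- nested list with shape [1, 2, 1, 1], softmax over [female, male]
    """
    age = age_conv3[0][0][0][0] * 100.0  # undo the division by 100
    female = prob[0][0][0][0]
    male = prob[0][1][0][0]
    gender = "female" if female >= male else "male"
    return age, gender


def decode_detections(rows, image_width, image_height, threshold=0.5):
    """Decode rows of [image_id, class_id, confidence, x0, y0, x1, y1].

    Coordinates are normalized to [0, 1], as in the
    faster-rcnn-resnet101-coco-sparse-60-0001 description.
    The threshold value is an assumption, not taken from the documentation.
    """
    boxes = []
    for image_id, class_id, confidence, x0, y0, x1, y1 in rows:
        if confidence < threshold:
            continue  # skip low-confidence detections
        boxes.append((
            int(class_id),
            confidence,
            round(x0 * image_width),   # left edge in pixels
            round(y0 * image_height),  # top edge in pixels
            round(x1 * image_width),   # right edge in pixels
            round(y1 * image_height),  # bottom edge in pixels
        ))
    return boxes


if __name__ == "__main__":
    print(decode_age_gender([[[[0.25]]]], [[[[0.1]], [[0.9]]]]))
    print(decode_detections([[0, 1, 0.9, 0.25, 0.5, 0.75, 1.0]], 1280, 800))
```

Scaling by the input width and height is only one option; an application that draws on the original frame would scale by that frame's size instead.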