| 1 | +# mask_rcnn_inception_resnet_v2_atrous_coco |
| 2 | + |
| 3 | +## Use Case and High-Level Description |
| 4 | + |
| 5 | +Mask R-CNN Inception ResNet V2 Atrous is trained on the COCO dataset and is used for object instance segmentation. For details, see the [paper](https://arxiv.org/pdf/1703.06870.pdf). |
| 6 | + |
| 7 | +## Example |
| 8 | + |
| 9 | +## Specification |
| 10 | + |
| 11 | +| Metric | Value | |
| 12 | +|---------------------------------|-------------------------------------------| |
| 13 | +| Type | Instance segmentation | |
| 14 | +| GFlops | 675.314 | |
| 15 | +| MParams | 92.368 | |
| 16 | +| Source framework | TensorFlow\* | |
| 17 | + |
| 18 | +## Performance |
| 19 | + |
| 20 | +## Input |
| 21 | + |
| 22 | +### Original Model |
| 23 | + |
| 24 | +Image, name: `image_tensor`, shape: [1x800x800x3], format: [BxHxWxC], |
| 25 | + where: |
| 26 | + |
| 27 | + - B - batch size |
| 28 | + - H - image height |
| 29 | + - W - image width |
| 30 | + - C - number of channels |
| 31 | + |
| 32 | + Expected color order: RGB. |
| 33 | + |
| 34 | +### Converted Model |
| 35 | + |
| 36 | +1. Image, name: `image_tensor`, shape: [1x3x800x800], format: [BxCxHxW], |
| 37 | + where: |
| 38 | + |
| 39 | + - B - batch size |
| 40 | + - C - number of channels |
| 41 | + - H - image height |
| 42 | + - W - image width |
| 43 | + |
| 44 | + Expected color order: BGR. |
| 45 | + |
| 46 | +2. Information about the input image size, name: `image_info`, shape: [1x3], format: [BxC], |
| 47 | + where: |
| 48 | + |
| 49 | + - B - batch size |
| 50 | + - C - vector of 3 values in format [H,W,S], where H is the image height, W is the image width, and S is the image scale factor (usually 1) |
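
The two converted-model inputs described above can be assembled roughly as follows. This is a minimal sketch, not the reference preprocessing: the function name `prepare_inputs` is hypothetical, the `float32` dtype is an assumption, and the image is assumed to be already resized to 800x800 and loaded in RGB order.

```python
import numpy as np


def prepare_inputs(image_rgb_hwc, scale=1.0):
    """Build `image_tensor` and `image_info` for the converted model.

    image_rgb_hwc: array of shape [800, 800, 3] in RGB order (assumption:
    the caller has already resized the image; resizing is out of scope).
    """
    image = np.asarray(image_rgb_hwc, dtype=np.float32)  # dtype is an assumption
    if image.shape != (800, 800, 3):
        raise ValueError(f"expected shape (800, 800, 3), got {image.shape}")

    bgr = image[:, :, ::-1]        # RGB -> BGR: reverse the channel axis
    chw = bgr.transpose(2, 0, 1)   # HxWxC -> CxHxW
    # Add the batch dimension: 1x3x800x800
    image_tensor = np.ascontiguousarray(chw[np.newaxis, ...])

    height, width = image.shape[:2]
    # [1x3] vector in format [H, W, S]
    image_info = np.array([[height, width, scale]], dtype=np.float32)

    return {"image_tensor": image_tensor, "image_info": image_info}
```

The returned dictionary is keyed by the input names listed above, so it can be handed to whichever inference API is in use.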
| 51 | + |
| 52 | +## Output |
| 53 | + |
| 54 | +### Original Model |
| 55 | + |
| 56 | +1. Classifier, name: `detection_classes`. Contains the predicted bounding box classes in the range [1, 91]. The model was trained on the Microsoft\* COCO dataset version with 90 object categories; class 0 is reserved for background. |
| 57 | +2. Probability, name: `detection_scores`. Contains the probability of each detected bounding box. |
| 58 | +3. Detection box, name: `detection_boxes`. Contains detection box coordinates in the format `[y_min, x_min, y_max, x_max]`, where (`x_min`, `y_min`) are the coordinates of the top left corner and (`x_max`, `y_max`) are the coordinates of the bottom right corner. Coordinates are rescaled to the input image size. |
| 59 | +4. Detections number, name: `num_detections`. Contains the number of predicted detection boxes. |
| 60 | +5. Segmentation mask, name: `detection_masks`. Contains segmentation heatmaps of detected objects for all classes for every output bounding box. |
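
Since `num_detections` reports how many boxes were predicted, the other outputs can be trimmed to that count. The sketch below assumes that each per-detection output carries a leading batch dimension of size 1 and is padded beyond `num_detections`; that layout is an assumption, not something stated above. The helper name `trim_detections` is hypothetical.

```python
def trim_detections(outputs):
    """Keep only the first `num_detections` entries of each per-detection output.

    outputs: mapping from output name to a batched sequence (batch size 1).
    Assumption: entries past `num_detections` are padding.
    """
    count = int(outputs["num_detections"][0])
    keys = (
        "detection_classes",
        "detection_scores",
        "detection_boxes",
        "detection_masks",
    )
    # Index [0] drops the batch dimension, the slice drops the padding.
    return {key: outputs[key][0][:count] for key in keys if key in outputs}
```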
| 61 | + |
| 62 | +### Converted Model |
| 63 | + |
| 64 | +1. The array of summary detection information, name: `reshape_do_2d`, shape: [N, 7], where N is the number of detected |
| 65 | +bounding boxes. For each detection, the description has the format: |
| 66 | +[`image_id`, `label`, `conf`, `x_min`, `y_min`, `x_max`, `y_max`], |
| 67 | + where: |
| 68 | + |
| 69 | + - `image_id` - ID of the image in the batch |
| 70 | + - `label` - predicted class ID |
| 71 | + - `conf` - confidence for the predicted class |
| 72 | + - (`x_min`, `y_min`) - coordinates of the top left bounding box corner (coordinates stored in normalized format, in range [0, 1]) |
| 73 | + - (`x_max`, `y_max`) - coordinates of the bottom right bounding box corner (coordinates stored in normalized format, in range [0, 1]) |
| 74 | +2. Segmentation heatmaps for all classes for every output bounding box, name: `masks`, shape: [N, 90, 33, 33], where N is the number of detected masks, 90 is the number of classes, with background class excluded. |
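
The converted-model outputs above can be decoded along these lines. This is a sketch under stated assumptions: row `i` of `reshape_do_2d` is assumed to correspond to row `i` of `masks`, and mask channel 0 is assumed to correspond to label 1 (background excluded). Both should be checked against the actual model and label map. The names `Detection`, `decode_detections`, `select_mask`, and the default threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    image_id: int
    label: int
    conf: float
    box: tuple       # (x_min, y_min, x_max, y_max) in pixels
    mask_index: int  # row of `masks` assumed to belong to this detection


def decode_detections(rows, image_width, image_height, threshold=0.5):
    """Convert [N, 7] rows into pixel-space detections above a threshold."""
    detections = []
    for index, row in enumerate(rows):
        image_id, label, conf, x_min, y_min, x_max, y_max = (float(v) for v in row)
        if conf < threshold:
            continue
        # Coordinates are normalized to [0, 1]; scale them to pixels.
        box = (
            x_min * image_width,
            y_min * image_height,
            x_max * image_width,
            y_max * image_height,
        )
        detections.append(Detection(int(image_id), int(label), conf, box, index))
    return detections


def select_mask(masks, detection, first_label=1):
    """Pick the 33x33 heatmap of the predicted class for one detection.

    Assumption: channel 0 of `masks` corresponds to label `first_label`.
    """
    return masks[detection.mask_index][detection.label - first_label]
```

To overlay a mask on the image, the selected 33x33 heatmap would still need to be resized to the box size and thresholded; that step depends on the image library in use and is left out here.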
| 75 | + |
| 76 | +## Legal Information |
| 77 | + |
| 78 | +[https://raw.githubusercontent.com/tensorflow/models/master/LICENSE](https://raw.githubusercontent.com/tensorflow/models/master/LICENSE) |