Commit 5da2e9a

Author: Anna Grebneva

Added MODNet models (openvinotoolkit#3373)

* Added MODNet models
* Added demo support

1 parent 272d2d7 commit 5da2e9a

File tree

20 files changed (+519, -9 lines)


data/dataset_definitions.yml (+11)

```diff
@@ -1493,3 +1493,14 @@ datasets:
       annotation_file: object_detection/streams_1/high/annotations/instances_glb2bcls3.json
     annotation: mscoco_detection_high_3cls.pickle
     dataset_meta: mscoco_detection_high_3cls.json
+
+  - name: HumanMattingDataset
+    data_source: human_matting_dataset/clip_img/1803151818/clip_00000000
+    additional_data_source: human_matting_dataset/matting/1803151818/matting_00000000
+    annotation_conversion:
+      converter: background_matting
+      images_dir: human_matting_dataset/clip_img/1803151818/clip_00000000
+      masks_dir: human_matting_dataset/matting/1803151818/matting_00000000
+      image_postfix: '.jpg'
+    annotation: human_matting.pickle
+    dataset_meta: human_matting.json
```
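
The `background_matting` converter above pairs each clip image with its matting mask through the two directory trees. A hypothetical sketch of how such pairs could be enumerated (the mask extension is not stated in this diff; PNG is assumed):

```python
from pathlib import Path

def enumerate_pairs(images_dir, masks_dir):
    # Match clip images ('.jpg', per image_postfix above) to masks by filename stem.
    images = {p.stem: p for p in Path(images_dir).glob('*.jpg')}
    masks = {p.stem: p for p in Path(masks_dir).glob('*.png')}  # assumed mask extension
    return [(images[s], masks[s]) for s in sorted(images.keys() & masks.keys())]
```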

demos/background_subtraction_demo/python/README.md (+8, -3)

```diff
@@ -29,7 +29,10 @@ The demo application expects an instance segmentation or background matting mode
   * At least two outputs including:
     * `fgr` with normalized in [0, 1] range foreground
     * `pha` with normalized in [0, 1] range alpha
-4. for video background matting models based on RNN architecture:
+4. for image background matting models without trimap (background segmentation):
+  * Single input for input image.
+  * Single output with normalized in [0, 1] range alpha
+5. for video background matting models based on RNN architecture:
   * Five inputs:
     * `src` for input image
     * recurrent inputs: `r1`, `r2`, `r3`, `r4`
@@ -81,10 +84,12 @@ omz_converter --list models.lst

 ### Supported Models

-* instance-segmentation-person-????
-* yolact-resnet50-fpn-pytorch
 * background-matting-mobilenetv2
+* instance-segmentation-person-????
+* modnet-photographic-portrait-matting
+* modnet-webcam-portrait-matting
 * robust-video-matting-mobilenetv3
+* yolact-resnet50-fpn-pytorch

 > **NOTE**: Refer to the tables [Intel's Pre-Trained Models Device Support](../../../models/intel/device_support.md) and [Public Pre-Trained Models Device Support](../../../models/public/device_support.md) for the details on models inference support at different devices.
```
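
The demo selects a wrapper from the model's input/output counts (see the `get_model` change below). To check which of the numbered cases above a given IR matches, one can inspect it with the OpenVINO runtime; a minimal sketch, with `model.xml` as a hypothetical IR path:

```python
from openvino.runtime import Core

model = Core().read_model('model.xml')  # hypothetical path to a converted IR

# Case 4 above corresponds to one image input and one alpha output.
print(len(model.inputs), len(model.outputs))
print(model.inputs[0].partial_shape, model.outputs[0].partial_shape)
```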

demos/background_subtraction_demo/python/background_subtraction_demo.py (+7, -1)

```diff
@@ -26,7 +26,10 @@

 sys.path.append(str(Path(__file__).resolve().parents[2] / 'common/python'))

-from openvino.model_zoo.model_api.models import MaskRCNNModel, OutputTransform, RESIZE_TYPES, YolactModel, ImageMattingWithBackground, VideoBackgroundMatting
+from openvino.model_zoo.model_api.models import (
+    MaskRCNNModel, OutputTransform, RESIZE_TYPES, YolactModel,
+    ImageMattingWithBackground, VideoBackgroundMatting, PortraitBackgroundMatting
+)
 from openvino.model_zoo.model_api.models.utils import load_labels
 from openvino.model_zoo.model_api.performance_metrics import PerformanceMetrics
 from openvino.model_zoo.model_api.pipelines import get_user_config, AsyncPipeline
@@ -123,6 +126,9 @@ def get_model(model_adapter, configuration, args):
         model = ImageMattingWithBackground(model_adapter, configuration)
         need_bgr_input = True
         is_matting_model = True
+    elif len(inputs) == 1 and len(outputs) == 1:
+        model = PortraitBackgroundMatting(model_adapter, configuration)
+        is_matting_model = True
     else:
         model = MaskRCNNModel(model_adapter, configuration)
     if not need_bgr_input and args.background is not None:
```
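
Once a matting wrapper returns a frame and an alpha matte, replacing the background reduces to alpha blending. A minimal sketch of that step (not part of this diff; `frame`, `alpha`, and `background` are assumed to be float arrays in [0, 1], with `alpha` shaped HxWx1):

```python
def blend_background(frame, alpha, background):
    # Keep the foreground where alpha is high; show the new background elsewhere.
    return alpha * frame + (1.0 - alpha) * background
```
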
demos/background_subtraction_demo/python/models.lst (+4, -2)

```diff
@@ -1,5 +1,7 @@
 # This file can be used with the --list option of the model downloader.
-instance-segmentation-person-????
-yolact-resnet50-fpn-pytorch
 background-matting-mobilenetv2
+instance-segmentation-person-????
+modnet-photographic-portrait-matting
+modnet-webcam-portrait-matting
 robust-video-matting-mobilenetv3
+yolact-resnet50-fpn-pytorch
```

demos/common/python/openvino/model_zoo/model_api/README.md (+1, -1)

```diff
@@ -59,7 +59,7 @@ The following tasks can be solved with wrappers usage:

 | Task type | Model API wrappers |
 |----------------------------|--------------------|
-| Background Matting | <ul><li>`VideoBackgroundMatting`</li><li>`ImageMattingWithBackground`</li></ul> |
+| Background Matting | <ul><li>`VideoBackgroundMatting`</li><li>`ImageMattingWithBackground`</li><li>`PortraitBackgroundMatting`</li></ul> |
 | Classification | <ul><li>`Classification`</li></ul> |
 | Deblurring | <ul><li>`Deblurring`</li></ul> |
 | Human Pose Estimation | <ul><li>`HpeAssociativeEmbedding`</li><li>`OpenPose`</li></ul> |
```
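
For illustration, the new wrapper is driven like the existing Model API ones. A minimal sketch, assuming an IR at `modnet.xml` (the path and adapter arguments are assumptions, not taken from this commit):

```python
import cv2
from openvino.model_zoo.model_api.adapters import OpenvinoAdapter, create_core
from openvino.model_zoo.model_api.models import PortraitBackgroundMatting

adapter = OpenvinoAdapter(create_core(), 'modnet.xml', device='CPU')  # hypothetical IR path
model = PortraitBackgroundMatting(adapter, configuration={}, preload=True)

frame = cv2.imread('portrait.jpg')  # hypothetical input image
dict_inputs, meta = model.preprocess(frame)
raw_outputs = adapter.infer_sync(dict_inputs)
image, alpha = model.postprocess(raw_outputs, meta)  # image in [0, 1], alpha of shape HxWx1
```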

demos/common/python/openvino/model_zoo/model_api/models/__init__.py (+2, -1)

```diff
@@ -16,7 +16,7 @@


 from .bert import BertEmbedding, BertNamedEntityRecognition, BertQuestionAnswering
-from .background_matting import ImageMattingWithBackground, VideoBackgroundMatting
+from .background_matting import ImageMattingWithBackground, VideoBackgroundMatting, PortraitBackgroundMatting
 from .centernet import CenterNet
 from .classification import Classification
 from .deblurring import Deblurring
@@ -58,6 +58,7 @@
     'MonoDepthModel',
     'OpenPose',
     'OutputTransform',
+    'PortraitBackgroundMatting',
     'RESIZE_TYPES',
     'RetinaFace',
     'RetinaFacePyTorch',
```

demos/common/python/openvino/model_zoo/model_api/models/background_matting.py (+33)

```diff
@@ -150,3 +150,36 @@ def postprocess(self, outputs, meta):
         fgr = cv2.cvtColor(cv2.resize(fgr, (w, h)), cv2.COLOR_RGB2BGR)
         pha = np.expand_dims(cv2.resize(pha, (w, h)), axis=-1)
         return fgr, pha
+
+
+class PortraitBackgroundMatting(ImageModel):
+    __model__ = 'Portrait-matting'
+
+    def __init__(self, model_adapter, configuration, preload=False):
+        super().__init__(model_adapter, configuration, preload)
+        self._check_io_number(1, 1)
+        self.output_blob_name = self._get_outputs()
+
+    @classmethod
+    def parameters(cls):
+        return super().parameters()
+
+    def _get_outputs(self):
+        output_blob_name = next(iter(self.outputs))
+        output_size = self.outputs[output_blob_name].shape
+        if len(output_size) != 4:
+            self.raise_error("Unexpected output blob shape {}. Only 4D output blob is supported".format(output_size))
+
+        return output_blob_name
+
+    def preprocess(self, inputs):
+        dict_inputs, meta = super().preprocess(inputs)
+        meta.update({"original_image": inputs})
+        return dict_inputs, meta
+
+    def postprocess(self, outputs, meta):
+        output = outputs[self.output_blob_name][0].transpose(1, 2, 0)
+        original_frame = meta['original_image'] / 255.0
+        h, w = meta['original_shape'][:2]
+        res_output = np.expand_dims(cv2.resize(output, (w, h)), -1)
+        return original_frame, res_output
```
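
To make the `postprocess` shape handling above concrete: the raw result is a 1x1xHxW alpha map, which the wrapper converts to HxWx1 at the source resolution. A standalone sketch of the same flow with dummy data (shapes are illustrative):

```python
import cv2
import numpy as np

raw = np.random.rand(1, 1, 512, 512).astype(np.float32)  # N, C, H, W network output
alpha = raw[0].transpose(1, 2, 0)                         # CHW -> HWC: 512 x 512 x 1
alpha = cv2.resize(alpha, (1280, 720))                    # back to frame size; cv2 drops the channel axis
alpha = np.expand_dims(alpha, -1)                         # restore it: 720 x 1280 x 1
```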

demos/tests/cases.py (+3, -1)

```diff
@@ -757,7 +757,9 @@ def single_option_cases(key, *args):
         ModelArg('instance-segmentation-person-0007'),
         ModelArg('robust-video-matting-mobilenetv3'),
         ModelArg('background-matting-mobilenetv2'),
-        ModelArg('yolact-resnet50-fpn-pytorch')),
+        ModelArg('yolact-resnet50-fpn-pytorch'),
+        ModelArg('modnet-photographic-portrait-matting'),
+        ModelArg('modnet-webcam-portrait-matting')),
     )),

     PythonDemo(name='bert_question_answering_demo', device_keys=['-d'], test_cases=combine_cases(
```

models/public/device_support.md (+2)

```diff
@@ -78,6 +78,8 @@
 | mobilenet-v3-large-1.0-224-tf | YES | YES | YES |
 | mobilenet-v3-small-1.0-224-tf | YES | YES | YES |
 | mobilenet-yolo-v4-syg | YES | YES | |
+| modnet-photographic-portrait-matting | YES | YES | YES |
+| modnet-webcam-portrait-matting | YES | YES | YES |
 | mozilla-deepspeech-0.6.1 | YES | | |
 | mozilla-deepspeech-0.8.2 | YES | | |
 | mtcnn-o | YES | YES | |
```

models/public/index.md (+4)

```diff
@@ -33,6 +33,8 @@
    :caption: Background Matting Models

    omz_models_model_background_matting_mobilenetv2
+   omz_models_model_modnet_photographic_portrait_matting
+   omz_models_model_modnet_webcam_portrait_matting
    omz_models_model_robust_video_matting_mobilenetv3

 .. toctree::
@@ -644,6 +646,8 @@ or mixed pixels. This distinguishes background matting from segmentation approac
 | Model Name | Implementation | OMZ Model Name | Accuracy | GFlops | mParams |
 | -------------- | -------------- | ------------------------------------------------------ | -------- | ------- | -------- |
 | background-matting-mobilenetv2 | PyTorch\* | [background-matting-mobilenetv2](./background-matting-mobilenetv2/README.md) | 4.32/1.0/2.48/2.7 | 6.7419 | 5.052 |
+| modnet-photographic-portrait-matting | PyTorch\* | [modnet-photographic-portrait-matting](./modnet-photographic-portrait-matting/README.md) | 5.21/727.95 | 31.1564 | 6.4597 |
+| modnet-webcam-portrait-matting | PyTorch\* | [modnet-webcam-portrait-matting](./modnet-webcam-portrait-matting/README.md) | 5.66/762.52 | 31.1564 | 6.4597 |
 | robust-video-matting-mobilenetv3 | PyTorch\* | [robust-video-matting-mobilenetv3](./robust-video-matting-mobilenetv3/README.md) | 20.8/15.1/4.42/4.05 | 9.3892 | 3.7363 |

 ## See Also
```
models/public/modnet-photographic-portrait-matting/README.md (new file, +99)

````markdown
# modnet-photographic-portrait-matting

## Use Case and High-Level Description

The `modnet-photographic-portrait-matting` model is a lightweight matting objective decomposition network (MODNet) with a MobileNetV2 backbone for real-time photographic portrait matting from a single input image. The model is pre-trained in the PyTorch\* framework and converted to ONNX\* format.

More details are provided in the [paper](https://arxiv.org/abs/2011.11961) and the [repository](https://github.com/ZHKKKe/MODNet).

## Specification

| Metric                          | Value              |
|---------------------------------|--------------------|
| Type                            | Background Matting |
| GFlops                          | 31.1564            |
| MParams                         | 6.4597             |
| Source framework                | PyTorch\*          |

## Accuracy

Accuracy measured on the HumanMatting dataset.

| Metric   | Mean value  | Std value |
| -------- | ----------- | --------- |
| MAD      | 5.21        | 5.13      |
| MSE      | 727.95      | 1196.28   |

* MAD - mean absolute difference
* MSE - mean squared error

## Input

### Original Model

Image, name: `input`, shape: `1, 3, 512, 512`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `RGB`.
Mean values - [127.5, 127.5, 127.5], scale value - 127.5.

### Converted Model

Image, name: `input`, shape: `1, 3, 512, 512`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

Expected color order: `BGR`.

## Output

### Original Model

Alpha matte with values in [0, 1] range. Name: `output`, shape: `1, 1, 512, 512`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

### Converted Model

Alpha matte with values in [0, 1] range. Name: `output`, shape: `1, 1, 512, 512`, format: `B, C, H, W`, where:

- `B` - batch size
- `C` - number of channels
- `H` - image height
- `W` - image width

## Download a Model and Convert it into OpenVINO™ IR Format

You can download models and, if necessary, convert them into OpenVINO™ IR format using the [Model Downloader and other automation tools](../../../tools/model_tools/README.md), as shown in the examples below.

An example of using the Model Downloader:
```
omz_downloader --name <model_name>
```

An example of using the Model Converter:
```
omz_converter --name <model_name>
```

## Demo usage

The model can be used in the following demos provided by the Open Model Zoo to show its capabilities:

* [Background subtraction Python\* Demo](../../../demos/background_subtraction_demo/python/README.md)

## Legal Information

The original model is distributed under the
[Apache License, Version 2.0](https://raw.githubusercontent.com/ZHKKKe/MODNet/master/LICENSE).
A copy of the license is provided in `<omz_dir>/models/public/licenses/APACHE-2.0.txt`.
````
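
Putting the input specification above into code: a preprocessing sketch for the original ONNX model (OpenCV reads BGR while the original model expects RGB; the image file name is hypothetical):

```python
import cv2
import numpy as np

image = cv2.imread('portrait.jpg')                 # hypothetical input, BGR
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)     # original model expects RGB
image = cv2.resize(image, (512, 512))
blob = (image.astype(np.float32) - 127.5) / 127.5  # mean 127.5, scale 127.5
blob = blob.transpose(2, 0, 1)[np.newaxis]         # HWC -> 1, 3, 512, 512
```
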
models/public/modnet-photographic-portrait-matting/accuracy-check.yml (new file, +27)

```yaml
models:
  - name: modnet-photographic-portrait-matting
    launchers:
      - framework: openvino
        adapter: background_matting
    datasets:
      - name: HumanMattingDataset
        preprocessing:
          - type: resize
            size: 512
        postprocessing:
          - type: resize
            apply_to: annotation
            size: 512
        metrics:
          - type: mae
            name: MAD
            presenter: print_vector
            reference:
              mean: 5.213472
              std: 5.125874
          - type: mse
            name: MSE
            presenter: print_vector
            reference:
              mean: 727.952792
              std: 1196.277498
```
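
The two reference metrics above are plain pixel-wise statistics between the predicted and ground-truth mattes. A sketch of the definitions (assuming both arrays share one intensity scale; judging by the reference magnitudes, 0-255):

```python
import numpy as np

def mad(pred, gt):
    # MAD: mean of absolute difference between predicted and ground-truth alpha.
    return np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean()

def mse(pred, gt):
    # MSE: mean squared error.
    return ((pred.astype(np.float64) - gt.astype(np.float64)) ** 2).mean()
```
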
models/public/modnet-photographic-portrait-matting/model.py (new file, +26)

```python
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from torch import load
from modnet_onnx import MODNet


def create_modnet(weights):
    model = MODNet(backbone_pretrained=False)

    checkpoint = load(weights, map_location='cpu')
    ckpt = {k.replace('module.', ''): v for k, v in checkpoint.items()}
    model.load_state_dict(ckpt)

    return model
```
models/public/modnet-photographic-portrait-matting/model.yml (new file, +69)

```yaml
# Copyright (c) 2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

description: >-
  The "modnet-photographic-portrait-matting" model is a lightweight matting objective
  decomposition network (MODNet) for photographic portrait matting in real-time with
  a single input image with MobileNetV2 backbone. The model is pre-trained in PyTorch*
  framework and converted to ONNX* format.

  More details provided in the paper <https://arxiv.org/abs/2011.11961> and repository
  <https://github.com/ZHKKKe/MODNet>.
task_type: background_matting
files:
  - name: modnet_onnx.py
    size: 9516
    checksum: 37326740f2756572c639bb8089c70cfee6994f358fb185fe5f1c1f40c061a5d73f62319a8c2aa48d367778360751385e
    source: https://raw.githubusercontent.com/ZHKKKe/MODNet/2938675e4b5c60ab5f5d7a2b2191c68256f99d70/onnx/modnet_onnx.py
  - name: src/models/backbones/__init__.py
    size: 277
    checksum: cdb28b27092889a8293e189e80351087d0cd3135d3f0b56a4a4fd5ae28251bd0142e027d209cf63a24f6b6042255b384
    source: https://raw.githubusercontent.com/ZHKKKe/MODNet/2938675e4b5c60ab5f5d7a2b2191c68256f99d70/src/models/backbones/__init__.py
  - name: src/models/backbones/mobilenetv2.py
    size: 5588
    checksum: c100361a1b06a3751fd5b7720cb4a882153d62362421e51ef8ecccadf75acdcbd40376477739ac35f404a76388b01121
    source: https://raw.githubusercontent.com/ZHKKKe/MODNet/2938675e4b5c60ab5f5d7a2b2191c68256f99d70/src/models/backbones/mobilenetv2.py
  - name: src/models/backbones/wrapper.py
    size: 2610
    checksum: c627e513d6aca544c60fc7875c159e36563542c7d9a746e5b79c1a650118b6c8920561587061979d78993f12be906bbc
    source: https://raw.githubusercontent.com/ZHKKKe/MODNet/2938675e4b5c60ab5f5d7a2b2191c68256f99d70/src/models/backbones/wrapper.py
  - name: modnet_photographic_portrait_matting.ckpt
    size: 26255603
    checksum: 14fb9db68be32bbef0acd42d8925cf4750d2188c1bdd86399442a67f20f4a6f827f9ed1a56a3dc20258baaa004af9980
    original_source:
      $type: google_drive
      id: 1mcr7ALciuAsHCpLnrtG_eop5-EYhbCmz
    source: https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.2/modnet-photographic-portrait-matting/modnet_photographic_portrait_matting.ckpt
conversion_to_onnx_args:
  - --model-path=$dl_dir
  - --model-path=$config_dir
  - --model-name=create_modnet
  - --import-module=model
  - --model-param=weights=r"$dl_dir/modnet_photographic_portrait_matting.ckpt"
  - --input-shape=1,3,512,512
  - --input-names=input
  - --output-names=output
  - --output-file=$conv_dir/modnet_photographic_portrait_matting.onnx
input_info:
  - name: input
    shape: [1, 3, 512, 512]
    layout: NCHW
model_optimizer_args:
  - --input_model=$conv_dir/modnet_photographic_portrait_matting.onnx
  - --scale_values=input[127.5]
  - --mean_values=input[127.5, 127.5, 127.5]
  - --reverse_input_channels
  - --output=output
framework: pytorch
license: https://raw.githubusercontent.com/ZHKKKe/MODNet/master/LICENSE
```
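
The `conversion_to_onnx_args` above drive OMZ's PyTorch-to-ONNX conversion script; in essence they amount to a standard `torch.onnx.export` call. A simplified sketch of the equivalent export (the converter handles this internally; paths are hypothetical):

```python
import torch
from model import create_modnet  # the loader from model.py above

model = create_modnet('modnet_photographic_portrait_matting.ckpt')
model.eval()

dummy_input = torch.randn(1, 3, 512, 512)  # matches --input-shape
torch.onnx.export(model, dummy_input, 'modnet_photographic_portrait_matting.onnx',
                  input_names=['input'], output_names=['output'])
```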
