
Commit 5f504af

Added Token Merging method for OpenVINO (#669)
* Added Token Merging for OpenVINO
* Updated main README
* Update modules/token_merging/README.md
* Added tests for TokenMerging
* Fixed setup.py issue
* Fixed issues
* Removed .pyc file
* Added license

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
1 parent 7df7228 commit 5f504af

14 files changed (+1302, -0 lines)

.github/workflows/token_merging.yml

+39
@@ -0,0 +1,39 @@

```yaml
name: Token Merging - Test

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  Precommit:
    strategy:
      fail-fast: false
      matrix:
        python-version: [3.8]

    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Create and start a virtual environment
        run: |
          python -m venv venv
          source venv/bin/activate
      - name: Install dependencies
        run: |
          source venv/bin/activate
          python -m pip install --upgrade pip
          pip install modules/token_merging/[tests]
      - name: Run test
        run: |
          source venv/bin/activate
          python -m pytest modules/token_merging/tests/
```

.gitignore

+2
@@ -3,3 +3,5 @@

```diff
 
 **/*.png
 **/*.jar
+
+__pycache__/
```

README.md

+1
@@ -13,6 +13,7 @@ This list gives an overview of all modules available inside the contrib repository

```diff
 * [**java_api**](./modules/java_api): Inference Engine Java API -- provides Java wrappers for Inference Engine public API.
 * [**Azure Video Analyzer**](./modules/ovms_ai_extension/): Azure Video Analyzer Extension -- enables exchange of video frames and inference results between [Azure Video Analyzer (AVA)](https://docs.microsoft.com/en-us/azure/azure-video-analyzer/video-analyzer-docs/overview) and OpenVINO™ Model Server.
 * [**custom_operations**](./modules/custom_operations/): Collection of Custom Operations -- implement Custom Operations with OpenVINO Extensibility Mechanism.
+* [**Token Merging**](./modules/token_merging/): adaptation of [Token Merging method](https://arxiv.org/abs/2210.09461) for OpenVINO.
 
 ## How to build OpenVINO with extra modules
 You can build OpenVINO, so it will include the modules from this repository. Contrib modules are under constant development and it is recommended to use them alongside the master branch or latest releases of OpenVINO.
```

modules/token_merging/README.md

+61
@@ -0,0 +1,61 @@

# Token Merging for Stable Diffusion running with OpenVINO

This is an OpenVINO-adapted version of the Token Merging method. The method is applied to the PyTorch model before it is exported to the OpenVINO representation. It can also be stacked with 8-bit quantization to achieve a higher inference speed.
The repository contains implementations for:
- Stable Diffusion (HF Diffusers based models), see [example](https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/stable-diffusion).
- OpenCLIP, see [example](https://github.com/AlexKoff88/open_clip/blob/openvino_alt/tutorials/openvino/openvino_tome.ipynb).
- Timm

Here are the results for 100 iterations of 512x512 image generation on CPU.
![ToMe for SD applied on a 512x512 image.](examples/assets/tome_results.png)

This work is based on **ToMe for SD** from the paper:
**[Token Merging for Fast Stable Diffusion](https://arxiv.org/abs/2303.17604)**

ToMe for SD is an extension of the original **ToMe**:
**[Token Merging: Your ViT but Faster](https://arxiv.org/abs/2210.09461)**

**Note:** This also supports most downstream UIs that use these repositories.

## Installation

ToMe for SD requires ``pytorch >= 1.12.1`` (for `scatter_reduce`), which you can get from [here](https://pytorch.org/get-started/locally/). After installing your choice of stable diffusion environment ([supported environments](#supported-environments)), use the corresponding Python environment to install ToMe for SD:

```bash
pip install "git+https://github.com/openvinotoolkit/openvino_contrib.git#egg=tomeov&subdirectory=modules/token_merging"
```

## Usage
* Diffusers:
```py
import torch, tomeov
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Apply ToMe with a 30% merging ratio
tomeov.patch_stable_diffusion(pipe, ratio=0.3) # Can also use pipe.unet in place of pipe here

# Export the patched pipeline to the OpenVINO representation (as in the demo notebook)
save_dir = "stable_diffusion_optimized"
tomeov.export_diffusion_pipeline(pipe, save_dir)
```
* OpenCLIP:
```py
import torch, tomeov
import open_clip
from open_clip import tokenizer

model, _, preprocess = open_clip.create_model_and_transforms("ViT-B-16-plus-240", pretrained="laion400m_e32")

tomeov.patch_openclip(model, 8) # 8 - number of tokens merged in each MHSA from top down
```
* Timm:
```py
import torch, tomeov
import timm

model_name = 'vit_tiny_patch16_224'
model = timm.create_model(model_name, pretrained=True)

tomeov.patch_timm(model, 4) # 4 - number of tokens merged in each MHSA from top down
```
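
The Usage snippets above stop at patching for the OpenCLIP and Timm backends. The tests added in this commit then export the patched model through ONNX and compile it with OpenVINO; below is a minimal sketch of that flow for the Timm case, following the test code (the dummy input shape and file name are illustrative):

```py
import os
import tempfile

import openvino.runtime as ov
import timm
import tomeov
import torch

# Patch a small ViT so that 4 tokens are merged in each MHSA block, top down
model = timm.create_model("vit_tiny_patch16_224", pretrained=True)
tomeov.patch_timm(model, 4)

dummy_image = torch.rand(1, 3, 224, 224)  # illustrative input shape for this model

with tempfile.TemporaryDirectory() as tmpdir:
    model_file = os.path.join(tmpdir, "model.onnx")  # illustrative file name
    # Export the patched model to ONNX ...
    torch.onnx.export(
        model,
        dummy_image,
        model_file,
        opset_version=14,
        input_names=["image"],
        output_names=["output"],
        dynamic_axes={"image": {0: "batch"}, "output": {0: "batch"}},
    )
    # ... and compile it with OpenVINO for inference
    compiled_model = ov.compile_model(model_file)
```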

modules/token_merging/demo.ipynb

+131
@@ -0,0 +1,131 @@

## Token Merging for Stable Diffusion running with OpenVINO demo
This notebook demonstrates how to use the Token Merging method to accelerate a Stable Diffusion model running with OpenVINO. The method is applied to the PyTorch model before exporting it to the OpenVINO representation.

```py
import tomeov
from diffusers import StableDiffusionPipeline, DDPMScheduler
from diffusers.training_utils import set_seed
from optimum.intel.openvino import OVStableDiffusionPipeline
from IPython.display import display
```

```py
scheduler = DDPMScheduler(beta_start=0.00085, beta_end=0.012,
                          beta_schedule="scaled_linear", num_train_timesteps=1000)
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=scheduler)
pipe.safety_checker = lambda images, clip_input: (images, False)
```

* Create a pipeline with Token Merging applied to a Stable Diffusion model and export it to the OpenVINO representation.

```py
# Apply ToMe with a 30% merging ratio
tomeov.patch_stable_diffusion(pipe, ratio=0.3)  # Can also use pipe.unet in place of pipe here
```

```py
save_dir = "stable_diffusion_optimized"
tomeov.export_diffusion_pipeline(pipe, save_dir)
```

* Create an OpenVINO-based pipeline. We fix the image size for faster inference.

```py
set_seed(42)
ov_pipe = OVStableDiffusionPipeline.from_pretrained(save_dir, compile=False)
ov_pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)
ov_pipe.compile()
```

* Generate and display the image.

```py
set_seed(42)
prompt = "sailing ship in storm by Leonardo da Vinci"  # `prompt` is not defined earlier in the notebook; example prompt taken from the tests
output = ov_pipe(prompt, num_inference_steps=50, output_type="pil")
display(output.images[0])
```
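
The module README above cites timing results for 512x512 generation on CPU; that measurement code is not part of this commit. A rough way to time the compiled pipeline from the notebook's last cells is sketched below (the prompt and step count are illustrative):

```py
import time

prompt = "sailing ship in storm by Leonardo da Vinci"  # illustrative prompt
start = time.perf_counter()
ov_pipe(prompt, num_inference_steps=50, output_type="pil")  # ov_pipe from the notebook above
print(f"One 512x512 generation took {time.perf_counter() - start:.1f} s")
```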

modules/token_merging/setup.py

+23
@@ -0,0 +1,23 @@

```py
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from setuptools import find_packages, setup

EXTRAS_REQUIRE = {
    "tests": ["onnx", "onnxruntime", "accelerate", "diffusers", "openvino", "optimum", "optimum-intel", "open-clip-torch", "timm", "pytest"],
}

setup(
    name="tomeov",
    version="0.1.0",
    author="Alexander Kozlov",
    url="https://github.com/openvinotoolkit/openvino_contrib/tree/master/modules/token_merging",
    description="Token Merging for OpenVINO",
    install_requires=["torch~=1.13.1", "torchvision~=0.14.1"],
    dependency_links=["https://download.pytorch.org/whl/cpu"],
    extras_require=EXTRAS_REQUIRE,
    packages=find_packages(exclude=("examples", "build")),
    license="Apache 2.0",
    long_description=open("README.md", "r", encoding="utf-8").read(),
    long_description_content_type="text/markdown",
)
```

modules/token_merging/tests/…

+89

@@ -0,0 +1,89 @@

```py
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import tempfile
import unittest
import os

import numpy as np
from PIL import Image
import torch
import openvino.runtime as ov

import tomeov
from diffusers import StableDiffusionPipeline, DDPMScheduler
from optimum.intel.openvino import OVStableDiffusionPipeline
import open_clip
import timm


class TokenMergingIntegrationTest(unittest.TestCase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.OV_DIFFUSION_MODEL_ID = "hf-internal-testing/tiny-stable-diffusion-torch"
        self.OPENCLIP_MODEL = ("ViT-B-32", "laion400m_e32")
        self.TIMM_MODEL = "vit_tiny_patch16_224"

    def test_stable_diffusion(self):
        loaded_pipeline = StableDiffusionPipeline.from_pretrained(self.OV_DIFFUSION_MODEL_ID)
        prompt = "sailing ship in storm by Leonardo da Vinci"
        height = 128
        width = 128

        tomeov.patch_stable_diffusion(loaded_pipeline, ratio=0.3)

        with tempfile.TemporaryDirectory() as tmpdirname:
            tomeov.export_diffusion_pipeline(loaded_pipeline, tmpdirname)
            ov_pipe = OVStableDiffusionPipeline.from_pretrained(tmpdirname, compile=False)
            ov_pipe.reshape(batch_size=1, height=height, width=width, num_images_per_prompt=1)
            ov_pipe.compile()
            ov_pipe(prompt, num_inference_steps=1, height=height, width=width, output_type="np").images

    def test_openclip(self):
        model, _, transform = open_clip.create_model_and_transforms(self.OPENCLIP_MODEL[0], pretrained=self.OPENCLIP_MODEL[1])
        tomeov.patch_openclip(model, 8)
        dummy_image = np.random.rand(100, 100, 3) * 255
        dummy_image = Image.fromarray(dummy_image.astype("uint8"))
        dummy_image = transform(dummy_image).unsqueeze(0)

        with tempfile.TemporaryDirectory(suffix=".onnx") as tmpdirname:
            model_file = os.path.join(tmpdirname, "image_encoder.onnx")
            torch.onnx.export(
                model.visual,
                dummy_image,
                model_file,
                opset_version=14,
                input_names=["image"],
                output_names=["image_embedding"],
                dynamic_axes={
                    "image": {0: "batch"},
                    "image_embedding": {0: "batch"},
                }
            )
            compiled_model = ov.compile_model(model_file)
            self.assertTrue(compiled_model)

    def test_timm(self):
        model = timm.create_model(self.TIMM_MODEL, pretrained=False)

        tomeov.patch_timm(model, 4)  # 4 - number of tokens merged in each MHSA from top down

        dummy_image = torch.rand(1, 3, 224, 224)

        with tempfile.TemporaryDirectory(suffix=".onnx") as tmpdirname:
            model_file = os.path.join(tmpdirname, "model.onnx")
            torch.onnx.export(
                model,
                dummy_image,
                model_file,
                opset_version=14,
                input_names=["image"],
                output_names=["output"],
                dynamic_axes={
                    "image": {0: "batch"},
                    "output": {0: "batch"},
                }
            )
            compiled_model = ov.compile_model(model_file)
            self.assertTrue(compiled_model)
```

modules/token_merging/tomeov/__init__.py

+23

@@ -0,0 +1,23 @@

```py
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from .import_utils import (
    is_diffusers_available,
    is_openclip_available,
    is_timm_available,
)

__all__ = []

if is_diffusers_available():
    from .stable_diffusion import patch_stable_diffusion
    from .utils import export_diffusion_pipeline
    __all__ += ["patch_stable_diffusion", "export_diffusion_pipeline"]

if is_openclip_available():
    from .openclip import patch_openclip
    __all__ += ["patch_openclip"]

if is_timm_available():
    from .timm import patch_timm
    __all__ += ["patch_timm"]
```
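
The `import_utils` module referenced by this `__init__` is not included in the diff shown here. A minimal sketch of how such availability checks are commonly implemented, assuming a plain importlib probe (function names mirror the imports above; the real implementation may differ):

```py
# Hypothetical sketch of import_utils; the real module is not shown in this commit.
import importlib.util


def _is_available(name: str) -> bool:
    """Return True if the given package can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None


def is_diffusers_available() -> bool:
    return _is_available("diffusers")


def is_openclip_available() -> bool:
    return _is_available("open_clip")


def is_timm_available() -> bool:
    return _is_available("timm")
```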
