
Commit cb7c5b2

Support OVStableDiffusionPipelineBase to Load Textual Inversion in Runtime (#400)
* Enable OVStableDiffusionPipelineBase to load textual inversion embeddings in runtime
* Move OVTextualInversionLoaderMixin in loaders.py
* Reformat with black
* Fix format via make style
* Fix notebook format via make style
* Add docs for textual inversion
1 parent 54d5d79 commit cb7c5b2
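
At the user level, the feature added here follows one pattern in both new documentation sections: create the pipeline with `compile=False`, attach the embedding with `load_textual_inversion`, then compile before the first inference. A minimal sketch of that flow, using the model and concept repositories from the documentation below (the output filename is illustrative):

```python
# Minimal sketch of the runtime textual-inversion flow this commit enables;
# model and concept IDs are the ones used in the documentation below.
from optimum.intel import OVStableDiffusionPipeline

pipe = OVStableDiffusionPipeline.from_pretrained(
    "echarlaix/stable-diffusion-v1-5-openvino", export=False, compile=False
)
pipe.load_textual_inversion("sd-concepts-library/cat-toy", "<cat-toy>")
pipe.compile()  # compile after loading the embedding, before the first inference
image = pipe("A <cat-toy> back-pack", num_inference_steps=50).images[0]
image.save("cat_toy_backpack.png")  # illustrative output path
```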

File tree

6 files changed: +498 -4 lines changed

docs/source/inference.mdx

+79
@@ -208,6 +208,44 @@ In case you want to change any parameters such as the outputs height or width, y
<img src="https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/stable_diffusion_v1_5_sail_boat_rembrandt.png">
</div>

### Text-to-Image with Textual Inversion

Here is an example of how you can load an OpenVINO Stable Diffusion model with pre-trained textual inversion embeddings and run inference using OpenVINO Runtime:

First, you can run the original pipeline without textual inversion:

```python
from optimum.intel import OVStableDiffusionPipeline
import numpy as np

model_id = "echarlaix/stable-diffusion-v1-5-openvino"
prompt = "A <cat-toy> back-pack"
# Set a random seed for better comparison
np.random.seed(42)

pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=False, compile=False)
pipeline.compile()
image1 = pipeline(prompt, num_inference_steps=50).images[0]
image1.save("stable_diffusion_v1_5_without_textual_inversion.png")
```

Then, you can load the [sd-concepts-library/cat-toy](https://huggingface.co/sd-concepts-library/cat-toy) textual inversion embedding and run the pipeline with the same prompt again:

```python
# Reset stable diffusion pipeline
pipeline.clear_requests()

# Load textual inversion into stable diffusion pipeline
pipeline.load_textual_inversion("sd-concepts-library/cat-toy", "<cat-toy>")

# Compile the model before the first inference
pipeline.compile()
image2 = pipeline(prompt, num_inference_steps=50).images[0]
image2.save("stable_diffusion_v1_5_with_textual_inversion.png")
```

The left image shows the generation result of the original stable diffusion v1.5, and the right image shows the generation result of stable diffusion v1.5 with textual inversion.

| | |
|---|---|
| ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/textual_inversion/stable_diffusion_v1_5_without_textual_inversion.png) | ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/textual_inversion/stable_diffusion_v1_5_with_textual_inversion.png) |

### Image-to-Image

@@ -257,6 +295,47 @@ image.save("train_station.png")
|---|---|
| ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/train_station_friedrich.png) | ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/train_station_friedrich_2.png) |

### Text-to-Image with Textual Inversion

Here is an example of how you can load an SDXL OpenVINO model from [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) with pre-trained textual inversion embeddings and run inference using OpenVINO Runtime:

First, you can run the original pipeline without textual inversion:

```python
from optimum.intel import OVStableDiffusionXLPipeline
import numpy as np

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround of a beautiful woman wearing a black jacket and red shirt, best quality, intricate details."
# Set a random seed for better comparison
np.random.seed(0)

base = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=False, compile=False)
base.compile()
image1 = base(prompt, num_inference_steps=50).images[0]
image1.save("sdxl_without_textual_inversion.png")
```

Then, you can load the [charturnerv2](https://civitai.com/models/3036/charturner-character-turnaround-helper-for-15-and-21) textual inversion embedding and run the pipeline with the same prompt again:

```python
# Reset stable diffusion pipeline
base.clear_requests()

# Load textual inversion into stable diffusion pipeline
base.load_textual_inversion("./charturnerv2.pt", "charturnerv2")

# Compile the model before the first inference
base.compile()
image2 = base(prompt, num_inference_steps=50).images[0]
image2.save("sdxl_with_textual_inversion.png")
```

The left image shows the generation result of the original SDXL base 1.0, and the right image shows the generation result of SDXL base 1.0 with textual inversion.

| | |
|---|---|
| ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/textual_inversion/sdxl_without_textual_inversion.png) | ![](https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/textual_inversion/sdxl_with_textual_inversion.png) |

### Image-to-Image

notebooks/openvino/optimum_openvino_inference.ipynb

+3-1
@@ -344,7 +344,9 @@
 "from optimum.intel.openvino import OVModelForQuestionAnswering\n",
 "from transformers import AutoTokenizer, pipeline\n",
 "\n",
-"model = OVModelForQuestionAnswering.from_pretrained(\"helenai/distilbert-base-uncased-distilled-squad-ov-fp32\", compile=False)\n",
+"model = OVModelForQuestionAnswering.from_pretrained(\n",
+"    \"helenai/distilbert-base-uncased-distilled-squad-ov-fp32\", compile=False\n",
+")\n",
 "tokenizer = AutoTokenizer.from_pretrained(\"helenai/distilbert-base-uncased-distilled-squad-ov-fp32\")\n",
 "\n",
 "max_length = 128\n",

notebooks/openvino/stable_diffusion_optimization.ipynb

+6-2
@@ -69,7 +69,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"quantized_pipe = OVStableDiffusionPipeline.from_pretrained(\"OpenVINO/Stable-Diffusion-Pokemon-en-quantized\", compile=False)\n",
+"quantized_pipe = OVStableDiffusionPipeline.from_pretrained(\n",
+"    \"OpenVINO/Stable-Diffusion-Pokemon-en-quantized\", compile=False\n",
+")\n",
 "quantized_pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)\n",
 "quantized_pipe.compile()"
 ]
@@ -102,7 +104,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"optimized_pipe = OVStableDiffusionPipeline.from_pretrained(\"OpenVINO/stable-diffusion-pokemons-tome-quantized\", compile=False)\n",
+"optimized_pipe = OVStableDiffusionPipeline.from_pretrained(\n",
+"    \"OpenVINO/stable-diffusion-pokemons-tome-quantized\", compile=False\n",
+")\n",
 "optimized_pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)\n",
 "optimized_pipe.compile()"
 ]
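
Both cells follow the same static-shape workflow: load with `compile=False` so the pipeline can still be reshaped, fix the input shapes, then compile once. A minimal end-to-end sketch of that workflow; the prompt and output filename are illustrative:

```python
from optimum.intel import OVStableDiffusionPipeline

# Load without compiling so the pipeline can still be reshaped to static shapes
pipe = OVStableDiffusionPipeline.from_pretrained(
    "OpenVINO/Stable-Diffusion-Pokemon-en-quantized", compile=False
)
pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)
pipe.compile()

# Generation parameters must match the static shapes chosen above
image = pipe("cartoon bird", height=512, width=512).images[0]  # illustrative prompt
image.save("pokemon_bird.png")  # illustrative output path
```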

0 commit comments
