To simplify creation of a heterogeneous Stable Diffusion txt2image
pipeline, this adds a new API to the `Text2ImagePipeline` class:
```cpp
/**
* Compiles image generation pipeline for given devices for text encoding, denoising, and vae decoding.
* @param text_encode_device A device to compile text encoder(s) with
* @param denoise_device A device to compile denoiser (e.g. UNet, SD3 Transformer, etc.) with
* @param vae_decode_device A device to compile VAE decoder(s) with
* @param properties A map of properties which affect models compilation
* @note If pipeline was compiled before, an exception is thrown.
*/
void compile(const std::string& text_encode_device,
             const std::string& denoise_device,
             const std::string& vae_decode_device,
             const ov::AnyMap& properties = {});
```
(Feedback welcome here, especially on whether we technically need three
sets of properties, one per device.)
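One possible shape for per-device properties, shown purely as a hypothetical sketch (this overload is *not* part of this PR, and the parameter names are illustrative), would be:

```cpp
// Hypothetical alternative (NOT part of this PR): accept one property
// map per stage, so that e.g. NPU-specific compilation options only
// reach the denoiser and do not leak into the text encoder or VAE.
void compile(const std::string& text_encode_device,
             const std::string& denoise_device,
             const std::string& vae_decode_device,
             const ov::AnyMap& text_encode_properties = {},
             const ov::AnyMap& denoise_properties = {},
             const ov::AnyMap& vae_decode_properties = {});
```

The trade-off is a wider signature versus the current single `properties` map, which is simpler but cannot express per-device options.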
This API greatly simplifies heterogeneous pipeline setup to this:
```cpp
ov::genai::Text2ImagePipeline pipe(models_path);
pipe.reshape(1, width, height, pipe.get_generation_config().guidance_scale);
pipe.compile(text_encoder_device, unet_device, vae_decoder_device);
```
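For context, a complete heterogeneous run could then look like the following sketch (the model path, device names, and generation options are illustrative; valid values depend on your hardware and model):

```cpp
#include "openvino/genai/image_generation/text2image_pipeline.hpp"

int main() {
    // Illustrative path to an exported Stable Diffusion model directory.
    const std::string models_path = "path/to/stable-diffusion";
    ov::genai::Text2ImagePipeline pipe(models_path);

    // Reshape to static shapes first; this must happen before compile().
    pipe.reshape(1, 512, 512, pipe.get_generation_config().guidance_scale);

    // Compile each stage on its own device, e.g. denoising on NPU.
    pipe.compile("CPU", "NPU", "CPU");

    // Generate an image; the result is an ov::Tensor holding pixel data.
    ov::Tensor image = pipe.generate("a photo of a cat",
                                     ov::genai::num_inference_steps(20));
    return 0;
}
```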
With these changes, the heterogeneous Stable Diffusion sample can
support all variants of Stable Diffusion (SD1.5, LCM, SDXL, SD3, etc.)
with the same code. With the old method (creating sub-components and
assembling the pipeline object), this would have been difficult to
achieve.
With that said, this PR is tested and working with the following
pipelines (with the NPU running the denoise stage):
* SD1.5 / LCM
* SDXL
TODO:
* ~~Add python bindings for the new API~~
* ~~Update python heterogenous sample~~
**FUTURE WORK** (outside the scope of this PR):
* Add support for SD3 (this will be a separate PR)
  * In general, this requires fixes for openvinotoolkit/openvino#29113
  * There is also some weirdness in the current reshape() path that
    needs to be figured out.
  * For NPU, this requires a 'batch 1' implementation for Transformer2D,
    similar to what was done for UNet.
* Add support for FLUX (this will be a separate PR)
* Add an equivalent API for IMAGE2IMAGE / INPAINTING (separate PRs)
---------
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>