notebooks/image-to-image-genai/README.md (+6 −6)
@@ -5,7 +5,7 @@ Image-to-image is the task of transforming an input image through a variety of p
One of the most popular use cases of image-to-image is style transfer. With style transfer models:
* a regular photo can be transformed into a variety of artistic styles or genres, such as a watercolor painting, a comic book illustration and more.
* new images can be generated using a text prompt, in the style of a reference input image.
-
+
Latent diffusion models can be used for performing image-to-image generation. Diffusion-based image-to-image is similar to [text-to-image](../text-to-image-genai/text-to-image-genai.ipynb), but in addition to a prompt, you can also pass an initial image as a starting point for the diffusion process. The initial image is encoded to latent space and noise is added to it. Then the latent diffusion model takes a prompt and the noisy latent image, predicts the added noise, and removes the predicted noise from the initial latent image to get the new latent image. Lastly, a decoder decodes the new latent image back into an image.
@@ -18,15 +18,15 @@ In this tutorial, we consider how to use OpenVINO GenAI for performing image-to-
This library is friendly to PC and laptop execution, and optimized for resource consumption. It requires no external dependencies to run generative models as it already includes all the core functionality (e.g. tokenization via openvino-tokenizers).
-OpenVINO GenAI supports popular diffusion models like Stable Diffusion or SDXL for performing image generation. You can find supported models list in [OpenVINO GenAI documentation](https://github.com/openvinotoolkit/openvino.genai/blob/master/SUPPORTED_MODELS.md#image-generation-models). Previously, we considered how to run text-to-image generation with OpenVINO GenAI and apply multiple LoRA adapters, mow is image-to-image.
+OpenVINO GenAI supports popular diffusion models like Stable Diffusion or SDXL for performing image generation. You can find the list of supported models in the [OpenVINO GenAI documentation](https://github.com/openvinotoolkit/openvino.genai/blob/master/SUPPORTED_MODELS.md#image-generation-models). Previously, we considered how to run text-to-image generation with OpenVINO GenAI and apply multiple LoRA adapters; now we turn to image-to-image.
## Notebook Contents
-In this notebook we will demonstrate how to use Latent Diffusion models like Stable Diffusion 1.5, 2.1, LCM, SDXL for image to image generation using OpenVINO GenAI Image2ImagePipeline.
-All it takes is two steps:
+In this notebook we will demonstrate how to use latent diffusion models such as Stable Diffusion 1.5, 2.1, LCM, and SDXL for image-to-image generation using the OpenVINO GenAI Image2ImagePipeline.
+All it takes is two steps:
1. Export OpenVINO IR format model using the [Hugging Face Optimum](https://huggingface.co/docs/optimum/installation) library accelerated by OpenVINO integration.
The Hugging Face Optimum Intel API is a high-level API that enables us to convert and quantize models from the Hugging Face Transformers library to the OpenVINO™ IR format. For more details, refer to the [Hugging Face Optimum Intel documentation](https://huggingface.co/docs/optimum/intel/inference).
-1. Run inference using the standard [Image-to-Image Generation pipeline](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html) from OpenVINO GenAI.
+2. Run inference using the standard [Image-to-Image Generation pipeline](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) from OpenVINO GenAI.
The tutorial consists of the following steps:
- Install prerequisites
@@ -38,7 +38,7 @@ The tutorial consists of following steps:
- Explore advanced options for generation results improvement
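For context, here is a minimal sketch of the two steps this README describes, roughly following the OpenVINO GenAI image-to-image sample; the model id, file names, and parameter values are illustrative assumptions:

```python
# Step 1 (shell, run once): export the model to OpenVINO IR with Optimum
#   optimum-cli export openvino --model dreamlike-art/dreamlike-anime-1.0 model_dir

# Step 2: image-to-image generation with OpenVINO GenAI
import numpy as np
from PIL import Image
import openvino as ov
import openvino_genai as ov_genai

pipe = ov_genai.Image2ImagePipeline("model_dir", "CPU")

# wrap the initial image as an NHWC uint8 tensor
init_image = np.array(Image.open("input.png").convert("RGB"))[None]
image_tensor = ov.Tensor(init_image)

# strength controls how much noise is added to the initial latent image;
# higher values deviate further from the input
result = pipe.generate("watercolor painting of a mountain lake",
                       image_tensor, strength=0.7)
Image.fromarray(result.data[0]).save("output.png")
```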
notebooks/llm-agent-functioncall/llm-agent-functioncall-qwen.ipynb (+1 −1)
@@ -305,7 +305,7 @@
"id": "d70905e2",
"metadata": {},
"source": [
-"You can get additional inference speed improvement with [Dynamic Quantization of activations and KV-cache quantization on CPU](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-hf.html#enabling-openvino-runtime-optimizations). These options can be enabled with `ov_config` as follows:"
+"You can get additional inference speed improvement with [Dynamic Quantization of activations and KV-cache quantization on CPU](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-optimum-intel.html#enabling-openvino-runtime-optimizations). These options can be enabled with `ov_config` as follows:"
notebooks/llm-agent-react/llm-agent-react-langchain.ipynb (+1 −1)
@@ -556,7 +556,7 @@
"id": "d70905e2",
"metadata": {},
"source": [
-"You can get additional inference speed improvement with [Dynamic Quantization of activations and KV-cache quantization on CPU](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-hf.html#enabling-openvino-runtime-optimizations). These options can be enabled with `ov_config` as follows:"
+"You can get additional inference speed improvement with [Dynamic Quantization of activations and KV-cache quantization on CPU](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-optimum-intel.html#enabling-openvino-runtime-optimizations). These options can be enabled with `ov_config` as follows:"
notebooks/llm-agent-react/llm-agent-react.ipynb (+1 −1)
@@ -240,7 +240,7 @@
"Model class initialization starts with calling `from_pretrained` method. When downloading and converting Transformers model, the parameter `export=True` should be added (as we already converted model before, we do not need to provide this parameter). We can save the converted model for the next usage with the `save_pretrained` method.\n",
"Tokenizer class and pipelines API are compatible with Optimum models.\n",
"\n",
-"You can find more details about OpenVINO LLM inference using HuggingFace Optimum API in [LLM inference guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html)."
+"You can find more details about OpenVINO LLM inference using HuggingFace Optimum API in [LLM inference guide](https://docs.openvino.ai/2025/openvino-workflow-generative.html)."
notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb (+2 −2)
@@ -193,7 +193,7 @@
"* **INT8** is an 8-bit weight-only quantization provided by [NNCF](https://github.com/openvinotoolkit/nncf): This method compresses weights to an 8-bit integer data type, which balances model size reduction and accuracy, making it a versatile option for a broad range of applications.\n",
"* **INT4** is an 4-bit weight-only quantization provided by [NNCF](https://github.com/openvinotoolkit/nncf). involves quantizing weights to an unsigned 4-bit integer symmetrically around a fixed zero point of eight (i.e., the midpoint between zero and 15). in case of **symmetric quantization** or asymmetrically with a non-fixed zero point, in case of **asymmetric quantization** respectively. Compared to INT8 compression, INT4 compression improves performance even more, but introduces a minor drop in prediction quality. INT4 it ideal for situations where speed is prioritized over an acceptable trade-off against accuracy.\n",
"* **INT4 AWQ** is an 4-bit activation-aware weight quantization. [Activation-aware Weight Quantization](https://arxiv.org/abs/2306.00978) (AWQ) is an algorithm that tunes model weights for more accurate INT4 compression. It slightly improves generation quality of compressed LLMs, but requires significant additional time for tuning weights on a calibration dataset. We will use `wikitext-2-raw-v1/train` subset of the [Wikitext](https://huggingface.co/datasets/Salesforce/wikitext) dataset for calibration.\n",
-"* **INT4 NPU-friendly** is an 4-bit channel-wise quantization. This approach is [recommended](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide-npu.html) for LLM inference using NPU.\n",
+"* **INT4 NPU-friendly** is a 4-bit channel-wise quantization. This approach is [recommended](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai-on-npu.html) for LLM inference using NPU.\n",
"\n",
"<details>\n",
" <summary><b>Click here to see available models options</b></summary>\n",
@@ -624,7 +624,7 @@
"\n",
"The difference between chatbot and instruction-following pipelines is that the model should have \"memory\" to find correct answers on the chain of connected questions. OpenVINO GenAI uses `KVCache` representation for maintain a history of conversation. By default, `LLMPipeline` resets `KVCache` after each `generate` call. To keep conversational history, we should move LLMPipeline to chat mode using `start_chat()` method.\n",
"\n",
-"More info about OpenVINO LLM inference can be found in [LLM Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html)\n",
+"More info about OpenVINO LLM inference can be found in [LLM Inference Guide](https://docs.openvino.ai/2025/openvino-workflow-generative.html)\n",
notebooks/llm-chatbot/llm-chatbot.ipynb (+3 −3)
@@ -904,7 +904,7 @@
"Model class initialization starts with calling `from_pretrained` method. When downloading and converting Transformers model, the parameter `export=True` should be added (as we already converted model before, we do not need to provide this parameter). We can save the converted model for the next usage with the `save_pretrained` method.\n",
"Tokenizer class and pipelines API are compatible with Optimum models.\n",
"\n",
-"You can find more details about OpenVINO LLM inference using HuggingFace Optimum API in [LLM inference guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html)."
+"You can find more details about OpenVINO LLM inference using HuggingFace Optimum API in [LLM inference guide](https://docs.openvino.ai/2025/openvino-workflow-generative.html)."
"As can be seen, the pipeline very similar to instruction-following with only changes that previous conversation history additionally passed as input with next user question for getting wider input context. On the first iteration, the user provided instructions joined to conversation history (if exists) converted to token ids using a tokenizer, then prepared input provided to the model. The model generates probabilities for all tokens in logits format The way the next token will be selected over predicted probabilities is driven by the selected decoding methodology. You can find more information about the most popular decoding methods in this [blog](https://huggingface.co/blog/how-to-generate). The result generation updates conversation history for next conversation step. it makes stronger connection of next question with previously provided and allows user to make clarifications regarding previously provided answers.https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html"
1013
+
"As can be seen, the pipeline very similar to instruction-following with only changes that previous conversation history additionally passed as input with next user question for getting wider input context. On the first iteration, the user provided instructions joined to conversation history (if exists) converted to token ids using a tokenizer, then prepared input provided to the model. The model generates probabilities for all tokens in logits format The way the next token will be selected over predicted probabilities is driven by the selected decoding methodology. You can find more information about the most popular decoding methods in this [blog](https://huggingface.co/blog/how-to-generate). The result generation updates conversation history for next conversation step. it makes stronger connection of next question with previously provided and allows user to make clarifications regarding previously provided answers.https://docs.openvino.ai/2025/openvino-workflow-generative.html"
]
},
{
@@ -1038,7 +1038,7 @@
" - **Medium top_p** (e.g., 0.8): The AI model considers tokens with a higher cumulative probability, such as \"playing,\"\"sleeping,\" and \"eating.\"\n",
" - **High top_p** (e.g., 1.0): The AI model considers all tokens, including those with lower probabilities, such as \"driving\" and \"flying.\"\n",
" * `Top-k` is an another popular sampling strategy. In comparison with Top-P, which chooses from the smallest possible set of words whose cumulative probability exceeds the probability P, in Top-K sampling K most likely next words are filtered and the probability mass is redistributed among only those K next words. In our example with cat, if k=3, then only \"playing\", \"sleeping\" and \"eating\" will be taken into account as possible next word.\n",
-" * `Repetition Penalty` This parameter can help penalize tokens based on how frequently they occur in the text, including the input prompt. A token that has already appeared five times is penalized more heavily than a token that has appeared only one time. A value of 1 means that there is no penalty and values larger than 1 discourage repeated tokens.https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html"
+" * `Repetition Penalty`: this parameter can help penalize tokens based on how frequently they occur in the text, including the input prompt. A token that has already appeared five times is penalized more heavily than a token that has appeared only once. A value of 1 means that there is no penalty, and values larger than 1 discourage repeated tokens. More details: https://docs.openvino.ai/2025/openvino-workflow-generative.html"
notebooks/multilora-image-generation/multilora-image-generation.ipynb (+1 −1)
@@ -104,7 +104,7 @@
"LoRA can be easily added to [Diffusers pipeline](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters#lora) before export. At the export stage, LoRA weights will be fused to original model weights and converted model will preserve LoRA provided behavior. This approach is suitable when you need model with adapter capabilities by default and it does not required configuration at inference time (e.g. changing weight coefficient for adapter).\n",
"For example, we can use this method for speedup generation process with integration [LCM LoRA](https://huggingface.co/blog/lcm_lora). Previously, we already considered with approach in this [tutorial](../latent-consistency-models-image-generation/lcm-lora-controlnet.ipynb).\n",
"\n",
-"Using `optimum-cli` for exporting models requires to provide model id on HuggingFace Hub or local directory with saved model. In case, if model stored in multiple separated repositories or directories (e.g. you want to replace VAE component or add LoRA), it should be merged and saved on disk before export. For avoiding this, we will use `export_from_model` function that accepts initialized model. Additionally, for using model with OpenVINO GenAI, we need to export tokenizers to OpenVINO format using [OpenVINO Tokenizers](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/ov-tokenizers.html) library.\n",
+"Using `optimum-cli` for exporting models requires providing a model id on the HuggingFace Hub or a local directory with the saved model. If the model is stored in multiple separate repositories or directories (e.g. you want to replace the VAE component or add LoRA), it should be merged and saved on disk before export. To avoid this, we will use the `export_from_model` function that accepts an initialized model. Additionally, for using the model with OpenVINO GenAI, we need to export tokenizers to OpenVINO format using the [OpenVINO Tokenizers](https://docs.openvino.ai/2025/openvino-workflow-generative/ov-tokenizers.html) library.\n",
"\n",
"In this tutorial we will use [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model, but the same steps are also applicable to other models of Stable Diffusion family."
notebooks/speculative-sampling/speculative-sampling.ipynb (+1 −1)
@@ -90,7 +90,7 @@
"As example, we will use already converted LLMs from [OpenVINO collection](https://huggingface.co/collections/OpenVINO/llm-6687aaa2abca3bbcec71a9bd).\n",
"You can find OpenVINO optimized FastDraft models can be found in this [collection](https://huggingface.co/collections/OpenVINO/speculative-decoding-draft-models-673f5d944d58b29ba6e94161). As example we will use [Phi-3-mini-4k-instruct-int4-ov](https://huggingface.co/OpenVINO/Phi-3-mini-4k-instruct-int4-ov) as target model and [Phi-3-mini-FastDraft-50M-int8-ov](https://huggingface.co/OpenVINO/Phi-3-mini-FastDraft-50M-int8-ov) as draft.\n",
"\n",
-"In case, if you want run own models, you should convert them using [Hugging Face Optimum](https://huggingface.co/docs/optimum/intel/openvino/export) library accelerated by OpenVINO integration. More details about model preparation can be found in [OpenVINO LLM inference guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/llm-inference-native-ov.html#convert-hugging-face-tokenizer-and-model-to-openvino-ir-format)"
+"If you want to run your own models, you should convert them using the [Hugging Face Optimum](https://huggingface.co/docs/optimum/intel/openvino/export) library accelerated by OpenVINO integration. More details about model preparation can be found in the [OpenVINO LLM inference guide](https://docs.openvino.ai/2025/openvino-workflow-generative/genai-model-preparation.html)"
notebooks/text-to-image-genai/README.md (+2 −2)
@@ -8,10 +8,10 @@ In this tutorial we consider how to use OpenVINO GenAI for image generation scen
## Notebook Contents
-In this notebook we will demonstrate how to use text to image models like Stable Diffusion 1.5, 2.1, LCM using [Dreamlike Anime 1.0](https://huggingface.co/dreamlike-art/dreamlike-anime-1.0) as an example. All it takes is two steps:
+In this notebook we will demonstrate how to use text-to-image models such as Stable Diffusion 1.5, 2.1, and LCM, using [Dreamlike Anime 1.0](https://huggingface.co/dreamlike-art/dreamlike-anime-1.0) as an example. All it takes is two steps:
1. Export OpenVINO IR format model using the [Hugging Face Optimum](https://huggingface.co/docs/optimum/installation) library accelerated by OpenVINO integration.
The Hugging Face Optimum Intel API is a high-level API that enables us to convert and quantize models from the Hugging Face Transformers library to the OpenVINO™ IR format. For more details, refer to the [Hugging Face Optimum Intel documentation](https://huggingface.co/docs/optimum/intel/inference).
-2. Run inference using the standard [Text-to-Image Generation pipeline](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html) from OpenVINO GenAI.
+2. Run inference using the standard [Text-to-Image Generation pipeline](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) from OpenVINO GenAI.
notebooks/text-to-image-genai/text-to-image-genai.ipynb (+1 −1)
@@ -15,7 +15,7 @@
"In this notebook we will demonstrate how to use text to image models like Stable Diffusion 1.5, 2.1, LCM using [Dreamlike Anime 1.0](https://huggingface.co/dreamlike-art/dreamlike-anime-1.0) as an example. All it takes is two steps: \n",
"1. Export OpenVINO IR format model using the [Hugging Face Optimum](https://huggingface.co/docs/optimum/installation) library accelerated by OpenVINO integration.\n",
"The Hugging Face Optimum Intel API is a high-level API that enables us to convert and quantize models from the Hugging Face Transformers library to the OpenVINO™ IR format. For more details, refer to the [Hugging Face Optimum Intel documentation](https://huggingface.co/docs/optimum/intel/inference).\n",
-"2. Run inference using the [Text-to-Image Generation pipeline](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html) from OpenVINO GenAI.\n",
+"2. Run inference using the [Text-to-Image Generation pipeline](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) from OpenVINO GenAI.\n",