
Commit 57e14de

Add llm-lora.ipynb notebook (#2847)
[CVS-164578](https://jira.devtools.intel.com/browse/CVS-164578)
1 parent 48c7af9 commit 57e14de

File tree: 3 files changed, +615 -0 lines changed

.ci/spellcheck/.pyspelling.wordlist.txt (+8)

```diff
@@ -17,6 +17,7 @@ Agentic
 agentic
 ai
 al
+AdapterConfig
 AISE
 AISEClassification
 AISEDetection
@@ -266,6 +267,7 @@ english
 ENLSP
 enum
 et
+emilykang
 Evol
 EVS
 eXplainable
@@ -410,6 +412,7 @@ IRs
 iteratively
 JAX
 JAX's
+Javascript
 JFLEG
 JIT
 Jina
@@ -496,6 +499,7 @@ logits
 LogSoftmax
 LoRA
 LoRAs
+lora
 lraspp
 LRASPP
 LTS
@@ -572,6 +576,7 @@ mpt
 MPT
 MRPC
 mRoPE
+medprob
 MTVQA
 multiarchitecture
 Multiclass
@@ -954,6 +959,7 @@ Swin
 SwiGLU
 SwinV
 sym
+snshrivas
 TaskManager
 TartanAir
 tbb
@@ -1002,6 +1008,8 @@ tunable
 tv
 TwoStreamInterleaveTransformer
 TypeScript
+tinyllama
+TinyLLama
 Udnie
 UHD
 UI
```

notebooks/llm-lora/README.md (+32, new file)

# Text Generation with LoRA via OpenVINO GenAI

LoRA, or [Low-Rank Adaptation](https://arxiv.org/abs/2106.09685), is a popular and lightweight training technique for fine-tuning Large Language Models and Stable Diffusion models without full model retraining. Full fine-tuning of larger models (consisting of billions of parameters) is inherently expensive and time-consuming. LoRA instead adds a small number of new weights to the model and trains only those, rather than retraining the entire parameter space. This makes training much faster and more memory-efficient, and produces smaller model weights (a few hundred MBs) that are easier to store and share.
At its core, LoRA leverages low-rank matrix factorization. Instead of updating all the parameters in a neural network, LoRA decomposes the weight update into the product of two low-rank matrices. This decomposition captures the essential information with far fewer parameters, significantly reducing the data and computation required for fine-tuning. It also vastly reduces the storage needed for large language models adapted to specific tasks, and enables efficient task-switching during deployment, all without introducing inference latency.
![](https://github.com/user-attachments/assets/bf823c71-13b4-402c-a7b4-d6fc30a60d88)
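To make the parameter savings concrete, here is a minimal NumPy sketch of the decomposition (the hidden size and rank below are illustrative assumptions, not values from the notebook):

```python
import numpy as np

d, r = 4096, 8                      # illustrative: hidden size d, LoRA rank r << d

W = np.random.randn(d, d)           # frozen pre-trained weight
A = np.random.randn(r, d) * 0.01    # trainable low-rank factor (small random init)
B = np.zeros((d, r))                # trainable low-rank factor (zero init, so B @ A starts at 0)

# Instead of learning a full d x d update, LoRA learns only B and A:
W_adapted = W + B @ A

print(f"full update: {d * d:,} parameters")       # 16,777,216
print(f"LoRA update: {2 * d * r:,} parameters")   # 65,536 (~0.4% of the full update)
```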
Some more advantages of using LoRA:

* LoRA makes fine-tuning more efficient by drastically reducing the number of trainable parameters.
* The original pre-trained weights are kept frozen, which means you can have multiple lightweight and portable LoRA models for various downstream tasks built on top of them.
* LoRA is orthogonal to many other parameter-efficient methods and can be combined with many of them.
* The performance of models fine-tuned with LoRA is comparable to that of fully fine-tuned models.
* LoRA does not add any inference latency, because the adapter weights can be merged with the base model (see the sketch below).
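As a rough illustration of that last point, the merge can be sketched in NumPy (the `alpha / r` scaling follows the LoRA paper; shapes are again illustrative):

```python
import numpy as np

d, r, alpha = 4096, 8, 16
W = np.random.randn(d, d)            # frozen base weight
B = np.random.randn(d, r) * 0.01     # trained LoRA factors
A = np.random.randn(r, d) * 0.01

# Folding the adapter into the base weight is a one-time, offline step,
# so the deployed model multiplies by a single dense matrix again.
W_merged = W + (alpha / r) * (B @ A)

# Inference through the merged weight matches base + adapter applied separately:
x = np.random.randn(d)
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```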
More details about LoRA can be found in the Hugging Face [conceptual guide](https://huggingface.co/docs/peft/conceptual_guides/lora) and [blog post](https://huggingface.co/blog/peft).

In this tutorial, we explore how to use LoRA adapters for text generation with the OpenVINO GenAI API.
## Notebook Contents

This notebook demonstrates how to perform text generation using OpenVINO GenAI with LoRA adapters.

The tutorial consists of the following steps:

- Load and configure LoRA adapters
- Run inference with the OpenVINO GenAI `LLMPipeline` (see the sketch after this list)
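The flow looks roughly like the sketch below, modeled on the OpenVINO GenAI LoRA sample; the model and adapter paths are placeholders, and the exact arguments may differ from what the notebook uses:

```python
import openvino_genai

# Load a LoRA adapter from a safetensors file (placeholder path)
adapter = openvino_genai.Adapter("adapter_model.safetensors")
adapter_config = openvino_genai.AdapterConfig(adapter)

# Create the pipeline with the adapter registered up front (placeholder model dir)
pipe = openvino_genai.LLMPipeline("TinyLlama-ov", "CPU", adapters=adapter_config)

# Generate with the adapter applied ...
print(pipe.generate("What is LoRA?", max_new_tokens=100))

# ... then switch the adapter off on the same pipeline, without reloading the model
print(pipe.generate("What is LoRA?", max_new_tokens=100,
                    adapters=openvino_genai.AdapterConfig()))
```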
## Installation Instructions

We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](../../README.md).
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/llm-lora/README.md" />
