Commit 9c4bccc

Update README.md (#1068)
* update readme
* update readme
* update ipex readme
* update ipex notebook
* Update README.md
* Update notebooks/ipex/text_generation.ipynb

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
1 parent: 09d080f

File tree: 2 files changed (+4 −4 lines)

README.md (+2 −2)
```diff
@@ -6,7 +6,7 @@
 
 🤗 Optimum Intel is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.
 
-[Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) is an open-source library which provides optimizations for both eager mode and graph mode, however, compared to eager mode, graph mode in PyTorch* normally yields better performance from optimization techniques, such as operation fusion.
+[Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) is an open-source library which provides optimizations like faster attention and operators fusion.
 
 Intel [Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html) is an open-source library enabling the usage of the most popular compression techniques such as quantization, pruning and knowledge distillation. It supports automatic accuracy-driven tuning strategies in order for users to easily generate quantized model. The users can easily apply static, dynamic and aware-training quantization approaches while giving an expected accuracy criteria. It also supports different weight pruning techniques enabling the creation of pruned model giving a predefined sparsity target.
 
```
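The Neural Compressor paragraph kept as context above describes accuracy-driven quantization in the abstract; as a rough illustration, a dynamic post-training quantization run through Optimum Intel looks like the minimal sketch below, assuming the `INCQuantizer` API (the checkpoint and save directory are illustrative, and static quantization would additionally need a calibration dataset):

```python
# Minimal sketch: dynamic post-training quantization via Neural Compressor,
# assuming Optimum Intel's INCQuantizer API; checkpoint and paths are illustrative.
from neural_compressor.config import PostTrainingQuantConfig
from optimum.intel import INCQuantizer
from transformers import AutoModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Dynamic quantization needs no calibration data; the static and
# quantization-aware approaches mentioned above would also take an
# accuracy criterion to drive the tuning loop.
quantization_config = PostTrainingQuantConfig(approach="dynamic")
quantizer = INCQuantizer.from_pretrained(model)
quantizer.quantize(quantization_config=quantization_config, save_directory="quantized_model")
```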

````diff
@@ -159,7 +159,7 @@ optimized_model = OVModelForSequenceClassification.from_pretrained(save_dir)
 
 
 ## IPEX
-To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. You can set `export=True` to load a PyTorch checkpoint, export your model via TorchScript and apply IPEX optimizations : both operators optimization (replaced with customized IPEX operators) and graph-level optimization (like operators fusion) will be applied on your model.
+To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. It will load a PyTorch checkpoint, and apply IPEX operators optimization (replaced with customized IPEX operators).
 ```diff
 from transformers import AutoTokenizer, pipeline
 - from transformers import AutoModelForCausalLM
````
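Applied to the README's own snippet, the updated usage amounts to the sketch below. It assumes the rest of the README example around the lines shown in the hunk; the prompt is illustrative:

```python
# Sketch of the README example after this change: swap AutoModelForCausalLM
# for IPEXModelForCausalLM; no export=True flag is involved anymore.
from transformers import AutoTokenizer, pipeline
from optimum.intel import IPEXModelForCausalLM

model_id = "gpt2"  # the checkpoint used elsewhere in this commit
model = IPEXModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("This is an illustrative prompt")[0]["generated_text"])
```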

notebooks/ipex/text_generation.ipynb (+2 −2)
```diff
@@ -11,7 +11,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. You can set `export=True` to load a PyTorch checkpoint, export your model via TorchScript and apply IPEX optimizations : both operators optimization (replaced with customized IPEX operators) and graph-level optimization (like operators fusion) will be applied on your model."
+    "To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. It could apply IPEX, providing optimizations like faster attention and operators fusion."
    ]
   },
   {
@@ -60,7 +60,7 @@
     }
    ],
    "source": [
-    "model = IPEXModelForCausalLM.from_pretrained(\"gpt2\", torch_dtype=torch.bfloat16, export=True)\n",
+    "model = IPEXModelForCausalLM.from_pretrained(\"gpt2\", torch_dtype=torch.bfloat16)\n",
    "tokenizer = AutoTokenizer.from_pretrained(\"gpt2\")\n",
    "input_sentence = [\"Answer the following yes/no question by reasoning step-by-step please. Can you write a whole Haiku in a single tweet?\"]\n",
    "model_inputs = tokenizer(input_sentence, return_tensors=\"pt\")\n",
```
