Commit 4871bba

add ipex readme
1 parent 72b0630 commit 4871bba

File tree

1 file changed

+28 -0 lines changed

README.md

+28
@@ -44,6 +44,34 @@ where `extras` can be one or more of `ipex`, `neural-compressor`, `openvino`, `n

# Quick tour
## IPEX

Below is an example of how to use an IPEX model to generate text.
### generate

```diff
import torch
from transformers import AutoTokenizer, AutoConfig
- from transformers import AutoModelForCausalLM
+ from optimum.intel.ipex import IPEXModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
# export=True converts the transformers checkpoint into an IPEX-optimized model at load time
model = IPEXModelForCausalLM.from_pretrained(
    "gpt2",
    config=config,
    torch_dtype=torch.bfloat16,
    export=True,
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
input_sentence = ["Answer the following yes/no question by reasoning step-by-step please. Can you write a whole Haiku in a single tweet?"]
model_inputs = tokenizer(input_sentence, return_tensors="pt")
generation_kwargs = dict(max_new_tokens=32, do_sample=False, num_beams=4, num_beam_groups=1, no_repeat_ngram_size=2, use_cache=True)

generated_ids = model.generate(**model_inputs, **generation_kwargs)
output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(output)
```

For more details, please refer to the [documentation](https://intel.github.io/intel-extension-for-pytorch/#introduction).
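Since `optimum-intel` models are meant to be drop-in replacements for their `transformers` counterparts, the same model should also work with the `pipeline` API. A minimal sketch, assuming that pipeline compatibility (this snippet is not part of this commit):

```python
from transformers import AutoTokenizer, pipeline
from optimum.intel.ipex import IPEXModelForCausalLM

# Sketch (assumption, not from this commit): reuse the IPEX-optimized model
# with the standard transformers text-generation pipeline.
model = IPEXModelForCausalLM.from_pretrained("gpt2", export=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("He never went out without a book under his arm", max_new_tokens=32)[0]["generated_text"])
```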
## Neural Compressor

Dynamic quantization can be used through the Optimum command-line interface:
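The exact command is cut off in this view; below is a minimal sketch of such an invocation, where the model name and output path are illustrative assumptions:

```bash
# Illustrative example: the model name and output directory are assumptions,
# not taken from this commit.
optimum-cli inc quantize --model distilbert-base-cased-distilled-squad --output ./quantized_distilbert
```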
