
Pipeline #501 (Merged)

37 commits, merged May 15, 2024

Commits:
1b89624  define optimum-intel pipeline (jiqing-feng, Jan 8, 2024)
2bf2122  add tests and readme (jiqing-feng, Jan 8, 2024)
db10723  fix pipelines example (jiqing-feng, Jan 8, 2024)
24f26db  fix readme codestyle (jiqing-feng, Jan 9, 2024)
8394d41  Merge branch 'huggingface:main' into pipeline (jiqing-feng, Jan 9, 2024)
39b7804  add _load_model in pipeline (jiqing-feng, Jan 9, 2024)
b0f21e9  Merge branch 'huggingface:main' into pipeline (jiqing-feng, Mar 28, 2024)
d37ff18  update pipeline for optimum intel (jiqing-feng, Apr 2, 2024)
6882417  update tests (jiqing-feng, Apr 2, 2024)
64c546c  remove readme (jiqing-feng, Apr 2, 2024)
4d69d40  Merge branch 'huggingface:main' into pipeline (jiqing-feng, Apr 2, 2024)
29ad8b2  Update optimum/intel/pipelines/__init__.py (jiqing-feng, Apr 3, 2024)
b5392c1  fix pipelines (jiqing-feng, Apr 7, 2024)
f294f74  add all supported tasks testing (jiqing-feng, Apr 7, 2024)
7510036  add hub_kwargs and model_kwargs on tokenizer and feature_extractor (jiqing-feng, Apr 15, 2024)
faba83f  Merge branch 'huggingface:main' into pipeline (jiqing-feng, Apr 15, 2024)
9e8ce0e  add hub_kwargs and default pipeline tests (jiqing-feng, Apr 25, 2024)
6056612  Merge branch 'huggingface:main' into pipeline (jiqing-feng, Apr 28, 2024)
5013fe7  fix _from_transformers args (jiqing-feng, Apr 28, 2024)
a39112f  rm default pipeline test (jiqing-feng, Apr 29, 2024)
f401b55  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
e784dd2  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
6fb8863  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
79ae3d9  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
cfbcf9f  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
3760e1e  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
112a9c2  Merge branch 'main' into pipeline (jiqing-feng, May 6, 2024)
6d4726b  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
4effaa4  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 6, 2024)
bf2ae08  fix comments (jiqing-feng, May 6, 2024)
184a610  Update optimum/exporters/openvino/model_patcher.py (echarlaix, May 14, 2024)
abe8704  Update optimum/intel/ipex/modeling_base.py (jiqing-feng, May 15, 2024)
aa4d4e6  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 15, 2024)
ea756b0  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 15, 2024)
7f92191  Update optimum/intel/pipelines/pipeline_base.py (jiqing-feng, May 15, 2024)
332e863  Merge branch 'huggingface:main' into pipeline (jiqing-feng, May 15, 2024)
30aec8a  fix style (jiqing-feng, May 15, 2024)
37 changes: 37 additions & 0 deletions README.md
@@ -41,6 +41,43 @@ where `extras` can be one or more of `neural-compressor`, `openvino`, `nncf`.

# Quick tour

## IPEX
### pipeline
Hugging Face pipelines provide a simple yet powerful abstraction to quickly set up inference. If you already have a pipeline from transformers, you can unlock the performance benefits of Optimum-Intel by changing a single import:
```diff
import torch
- from transformers.pipelines import pipeline
+ from optimum.intel.pipelines import pipeline

pipe = pipeline("text-generation", "gpt2", torch_dtype=torch.bfloat16)
pipe("Describe a real-world application of AI in sustainable energy.")
```
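
Since the returned pipeline is a drop-in replacement for the transformers one, the usual call-time generation arguments still pass through. A minimal sketch (the argument values are illustrative, not tuned recommendations):
```python
import torch
from optimum.intel.pipelines import pipeline

pipe = pipeline("text-generation", "gpt2", torch_dtype=torch.bfloat16)
# Standard transformers generation kwargs are forwarded to generate().
outputs = pipe(
    "Describe a real-world application of AI in sustainable energy.",
    max_new_tokens=32,
    do_sample=False,
)
print(outputs[0]["generated_text"])
```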

### generate
If you want control over advanced features such as quantization or token selection strategies, we recommend using the `generate()` API directly. As with pipelines, switching over from existing transformers code requires only minimal changes.
```diff
import torch
from transformers import AutoTokenizer, AutoConfig
- from transformers import AutoModelForCausalLM
+ from optimum.intel.generation.modeling import TSModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
model = TSModelForCausalLM.from_pretrained(
"gpt2",
config=config,
torch_dtype=torch.bfloat16,
export=True,
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
input_sentence = ["Answer the following yes/no question by reasoning step-by-step please. Can you write a whole Haiku in a single tweet?"]
model_inputs = tokenizer(input_sentence, return_tensors="pt")
generation_kwargs = dict(max_new_tokens=32, do_sample=False, num_beams=4, num_beam_groups=1, no_repeat_ngram_size=2, use_cache=True)

generated_ids = model.generate(**model_inputs, **generation_kwargs)
output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(output)
```
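
To avoid re-exporting the model on every run, the exported model can be saved and loaded back. A brief sketch, assuming `TSModelForCausalLM` follows the usual optimum `save_pretrained`/`from_pretrained` convention (the directory name here is hypothetical):
```python
# Hypothetical local path; any writable directory works.
model.save_pretrained("./ts_gpt2")
model = TSModelForCausalLM.from_pretrained("./ts_gpt2")
```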

## Neural Compressor

Dynamic quantization can be used through the Optimum command-line interface:
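
The remainder of the diff is collapsed in this view. For reference, a representative invocation of the Optimum CLI would look like the following; the model name and output path are illustrative assumptions:
```bash
# Quantize a model with Intel Neural Compressor and write the result to a local directory.
optimum-cli inc quantize --model distilbert-base-cased-distilled-squad --output ./quantized_distilbert
```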