
Commit 12438c4

Support Transformers 4.43 (huggingface#856)
* install from pr
* updates
* fix
* update TRANSFORMERS_MAX_VERSION
* fix sdpa in training
* fix whisper
* fix
* whisper calibration checks
* fix OVTrainerTextClassificationTrainingTest's expected fake quantize
* fix OVCLIExportTestCase's expected_int4
* update min ci transformers version to 4.37
* fix OVQuantizerTest's expected fake quantize
* reorder_cache
* fix expected compressed matmuls
* fix test_exporters_cli_int4_with_local_model_and_default_config
* fix qwen custom modeling test
* fix failing ipex tests
* fix ipex
* fix the last ipex failing test_compare_with_and_without_past_key_values
* use minimal prepare_inputs_for_generation in OVModelForSpeechSeq2Seq
* keeping compatibility with transformers 4.36
* keep support of whisper using WhisperGenerationMixin.generate and dummy model fix
* trigger
* fix
* device property
* standardize .device and ._device attributes/properties
* fix
* fix
* revert
  Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
* use falcon
* torch.device property always cpu
* style
* resolve conflicts
* decoder_attention_mask for older versions
* optimum main
* limit inc transformers version
* fix pipeline missing dtype
* add dtype for seq to seq models
* pass phi beam search test and skip internlm2
* fix for internlm2

---------

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
1 parent 4cf898b commit 12438c4

18 files changed: +292 −561 lines

.github/workflows/test_ipex.yml

+18 −18

@@ -22,27 +22,27 @@ jobs:
       fail-fast: false
       matrix:
         python-version: [3.9]
-        transformers-version: [4.39.0, 4.42.3]
-        ipex-version: [2.2.0, 2.3.*]
+        transformers-version: ["4.39.0", "4.43.*"]
+        ipex-version: ["2.2.0", "2.3.*"]
         include:
           - python-version: 3.8
             transformers-version: 4.39.0
             ipex-version: 2.2.0
 
     steps:
-    - uses: actions/checkout@v2
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
-      with:
-        python-version: ${{ matrix.python-version }}
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install torch==${{ matrix.ipex-version }} --extra-index-url https://download.pytorch.org/whl/cpu
-        pip install intel_extension_for_pytorch==${{ matrix.ipex-version }}
-        pip install Pillow parameterized
-        pip install transformers[testing]==${{ matrix.transformers-version }}
-        pip install .[ipex]
-    - name: Test with Pytest
-      run: |
-        pytest tests/ipex/
+      - uses: actions/checkout@v2
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install torch==${{ matrix.ipex-version }} --extra-index-url https://download.pytorch.org/whl/cpu
+          pip install intel_extension_for_pytorch==${{ matrix.ipex-version }}
+          pip install Pillow parameterized
+          pip install transformers[testing]==${{ matrix.transformers-version }}
+          pip install .[ipex]
+      - name: Test with Pytest
+        run: |
+          pytest tests/ipex/

.github/workflows/test_openvino.yml

+29 −28

@@ -21,36 +21,37 @@ jobs:
       fail-fast: false
       matrix:
         python-version: ["3.8", "3.12"]
-        transformers-version: ["4.36.0", "4.42.*"]
+        transformers-version: ["4.36.0", "4.43.*"]
         os: [ubuntu-latest]
 
     runs-on: ${{ matrix.os }}
     steps:
-    - uses: actions/checkout@v4
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v5
-      with:
-        python-version: ${{ matrix.python-version }}
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        # install PyTorch CPU version to avoid installing CUDA packages on GitHub runner without GPU
-        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
-        pip install transformers==${{ matrix.transformers-version }}
-        pip install .[openvino,openvino-tokenizers,tests,diffusers] onnxruntime
-    - name: Test with Pytest
-      env:
-        HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
-      run: |
-        pytest tests/openvino/ --ignore tests/openvino/test_modeling_basic.py --durations=0
-    - name: Test basic
-      run: |
-        pip uninstall -y nncf
-        pytest tests/openvino/test_modeling_basic.py
-    - name: Test openvino-nightly
-      run: |
-        pip uninstall -y openvino
-        pip install openvino-nightly
-        python -c "from optimum.intel import OVModelForCausalLM; OVModelForCausalLM.from_pretrained('hf-internal-testing/tiny-random-gpt2', export=True, compile=False)"
-        optimum-cli export openvino -m hf-internal-testing/tiny-random-gpt2 gpt2-ov
+      - uses: actions/checkout@v4
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
 
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          # install PyTorch CPU version to avoid installing CUDA packages on GitHub runner without GPU
+          pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+          pip install .[openvino,openvino-tokenizers,tests,diffusers] onnxruntime
+          pip install transformers==${{ matrix.transformers-version }}
+
+      - name: Test with Pytest
+        env:
+          HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
+        run: |
+          pytest tests/openvino/ --ignore tests/openvino/test_modeling_basic.py --durations=0
+      - name: Test basic
+        run: |
+          pip uninstall -y nncf
+          pytest tests/openvino/test_modeling_basic.py
+      - name: Test openvino-nightly
+        run: |
+          pip uninstall -y openvino
+          pip install openvino-nightly
+          python -c "from optimum.intel import OVModelForCausalLM; OVModelForCausalLM.from_pretrained('hf-internal-testing/tiny-random-gpt2', export=True, compile=False)"
+          optimum-cli export openvino -m hf-internal-testing/tiny-random-gpt2 gpt2-ov

.github/workflows/test_openvino_basic.yml

+33 −32

@@ -3,7 +3,7 @@ name: OpenVINO - Basic Test
 on:
   workflow_dispatch:
   schedule:
-    - cron: '41 1 * * *' # run every day at 1:41
+    - cron: "41 1 * * *" # run every day at 1:41
   push:
     branches:
       - v*-release
@@ -24,40 +24,41 @@ jobs:
        # This also ensures that the test fails if dependencies break for Python 3.7
        python-version: ["3.8", "3.12"]
        os: ["ubuntu-22.04", "windows-latest"]
-       transformers-version: ["4.42.*"]
+       transformers-version: ["4.43.*"]
        include:
-         - transformers-version: "4.36.0"
-           python-version: "3.12"
+         - python-version: "3.12"
            os: "ubuntu-22.04"
+           transformers-version: "4.36.0"
 
     runs-on: ${{ matrix.os }}
 
     steps:
-    - uses: actions/checkout@v4
-    - name: Setup Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v5
-      with:
-        python-version: ${{ matrix.python-version }}
-
-    - name: Install dependencies
-      run: |
-        # Install openvino manually to prevent dependency conflicts when .[openvino] pins
-        # optimum or transformers to a specific version
-        # Install PyTorch CPU to prevent unnecessary downloading/installing of CUDA packages
-        pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
-        pip install transformers==${{ matrix.transformers-version }}
-        pip install .[tests] openvino
-
-    - name: Pip freeze
-      run: pip freeze
-
-    - name: Test with Pytest
-      run: |
-        pytest tests/openvino/test_modeling_basic.py
-
-    - name: Slow tests
-      run: |
-        pip install nncf
-        pytest tests/openvino -s -m "run_slow" --durations=0
-      env:
-        RUN_SLOW: 1
+      - uses: actions/checkout@v4
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          # Install PyTorch CPU to prevent unnecessary downloading/installing of CUDA packages
+          pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+          # Install openvino manually to prevent dependency conflicts when .[openvino] pins
+          # optimum or transformers to a specific version
+          pip install .[tests] openvino
+          pip install transformers==${{ matrix.transformers-version }}
+
+      - name: Pip freeze
+        run: pip freeze
+
+      - name: Test with Pytest
+        run: |
+          pytest tests/openvino/test_modeling_basic.py
+
+      - name: Slow tests
+        run: |
+          pip install nncf
+          pytest tests/openvino -s -m "run_slow" --durations=0
+        env:
+          RUN_SLOW: 1

optimum/exporters/ipex/model_patcher.py

+1 −1

@@ -34,7 +34,7 @@
 
 # Please also update in the setup.py and .github/workflows/test_ipex.yml if you change the transformers version
 _TRANSFORMERS_MIN_VERSION = "4.39.0"
-_TRANSFORMERS_MAX_VERSION = "4.42.3"
+_TRANSFORMERS_MAX_VERSION = "4.43.99"
 
 _IPEX_EXPORTED_GENERATION_TASKS = ("text-generation",)
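
For context, optimum-intel raises when the installed transformers version falls outside this window. A minimal sketch of such a guard, assuming only that `is_transformers_version` from optimum.intel.utils.import_utils is available (the wrapper function itself is illustrative, not the patcher's actual call site):

# Illustrative guard for the supported transformers window.
from optimum.intel.utils.import_utils import is_transformers_version

_TRANSFORMERS_MIN_VERSION = "4.39.0"
_TRANSFORMERS_MAX_VERSION = "4.43.99"


def check_transformers_window():
    # Reject anything below the minimum or above the maximum supported release.
    if is_transformers_version("<", _TRANSFORMERS_MIN_VERSION) or is_transformers_version(
        ">", _TRANSFORMERS_MAX_VERSION
    ):
        raise ImportError(
            f"Only transformers versions {_TRANSFORMERS_MIN_VERSION} to "
            f"{_TRANSFORMERS_MAX_VERSION} are supported for IPEX model patching."
        )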

optimum/intel/ipex/modeling_base.py

+4 −2

@@ -474,9 +474,11 @@ def __init__(
             self._reorder_cache = _ipex_reorder_cache
         else:
             # Check if _reorder_cache is a static method
-            if isinstance(self.model_cls.__dict__["_reorder_cache"], staticmethod):
+            if "_reorder_cache" in self.model_cls.__dict__ and isinstance(
+                self.model_cls.__dict__["_reorder_cache"], staticmethod
+            ):
                 self._reorder_cache = self.model_cls._reorder_cache
-            else:
+            elif "_reorder_cache" in self.model_cls.__dict__:
                 self._reorder_cache = self.model_cls._reorder_cache.__get__(self)
 
         if is_transformers_version(">=", "4.38.0") and model_type in {"llama", "phi", "persimmon", "mistral"}:
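
The membership guard matters because newer transformers releases drop `_reorder_cache` from some model classes entirely, so indexing `model_cls.__dict__` unconditionally can raise a KeyError. The `__get__` branch handles models that define it as a regular method rather than a staticmethod. A standalone illustration of the two cases (toy class names, mirroring the branch logic above):

# A staticmethod pulled from a class is callable as-is; a plain function
# stored on the class must be bound to an instance via __get__ first.
class StaticModel:
    @staticmethod
    def _reorder_cache(past_key_values, beam_idx):
        return past_key_values


class InstanceModel:
    def _reorder_cache(self, past_key_values, beam_idx):
        return past_key_values


assert isinstance(StaticModel.__dict__["_reorder_cache"], staticmethod)
static_fn = StaticModel._reorder_cache  # already callable without a self

assert not isinstance(InstanceModel.__dict__["_reorder_cache"], staticmethod)
instance = InstanceModel()
bound_fn = InstanceModel.__dict__["_reorder_cache"].__get__(instance)  # bind self
assert bound_fn("past", 0) == "past"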

optimum/intel/openvino/modeling.py

+0 −1

@@ -129,7 +129,6 @@ def __init__(self, model: openvino.runtime.Model, config: transformers.Pretraine
         # Avoid warnings when creating a transformers pipeline
         AutoConfig.register(self.base_model_prefix, AutoConfig)
         self.auto_model_class.register(AutoConfig, self.__class__)
-        self.device = torch.device("cpu")
 
     def to(self, device: str):
         """

optimum/intel/openvino/modeling_base.py

+34 −1

@@ -20,6 +20,7 @@
 from typing import Dict, Optional, Union
 
 import openvino
+import torch
 from huggingface_hub import hf_hub_download
 from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE
 from openvino import Core, convert_model
@@ -34,7 +35,7 @@
 from ...exporters.openvino import export, main_export
 from ..utils.import_utils import is_nncf_available
 from .configuration import OVConfig, OVDynamicQuantizationConfig, OVWeightQuantizationConfig
-from .utils import ONNX_WEIGHTS_NAME, OV_XML_FILE_NAME, _print_compiled_model_properties
+from .utils import ONNX_WEIGHTS_NAME, OV_TO_PT_TYPE, OV_XML_FILE_NAME, _print_compiled_model_properties
 
 
 core = Core()
@@ -77,16 +78,27 @@ def __init__(
             model = self._reshape(model, -1, -1, height, width)
 
         input_names = {}
+        input_dtypes = {}
         for idx, key in enumerate(model.inputs):
             names = tuple(key.get_names())
             input_names[next((name for name in names if "/" not in name), names[0])] = idx
+            input_dtypes[
+                next((name for name in names if "/" not in name), names[0])
+            ] = key.get_element_type().get_type_name()
         self.input_names = input_names
+        self.input_dtypes = input_dtypes
 
         output_names = {}
+        output_dtypes = {}
         for idx, key in enumerate(model.outputs):
             names = tuple(key.get_names())
             output_names[next((name for name in names if "/" not in name), names[0])] = idx
+            output_dtypes[
+                next((name for name in names if "/" not in name), names[0])
+            ] = key.get_element_type().get_type_name()
+
         self.output_names = output_names
+        self.output_dtypes = output_dtypes
 
         self.model = model
         self.request = None
@@ -103,6 +115,27 @@ def __init__(
         if enable_compilation:
             self.compile()
 
+    @property
+    def device(self) -> torch.device:
+        """
+        `torch.device`: The device on which the module is (for torch compatibility).
+        """
+        return torch.device("cpu")
+
+    @property
+    def dtype(self) -> Optional[torch.dtype]:
+        for dtype in self.input_dtypes.values():
+            torch_dtype = OV_TO_PT_TYPE.get(dtype)
+            if torch_dtype.is_floating_point:
+                return torch_dtype
+
+        for dtype in self.output_dtypes.values():
+            torch_dtype = OV_TO_PT_TYPE.get(dtype)
+            if torch_dtype.is_floating_point:
+                return torch_dtype
+
+        return None
+
     @staticmethod
     def load_model(
         file_name: Union[str, Path],
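
A minimal usage sketch of the two new properties (the checkpoint name is illustrative): `device` always reports CPU because inference runs through OpenVINO rather than torch, and `dtype` surfaces the first floating-point element type found among the model's inputs or outputs, which lets transformers pipelines infer a dtype without probing torch parameters.

from optimum.intel import OVModelForSequenceClassification

model = OVModelForSequenceClassification.from_pretrained(
    "hf-internal-testing/tiny-random-bert", export=True
)
print(model.device)  # torch.device("cpu") -- kept for torch compatibility only
print(model.dtype)   # e.g. torch.float32, mapped from the "f32" element type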

optimum/intel/openvino/modeling_base_seq2seq.py

+2 −0

@@ -350,6 +350,8 @@ def _reshape(self, model: openvino.runtime.Model, batch_size: int, sequence_leng
             shapes[inputs][0] = batch_size if not is_decoder else -1
             if inputs.get_any_name().startswith("past_key_values"):
                 shapes[inputs][2] = -1
+            elif inputs.get_any_name().startswith("cache_position"):
+                shapes[inputs][0] = sequence_length
             elif is_decoder and not inputs.get_any_name().startswith("encoder"):
                 shapes[inputs][1] = -1
             else:
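
The new branch covers the cache_position input that recent transformers releases feed to decoders: a 1-D tensor holding the absolute positions of the tokens currently being processed, so its only dimension tracks the sequence length. An illustrative reminder of its shape semantics:

import torch

# For 3 new tokens arriving after 5 already-cached ones:
past_length, num_new_tokens = 5, 3
cache_position = torch.arange(past_length, past_length + num_new_tokens)
print(cache_position)        # tensor([5, 6, 7])
print(cache_position.shape)  # torch.Size([3]) -> dimension 0 is the sequence length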

optimum/intel/openvino/modeling_diffusion.py

+12 −11

@@ -25,6 +25,7 @@
 import numpy as np
 import openvino
 import PIL
+import torch
 from diffusers import (
     DDIMScheduler,
     LMSDiscreteScheduler,
@@ -422,10 +423,6 @@ def to(self, device: str):
 
         return self
 
-    @property
-    def device(self) -> str:
-        return self._device.lower()
-
     @property
     def height(self) -> int:
         height = self.unet.model.inputs[0].get_partial_shape()[2]
@@ -631,21 +628,25 @@ def _compile(self):
         if (
             "CACHE_DIR" not in self.ov_config.keys()
             and not str(self._model_dir).startswith(gettempdir())
-            and "gpu" in self.device.lower()
+            and "GPU" in self._device
         ):
             self.ov_config["CACHE_DIR"] = os.path.join(self._model_dir, self._model_name, "model_cache")
 
-        logger.info(f"Compiling the {self._model_name} to {self.device} ...")
-        self.request = core.compile_model(self.model, self.device, self.ov_config)
+        logger.info(f"Compiling the {self._model_name} to {self._device} ...")
+        self.request = core.compile_model(self.model, self._device, self.ov_config)
         # OPENVINO_LOG_LEVEL can be found in https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_AUTO_debugging.html
         if "OPENVINO_LOG_LEVEL" in os.environ and int(os.environ["OPENVINO_LOG_LEVEL"]) > 2:
-            logger.info(f"{self.device} SUPPORTED_PROPERTIES:")
+            logger.info(f"{self._device} SUPPORTED_PROPERTIES:")
             _print_compiled_model_properties(self.request)
 
     @property
-    def device(self):
+    def _device(self) -> str:
         return self.parent_model._device
 
+    @property
+    def device(self) -> torch.device:
+        return self.parent_model.device
+
 
 class OVModelTextEncoder(OVModelPart):
     def __init__(
@@ -717,7 +718,7 @@ def __call__(self, latent_sample: np.ndarray):
         return list(outputs.values())
 
     def _compile(self):
-        if "GPU" in self.device:
+        if "GPU" in self._device:
             self.ov_config.update({"INFERENCE_PRECISION_HINT": "f32"})
         super()._compile()
@@ -738,7 +739,7 @@ def __call__(self, sample: np.ndarray):
         return list(outputs.values())
 
     def _compile(self):
-        if "GPU" in self.device:
+        if "GPU" in self._device:
             self.ov_config.update({"INFERENCE_PRECISION_HINT": "f32"})
         super()._compile()
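
After this change the diffusion pipeline parts keep two distinct notions of device: `_device` is the OpenVINO plugin string used for compilation, while `device` is a torch.device kept for diffusers/transformers compatibility. A usage sketch (the checkpoint name is illustrative, and a GPU plugin is assumed available):

from optimum.intel import OVStableDiffusionPipeline

pipe = OVStableDiffusionPipeline.from_pretrained(
    "hf-internal-testing/tiny-stable-diffusion-torch", export=True, compile=False
)
pipe.to("GPU")            # sets the OpenVINO device string
print(pipe._device)       # "GPU" -- what core.compile_model() receives
print(pipe.unet._device)  # mirrored from the parent pipeline
print(pipe.unet.device)   # torch.device("cpu"), for torch-style callers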
