optimum/exporters/openvino/convert.py (+8 -20)
@@ -120,11 +120,8 @@ def export(
  device (`str`, *optional*, defaults to `cpu`):
  The device on which the model will be exported. Either `cpu` or `cuda`. Only PyTorch is supported for
  export on CUDA devices.
- compression_option (`Optional[str]`, defaults to `None`):
- The weight compression option, e.g. `f16` stands for float16 weights, `i8` - INT8 weights, `int4_sym_g128` - INT4 symmetric weights w/ group size 128, `int4_asym_g128` - as previous but asymmetric w/ zero-point,
- `int4_sym_g64` - INT4 symmetric weights w/ group size 64, "int4_asym_g64" - as previous but asymmetric w/ zero-point.
- compression_ratio (`Optional[float]`, defaults to `None`):
- Compression ratio between primary and backup precision (only relevant to INT4).
+ ov_config (`OVConfig`, *optional*):
+ The configuration containing the parameters related to quantization.
  input_shapes (`Optional[Dict]`, defaults to `None`):
  If specified, allows to use specific shapes for the example input provided to the exporter.
  stateful (`bool`, defaults to `True`):
@@ -233,11 +230,8 @@ def export_pytorch_via_onnx(
  If specified, allows to use specific shapes for the example input provided to the exporter.
  model_kwargs (optional[Dict[str, Any]], defaults to `None`):
  Additional kwargs for model export.
- compression_option (`Optional[str]`, defaults to `None`):
- The weight compression option, e.g. `f16` stands for float16 weights, `i8` - INT8 weights, `int4_sym_g128` - INT4 symmetric weights w/ group size 128, `int4_asym_g128` - as previous but asymmetric w/ zero-point,
- `int4_sym_g64` - INT4 symmetric weights w/ group size 64, "int4_asym_g64" - as previous but asymmetric w/ zero-point.
- compression_ratio (`Optional[float]`, defaults to `None`):
- Compression ratio between primary and backup precision (only relevant to INT4).
+ ov_config (`OVConfig`, *optional*):
+ The configuration containing the parameters related to quantization.

  Returns:
  `Tuple[List[str], List[str], bool]`: A tuple with an ordered list of the model's inputs, and the named inputs from
@@ -290,11 +284,8 @@ def export_pytorch(
  If specified, allows to use specific shapes for the example input provided to the exporter.
  model_kwargs (optional[Dict[str, Any]], defaults to `None`):
  Additional kwargs for model export
- compression_option (`Optional[str]`, defaults to `None`):
- The weight compression option, e.g. `f16` stands for float16 weights, `i8` - INT8 weights, `int4_sym_g128` - INT4 symmetric weights w/ group size 128, `int4_asym_g128` - as previous but asymmetric w/ zero-point,
- `int4_sym_g64` - INT4 symmetric weights w/ group size 64, "int4_asym_g64" - as previous but asymmetric w/ zero-point.
- compression_ratio (`Optional[float]`, defaults to `None`):
- Compression ratio between primary and backup precision (only relevant to INT4).
+ ov_config (`OVConfig`, *optional*):
+ The configuration containing the parameters related to quantization.
  stateful (`bool`, defaults to `False`):
  Produce stateful model where all kv-cache inputs and outputs are hidden in the model and are not exposed as model inputs and outputs. Applicable only for decoder models.

@@ -452,11 +443,8 @@ def export_models(
  export on CUDA devices.
  input_shapes (Optional[Dict], optional, Defaults to None):
  If specified, allows to use specific shapes for the example input provided to the exporter.
- compression_option (`Optional[str]`, defaults to `None`):
- The weight compression option, e.g. `f16` stands for float16 weights, `i8` - INT8 weights, `int4_sym_g128` - INT4 symmetric weights w/ group size 128, `int4_asym_g128` - as previous but asymmetric w/ zero-point,
- `int4_sym_g64` - INT4 symmetric weights w/ group size 64, "int4_asym_g64" - as previous but asymmetric w/ zero-point.
- compression_ratio (`Optional[int]`, defaults to `None`):
- Compression ratio between primary and backup precision (only relevant to INT4).
+ ov_config (`OVConfig`, *optional*):
+ The configuration containing the parameters related to quantization.
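
Note on the removed option strings: the deleted `compression_option` docstring packs precision, symmetry, and group size into a single token such as `int4_sym_g128`. The sketch below shows how those tokens could be decoded into explicit fields, using only the meanings stated in the removed docstring. The helper name `parse_compression_option` and the dictionary keys are hypothetical and chosen for illustration; they are not part of this PR, of `OVConfig`, or of any library API.

```python
import re
from typing import Any, Dict

# Matches the INT4 tokens listed in the removed docstring,
# e.g. "int4_sym_g128" or "int4_asym_g64".
_INT4_PATTERN = re.compile(r"^int4_(sym|asym)_g(\d+)$")


def parse_compression_option(option: str) -> Dict[str, Any]:
    """Hypothetical helper: decode a legacy option string into explicit fields.

    The keys ("dtype", "sym", "group_size") are illustrative names only.
    """
    if option == "f16":
        return {"dtype": "float16"}
    if option == "i8":
        return {"dtype": "int8"}

    match = _INT4_PATTERN.match(option)
    if match is None:
        raise ValueError(f"Unrecognized compression option: {option!r}")

    return {
        "dtype": "int4",
        # "asym" means asymmetric weights with a zero-point.
        "sym": match.group(1) == "sym",
        "group_size": int(match.group(2)),
    }


if __name__ == "__main__":
    for token in ("f16", "i8", "int4_sym_g128", "int4_asym_g64"):
        print(token, "->", parse_compression_option(token))
```

Spelling the settings out as separate fields is one way to see what a structured configuration object can carry that a single string cannot, such as a group size outside the predefined tokens.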