
Commit 83dbe2f

remove

1 parent 52b11db
File tree

1 file changed (+0 −8 lines)

docs/source/inference.mdx

@@ -108,14 +108,6 @@ optimum-cli export openvino --model gpt2 --weight-format int8 ov_model
 This type of optimization allows to reduce the memory footprint and inference latency.
 
 
-| `--weight-format` |
-|-------------------|
-| `fp32` |
-| `fp16` |
-| `int8` |
-| `int4` |
-
-
 By default the quantization scheme will be [assymmetric](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md#asymmetric-quantization), to make it [symmetric](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md#symmetric-quantization) you can add `--sym`.
 
 For INT4 quantization you can also specify the following arguments :
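For reference, here is a minimal sketch of how the options discussed in this hunk combine on the command line, assuming the `optimum-cli` OpenVINO exporter shown in the hunk header is installed; the `gpt2` model name and `ov_model` output directory are carried over from the diff, and `--sym` is the flag described in the surviving paragraph:

```bash
# Export with 8-bit weight-only quantization (the command from the hunk header)
optimum-cli export openvino --model gpt2 --weight-format int8 ov_model

# Export with 4-bit weights using a symmetric quantization scheme
# (--sym switches from the default asymmetric scheme, per the docs text above)
optimum-cli export openvino --model gpt2 --weight-format int4 --sym ov_model
```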
