
Commit afc23d0

Update docs/source/inference.mdx

Co-authored-by: Helena Kloosterman <helena.kloosterman@intel.com>

Parent: 027c370

1 file changed: +1 -1

docs/source/inference.mdx
@@ -99,7 +99,7 @@ tokenizer.save_pretrained(save_directory)
 
 ### Weight-only quantization
 
-You can also apply fp16, 8-bit or 4-bit weight quantization on the linear and embedding layers when exporting your model with the CLI by setting `--weight-format` to respectively `fp16`, `int8` or `int4`:
+You can also apply fp16, 8-bit or 4-bit weight compression on the linear and embedding layers when exporting your model with the CLI by setting `--weight-format` to respectively `fp16`, `int8` or `int4`:
 
 ```bash
 optimum-cli export openvino --model gpt2 --weight-format int8 ov_model