
Commit afc23d0

Update docs/source/inference.mdx

Co-authored-by: Helena Kloosterman <helena.kloosterman@intel.com>

Parent: 027c370

1 file changed: +1 -1

docs/source/inference.mdx
@@ -99,7 +99,7 @@ tokenizer.save_pretrained(save_directory)
 
 ### Weight-only quantization
 
-You can also apply fp16, 8-bit or 4-bit weight quantization on the linear and embedding layers when exporting your model with the CLI by setting `--weight-format` to respectively `fp16`, `int8` or `int4`:
+You can also apply fp16, 8-bit or 4-bit weight compression on the linear and embedding layers when exporting your model with the CLI by setting `--weight-format` to respectively `fp16`, `int8` or `int4`:
 
 ```bash
 optimum-cli export openvino --model gpt2 --weight-format int8 ov_model