* FP8 implementation
* Support for all datasets
* Added test
* Update test
* Correctness
* Correctness
* Update docs/source/openvino/export.mdx
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
* Change test model
* Apply comments
---------
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
@@ -166,7 +165,7 @@ Models larger than 1 billion parameters are exported to the OpenVINO format with
 </Tip>

-Besides weight-only quantization, you can also apply full model quantization including activations by setting `--quant-mode` to `int8`. This will quantize both weights and activations of Linear, Convolutional and some other layers to int8. Currently this is only supported for speech-to-text models. Please see example below.
+Besides weight-only quantization, you can also apply full model quantization including activations by setting `--quant-mode` to the preferred precision. This will quantize both weights and activations of Linear, Convolutional and some other layers to the selected mode. Please see the example below.
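
The example the paragraph refers to is not included in this excerpt. As a minimal sketch, a full-quantization export command could look like the following, assuming the `optimum-cli export openvino` entry point with `--dataset` and `--num-samples` options for calibration; the model ID, dataset name, and sample count are illustrative assumptions, not taken from this diff:

```bash
# Sketch: export a speech-to-text model with full int8 quantization of
# weights and activations via --quant-mode. The model ID, dataset, and
# sample count below are placeholders chosen for illustration.
optimum-cli export openvino \
  --model openai/whisper-tiny \
  --quant-mode int8 \
  --dataset librispeech \
  --num-samples 32 \
  ov_whisper_int8
```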