
Commit 697cd06

fix
1 parent 959b113 commit 697cd06

File tree: 2 files changed (+9 −4 lines)


docs/source/inference.mdx (+4 −2)
@@ -122,9 +122,11 @@ from optimum.intel import OVModelForCausalLM
 model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
 ```

-> [!NOTE]
-> `load_in_8bit` is enabled by default for the models larger than 1 billion parameters.
+<Tip warning={true}>

+`load_in_8bit` is enabled by default for models larger than 1 billion parameters.
+
+</Tip>

 To apply quantization on both weights and activations, you can use the `OVQuantizer`; more information is available in the [documentation](https://huggingface.co/docs/optimum/main/en/intel/optimization_ov#optimization).
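The `OVQuantizer` flow referenced in the trailing context above quantizes both weights and activations, which requires a calibration dataset. A minimal sketch of that flow, assuming the `OVQuantizer` API as described in the linked Optimum Intel documentation (`from_pretrained`, `get_calibration_dataset`, `quantize`); the model id, dataset, and sample count are illustrative:

```python
from functools import partial

from transformers import AutoTokenizer
from optimum.intel import OVModelForSequenceClassification, OVQuantizer

# Illustrative model; any model exportable to OpenVINO follows the same flow
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def preprocess_fn(examples, tokenizer):
    # Tokenize calibration samples the same way as at inference time
    return tokenizer(examples["sentence"], padding=True, truncation=True, max_length=128)

quantizer = OVQuantizer.from_pretrained(model)
calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
    num_samples=300,
    dataset_split="train",
)
# Quantizes weights and activations, then saves the OpenVINO model to disk
quantizer.quantize(calibration_dataset=calibration_dataset, save_directory="ov_model_int8")
```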

docs/source/optimization_ov.mdx (+5 −2)
@@ -69,8 +69,11 @@ from optimum.intel import OVModelForCausalLM
 model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
 ```

-> [!NOTE]
-> `load_in_8bit` is enabled by default for the models larger than 1 billion parameters.
+<Tip warning={true}>
+
+`load_in_8bit` is enabled by default for models larger than 1 billion parameters.
+
+</Tip>

 For 4-bit weight quantization, you can use the `quantization_config` to specify the optimization parameters, for example:
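For the 4-bit path mentioned in the trailing context, a minimal sketch, assuming the `OVWeightQuantizationConfig` class from `optimum.intel` with the `bits`, `sym`, `group_size`, and `ratio` parameters described in the Optimum Intel documentation; the model id and parameter values are illustrative:

```python
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

model_id = "HuggingFaceH4/zephyr-7b-beta"  # illustrative model id

# 4-bit weight-only quantization: asymmetric scheme, group-wise scales over
# groups of 128 weights, and 80% of the layers quantized to 4-bit (the
# remainder kept at 8-bit)
quantization_config = OVWeightQuantizationConfig(bits=4, sym=False, group_size=128, ratio=0.8)
model = OVModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config)
```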
