
Commit 3bb284c

Release data update (#1733)
Signed-off-by: chensuyue <suyue.chen@intel.com>
1 parent 4bdaf93 commit 3bb284c

File tree

2 files changed: +862 −1417


docs/source/llm_recipes.md (+235 −1)
@@ -10,7 +10,7 @@ This document aims to publish the specific recipes we achieved for the popular L
> - The quantization algorithms are provided by [Intel® Neural Compressor](https://github.com/intel/neural-compressor) and the evaluation functions by [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers).
> - The model list is continuously updated; expect to find more LLMs in the future.
-## IPEX key models
+## Large Language Models Recipes

| Models | SQ INT8 | WOQ INT8 | WOQ INT4 |
| :-----------------------------: | :-----: | :------: | :------: |
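
(For readers of the recipe table: "SQ" is SmoothQuant, a static INT8 post-training quantization recipe, and "WOQ" is weight-only quantization, where only the weights are converted to INT8 or INT4 while activations stay in floating point. Below is a minimal, hedged sketch of how such recipes are typically expressed with Intel® Neural Compressor's 2.x post-training API; the model choice, calibration data, and exact config fields are illustrative assumptions, not the settings behind the published numbers.)

```python
# Hedged sketch (not the published recipe): quantizing one of the models above
# with Intel® Neural Compressor's 2.x post-training API. The model choice,
# calibration data, and config field values are illustrative assumptions.
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer
from neural_compressor import PostTrainingQuantConfig, quantization

model_name = "EleutherAI/gpt-j-6b"  # any model from the recipe table
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy calibration set; real recipes calibrate on a representative corpus.
texts = ["Deep learning is", "Quantization reduces model size and"]
calib_dataloader = DataLoader(
    [tokenizer(t, return_tensors="pt")["input_ids"].squeeze(0) for t in texts],
    batch_size=1,
)

# "SQ INT8": static post-training quantization with the SmoothQuant recipe,
# which migrates activation outliers into the weights before quantizing.
sq_conf = PostTrainingQuantConfig(
    recipes={"smooth_quant": True, "smooth_quant_args": {"alpha": 0.5}},
)
sq_model = quantization.fit(model, sq_conf, calib_dataloader=calib_dataloader)

# "WOQ INT4 GPTQ": weight-only quantization; only weights go to 4-bit.
# group_size/scheme values here are common defaults, not the recipe's.
model = AutoModelForCausalLM.from_pretrained(model_name)  # fresh copy for WOQ
woq_conf = PostTrainingQuantConfig(
    approach="weight_only",
    op_type_dict={".*": {"weight": {"bits": 4, "group_size": 128,
                                    "scheme": "sym", "algorithm": "GPTQ"}}},
)
woq_model = quantization.fit(model, woq_conf, calib_dataloader=calib_dataloader)
```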
@@ -38,3 +38,237 @@ This document aims to publish the specific recipes we achieved for the popular L
>
> - This model list comes from [IPEX](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/llm.html).
> - The WIP recipes will be published soon.
+
+## Large Language Models Accuracy
+<table>
+  <thead>
+    <tr>
+      <th rowspan="3">Model</th>
+      <th colspan="9">lambada_openai</th>
+    </tr>
+    <tr>
+      <th>FP32</th>
+      <th colspan="2">SQ INT8</th>
+      <th colspan="2">WOQ INT8</th>
+      <th colspan="2">WOQ INT4 GPTQ</th>
+      <th colspan="2">WOQ INT4 AutoRound</th>
+    </tr>
+    <tr>
+      <th>ACC</th>
+      <th>ACC</th>
+      <th>Ratio</th>
+      <th>ACC</th>
+      <th>Ratio</th>
+      <th>ACC</th>
+      <th>Ratio</th>
+      <th>ACC</th>
+      <th>Ratio</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>baichuan-inc/Baichuan-13B-Chat</td>
+      <td>67.57%</td>
+      <td>68.23%</td>
+      <td>1.0098</td>
+      <td>67.57%</td>
+      <td>1.0000</td>
+      <td>67.84%</td>
+      <td>1.0040</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>baichuan-inc/Baichuan2-13B-Chat</td>
+      <td>71.51%</td>
+      <td>70.89%</td>
+      <td>0.9913</td>
+      <td>71.53%</td>
+      <td>1.0003</td>
+      <td>71.76%</td>
+      <td>1.0035</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>baichuan-inc/Baichuan2-7B-Chat</td>
+      <td>67.67%</td>
+      <td>67.96%</td>
+      <td>1.0043</td>
+      <td>67.59%</td>
+      <td>0.9988</td>
+      <td>67.24%</td>
+      <td>0.9936</td>
+      <td>67.42%</td>
+      <td>0.9963</td>
+    </tr>
+    <tr>
+      <td>bigscience/bloom-1b7</td>
+      <td>46.34%</td>
+      <td>47.99%</td>
+      <td>1.0356</td>
+      <td>46.38%</td>
+      <td>1.0009</td>
+      <td>46.19%</td>
+      <td>0.9968</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>databricks/dolly-v2-12b</td>
+      <td>64.35%</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>64.10%</td>
+      <td>0.9961</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>EleutherAI/gpt-j-6b</td>
+      <td>68.31%</td>
+      <td>68.33%</td>
+      <td>1.0003</td>
+      <td>68.23%</td>
+      <td>0.9988</td>
+      <td>68.79%</td>
+      <td>1.0070</td>
+      <td>68.43%</td>
+      <td>1.0018</td>
+    </tr>
+    <tr>
+      <td>EleutherAI/gpt-neox-20b</td>
+      <td>72.33%</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>72.25%</td>
+      <td>0.9989</td>
+      <td>71.96%</td>
+      <td>0.9949</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>facebook/opt-1.3b</td>
+      <td>57.89%</td>
+      <td>57.54%</td>
+      <td>0.9940</td>
+      <td>58.08%</td>
+      <td>1.0033</td>
+      <td>58.57%</td>
+      <td>1.0117</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+    <tr>
+      <td>facebook/opt-30b</td>
+      <td>71.49%</td>
+      <td>71.51%</td>
+      <td>1.0003</td>
+      <td>71.51%</td>
+      <td>1.0003</td>
+      <td>71.82%</td>
+      <td>1.0046</td>
+      <td>72.11%</td>
+      <td>1.0087</td>
+    </tr>
+    <tr>
+      <td>meta-llama/Llama-2-13b-hf</td>
+      <td>76.77%</td>
+      <td>76.25%</td>
+      <td>0.9932</td>
+      <td>76.75%</td>
+      <td>0.9997</td>
+      <td>77.43%</td>
+      <td>1.0086</td>
+      <td>76.75%</td>
+      <td>0.9997</td>
+    </tr>
+    <tr>
+      <td>meta-llama/Llama-2-70b-hf</td>
+      <td>79.64%</td>
+      <td>79.55%</td>
+      <td>0.9989</td>
+      <td>79.57%</td>
+      <td>0.9991</td>
+      <td>80.09%</td>
+      <td>1.0057</td>
+      <td>79.97%</td>
+      <td>1.0041</td>
+    </tr>
+    <tr>
+      <td>meta-llama/Llama-2-7b-hf</td>
+      <td>73.92%</td>
+      <td>73.45%</td>
+      <td>0.9936</td>
+      <td>73.96%</td>
+      <td>1.0005</td>
+      <td>73.45%</td>
+      <td>0.9936</td>
+      <td>73.49%</td>
+      <td>0.9942</td>
+    </tr>
+    <tr>
+      <td>mistralai/Mistral-7B-v0.1</td>
+      <td>75.90%</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>75.80%</td>
+      <td>0.9987</td>
+      <td>76.13%</td>
+      <td>1.0030</td>
+      <td>75.61%</td>
+      <td>0.9962</td>
+    </tr>
+    <tr>
+      <td>THUDM/chatglm2-6b</td>
+      <td>53.23%</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>53.19%</td>
+      <td>0.9992</td>
+      <td>52.77%</td>
+      <td>0.9914</td>
+      <td>53.35%</td>
+      <td>1.0023</td>
+    </tr>
+    <tr>
+      <td>THUDM/chatglm3-6b</td>
+      <td>59.09%</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>59.01%</td>
+      <td>0.9986</td>
+      <td>NA</td>
+      <td>NA</td>
+      <td>58.61%</td>
+      <td>0.9919</td>
+    </tr>
+    <tr>
+      <td>tiiuae/falcon-40b</td>
+      <td>77.22%</td>
+      <td>77.04%</td>
+      <td>0.9977</td>
+      <td>77.22%</td>
+      <td>1.0000</td>
+      <td>77.94%</td>
+      <td>1.0093</td>
+      <td>78.79%</td>
+      <td>1.0203</td>
+    </tr>
+    <tr>
+      <td>tiiuae/falcon-7b</td>
+      <td>74.67%</td>
+      <td>76.44%</td>
+      <td>1.0237</td>
+      <td>74.77%</td>
+      <td>1.0013</td>
+      <td>75.00%</td>
+      <td>1.0044</td>
+      <td>NA</td>
+      <td>NA</td>
+    </tr>
+  </tbody>
+</table>
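
(Reading note: each "Ratio" column above is the quantized accuracy divided by the FP32 accuracy of the same model, so a value of 1.0000 or higher indicates no measured accuracy loss. A quick check against the first row, using values straight from the table:)

```python
# Ratio = quantized ACC / FP32 ACC, checked for baichuan-inc/Baichuan-13B-Chat.
fp32_acc = 0.6757     # FP32 ACC column
sq_int8_acc = 0.6823  # SQ INT8 ACC column
print(round(sq_int8_acc / fp32_acc, 4))  # -> 1.0098, matching the SQ INT8 Ratio
```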

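(The ACC numbers are lambada_openai accuracies; per the note at the top of the document, the published values come from Intel® Extension for Transformers' evaluation functions. As a generic illustration of measuring the same metric, here is a hedged sketch using EleutherAI's lm-evaluation-harness; the backend name, model choice, and result key are assumptions about that third-party tool, not the exact pipeline used here.)

```python
# Hedged sketch: measuring lambada_openai accuracy with lm-evaluation-harness
# (https://github.com/EleutherAI/lm-evaluation-harness), assuming a 0.4.x
# release. This is a generic illustration, not the pipeline behind the table.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf",                                   # Hugging Face causal-LM backend
    model_args="pretrained=EleutherAI/gpt-j-6b",  # any model from the table
    tasks=["lambada_openai"],
)
# Result key names vary by harness version ("acc,none" in 0.4.x, "acc" earlier).
print(results["results"]["lambada_openai"]["acc,none"])
```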