Commit f843794

[Eval] Add eval test with opencompass.
1 parent 819eccc commit f843794

1 file changed: +86 -0 lines changed

examples/opencompass/README.md

@@ -0,0 +1,86 @@
# xFT Accuracy Evaluation with OpenCompass
OpenCompass is an LLM evaluation platform that supports a wide range of models (InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) across 100+ datasets. For more details, see [https://opencompass.org.cn/](https://opencompass.org.cn/).

## Installation
The steps below cover quick installation and dataset preparation.

### Environment Setup
``` bash
# Setup steps follow the OpenCompass documentation: https://opencompass.org.cn/doc
$ conda create -n opencompass python=3.10 pytorch torchvision torchaudio cpuonly -c pytorch -y
$ conda activate opencompass

$ git clone -b intel/xft https://github.com/marvin-Yu/opencompass.git && cd opencompass
$ pip install -e .
```

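Before moving on, it can help to confirm that the packages installed above are importable from the new environment. The snippet below is only a minimal sketch and not part of OpenCompass: `check_module` is a hypothetical helper name, and it assumes that `python` resolves to the interpreter of the activated `opencompass` environment and that the editable install is importable as `opencompass`.

```bash
#!/usr/bin/env bash
# check_module NAME: succeed if the active Python can import NAME.
# Hypothetical helper; assumes `python` is the interpreter of the
# activated conda environment.
check_module() {
    python -c "import $1" >/dev/null 2>&1
}

# Report on the two packages the setup steps are expected to provide.
for module in torch opencompass; do
    if check_module "$module"; then
        echo "ok: $module"
    else
        echo "missing: $module"
    fi
done
```

If either line reports `missing`, re-check that the conda environment is activated and that `pip install -e .` was run inside the cloned repository.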
### Data Preparation
``` bash
# download the core dataset
$ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip

$ unzip OpenCompassData-core-20240207.zip

# Optional: download the full dataset instead (uncomment the lines below)
# $ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-complete-20240207.zip
# $ unzip OpenCompassData-complete-20240207.zip
# $ cd ./data
# $ find . -name "*.zip" -exec unzip "{}" \;
```

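Large downloads occasionally arrive truncated, so one option is to test the archive before extracting it. The following is a sketch under stated assumptions, not a required step: `extract_if_valid` is a hypothetical helper name, and it relies only on the standard `unzip -t` integrity test.

```bash
#!/usr/bin/env bash
# extract_if_valid ARCHIVE [DEST]: test ARCHIVE with `unzip -t` and extract
# it into DEST (default: current directory) only if the test passes.
# Hypothetical helper; not part of OpenCompass.
extract_if_valid() {
    local archive="$1" dest="${2:-.}"
    if ! unzip -tq "$archive" >/dev/null 2>&1; then
        echo "archive failed integrity test: $archive" >&2
        return 1
    fi
    # -o overwrites existing files so the step can be re-run safely
    unzip -q -o "$archive" -d "$dest"
}
```

For the core dataset this would be called as `extract_if_valid OpenCompassData-core-20240207.zip`.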
### Model Preparation
Download the model weights to the `/data/models` directory. The path is not configurable for now; to use a different location, modify the configuration file or create symbolic links. For example, to test the model `chatglm2_6b`:
```bash
/data/models/
├── chatglm2-6b-hf
├── chatglm2-6b-hf-xft
├── ...
```
To export xFT models, refer to [xFT Models Preparation](https://github.com/intel/xFasterTransformer?tab=readme-ov-file#models-preparation).

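If the weights already live somewhere else, the symbolic-link route mentioned above could look like the sketch below. `link_models` is a hypothetical helper name, and the source directory in the usage line is only an example; the helper links every model directory under a source directory into the target directory.

```bash
#!/usr/bin/env bash
# link_models SRC DST: symlink every directory under SRC into DST.
# Hypothetical helper; not part of OpenCompass or xFT.
link_models() {
    local src="$1" dst="$2" dir
    mkdir -p "$dst"
    for dir in "$src"/*/; do
        # Skip the unexpanded pattern when SRC has no subdirectories
        [ -d "$dir" ] || continue
        # -n replaces an existing link instead of creating one inside it
        ln -sfn "$(cd "$dir" && pwd)" "$dst/$(basename "$dir")"
    done
}
```

For example, `link_models "$HOME/models" /data/models` would expose `$HOME/models/chatglm2-6b-hf` as `/data/models/chatglm2-6b-hf`. Writing to `/data` may require elevated permissions on your system.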
### xFT Evaluation
``` bash
# list all models supported by xFT
$ python tools/list_configs.py xft
# +------------------------+----------------------------------------------+
# | Model                  | Config Path                                  |
# |------------------------+----------------------------------------------|
# | xft_llama2_13b_chat    | configs/models/xft/xft_llama2_13b_chat.py    |
# | xft_llama2_70b_chat    | configs/models/xft/xft_llama2_70b_chat.py    |
# | xft_llama2_7b_chat     | configs/models/xft/xft_llama2_7b_chat.py     |
# | xft_chatglm2_6b        | configs/models/xft/xft_chatglm2_6b.py        |
# | xft_chatglm3_6b        | configs/models/xft/xft_chatglm3_6b.py        |
# | xft_chatglm_6b         | configs/models/xft/xft_chatglm_6b.py         |
# | xft_gemma_2b_it        | configs/models/xft/xft_gemma_2b_it.py        |
# | xft_gemma_7b_it        | configs/models/xft/xft_gemma_7b_it.py        |
# | ...............        | .....................................        |
# +------------------------+----------------------------------------------+

# list the configs for the dataset you want
$ python tools/list_configs.py ceval
# +--------------------------------+------------------------------------------------------------------+
# | Dataset                        | Config Path                                                      |
# |--------------------------------+------------------------------------------------------------------|
# | ceval_gen                      | configs/datasets/ceval/ceval_gen.py                              |
# | ceval_gen_5f30c7               | configs/datasets/ceval/ceval_gen_5f30c7.py                       |
# | ceval_ppl                      | configs/datasets/ceval/ceval_ppl.py                              |
# | ceval_ppl_93e5ce               | configs/datasets/ceval/ceval_ppl_93e5ce.py                       |
# | ...............                | .............................................                    |
# +--------------------------------+------------------------------------------------------------------+

# run eval test
$ python run.py --models xft_chatglm2_6b --datasets ceval_gen
# 20240416_100621
# tabulate format
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# dataset                                         version    metric         mode     xft_chatglm2_6b-bf16   xft_chatglm2_6b-xft-bf16
# ----------------------------------------------  ---------  -------------  ------  ---------------------  -------------------------
# ceval-computer_network                          db9ce2     accuracy       gen                     47.37                      47.37
# ceval-operating_system                          1c2571     accuracy       gen                     xx.xx                      xx.xx
# ceval-computer_architecture                     a74dad     accuracy       gen                     xx.xx                      xx.xx
# ceval-college_programming                       4ca32a     accuracy       gen                     xx.xx                      xx.xx
# ceval-college_physics                           963fa8     accuracy       gen                     xx.xx                      xx.xx
# ceval-college_chemistry                         e78857     accuracy       gen                     xx.xx                      xx.xx
# ceval-advanced_mathematics                      ce03e2     accuracy       gen                     xx.xx                      xx.xx
# ...
```
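The summary table puts the two result columns (`xft_chatglm2_6b-bf16` and `xft_chatglm2_6b-xft-bf16`) side by side, so a quick way to spot accuracy drift is to compare them row by row. The helper below is a minimal sketch, not an OpenCompass feature: `compare_columns` is a hypothetical name, it assumes the raw table (without the leading `# ` used in this README) is piped in on stdin with the six whitespace-separated columns shown above, and the default tolerance of 0.5 points is an arbitrary choice.

```bash
#!/usr/bin/env bash
# compare_columns [TOLERANCE]: read summary rows from stdin and print, for
# each accuracy row with numeric scores, both scores and OK or MISMATCH.
# Hypothetical helper; the column layout is assumed from the sample table.
compare_columns() {
    awk -v tol="${1:-0.5}" '
        # Header, separator, and non-numeric rows fail these tests and are skipped
        $3 == "accuracy" && $5 ~ /^[0-9.]+$/ && $6 ~ /^[0-9.]+$/ {
            diff = $6 - $5
            if (diff < 0) diff = -diff
            status = (diff <= tol) ? "OK" : "MISMATCH"
            printf "%s %.2f %.2f %s\n", $1, $5, $6, status
        }'
}
```

Assuming the table has been saved to a file, here called `summary.txt` (a hypothetical name), `compare_columns 0.5 < summary.txt` prints one line per dataset; the `ceval-computer_network` row above would be reported as `OK`.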
