|
| 1 | +# xFT Accuracy Evaluation with OpenCompass |
| 2 | +OpenCompass is an LLM evaluation platform that supports a wide range of models (InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets. For more details, see [https://opencompass.org.cn/](https://opencompass.org.cn/). |
| 3 | + |
| 4 | +## Installation |
| 5 | +Below are the steps for a quick installation and dataset preparation. |
| 6 | + |
| 7 | +### Environment Setup |
| 8 | +``` bash |
| 9 | +# setup steps follow https://opencompass.org.cn/doc |
| 10 | +$ conda create -n opencompass python=3.10 pytorch torchvision torchaudio cpuonly -c pytorch -y |
| 11 | +$ conda activate opencompass |
| 12 | + |
| 13 | +$ git clone -b intel/xft https://github.com/marvin-Yu/opencompass.git && cd opencompass |
| 14 | +$ pip install -e . |
| 15 | +``` |
| 16 | + |
| 17 | +### Data Preparation |
| 18 | +``` bash |
| 19 | +# download core dataset |
| 20 | +$ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip |
| 21 | + |
| 22 | +$ unzip OpenCompassData-core-20240207.zip |
| 23 | + |
| 24 | +# # download full dataset |
| 25 | +# $ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-complete-20240207.zip |
| 26 | +# $ unzip OpenCompassData-complete-20240207.zip |
| 27 | +# $ cd ./data |
| 28 | +# $ find . -name "*.zip" -exec unzip "{}" \; |
| 29 | +``` |
| 30 | + |
| 31 | +### Model Preparation |
| 32 | +Download the model weights to the `/data/models` directory. A configurable model path is not supported yet; to use a different location, modify the model configuration file or create symbolic links. For example, to test the model `chatglm2_6b`, the directory should look like this: |
| 33 | +```text |
| 34 | +/data/models/ |
| 35 | +├── chatglm2-6b-hf |
| 36 | +├── chatglm2-6b-hf-xft |
| 37 | +├── ... |
| 38 | +``` |
| 39 | +To export models to the xFT format, refer to [xFT Models Preparation](https://github.com/intel/xFasterTransformer?tab=readme-ov-file#models-preparation). |
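
If the weights already live somewhere else, symbolic links are one way to satisfy the fixed `/data/models` layout. The following is a minimal sketch, not part of OpenCompass or xFT: `link_models` is a hypothetical helper that links every subdirectory of a source directory into a target directory, and both paths are supplied by the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with OpenCompass or xFT).
# link_models SRC DST: symlink every model directory under SRC into DST.
link_models() {
    local src="$1" dst="$2" dir
    mkdir -p "$dst"
    for dir in "$src"/*/; do
        # Skip the unexpanded pattern when SRC has no subdirectories.
        [ -d "$dir" ] || continue
        # Link by absolute path so the link stays valid from any working directory.
        # -n replaces an existing link instead of descending into it.
        ln -sfn "$(cd "$dir" && pwd)" "$dst/$(basename "$dir")"
    done
}
```

Assuming the weights are stored under a directory such as `~/models` (an illustrative path), calling `link_models ~/models /data/models` would expose `chatglm2-6b-hf` and `chatglm2-6b-hf-xft` under `/data/models`. Creating `/data/models` may require elevated permissions on your system.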
| 40 | + |
| 41 | +### xFT Evaluation |
| 42 | +``` bash |
| 43 | +# list all models supported by xFT |
| 44 | +$ python tools/list_configs.py xft |
| 45 | +# +------------------------+----------------------------------------------+ |
| 46 | +# | Model | Config Path | |
| 47 | +# |------------------------+----------------------------------------------| |
| 48 | +# | xft_llama2_13b_chat | configs/models/xft/xft_llama2_13b_chat.py | |
| 49 | +# | xft_llama2_70b_chat | configs/models/xft/xft_llama2_70b_chat.py | |
| 50 | +# | xft_llama2_7b_chat | configs/models/xft/xft_llama2_7b_chat.py | |
| 51 | +# | xft_chatglm2_6b | configs/models/xft/xft_chatglm2_6b.py | |
| 52 | +# | xft_chatglm3_6b | configs/models/xft/xft_chatglm3_6b.py | |
| 53 | +# | xft_chatglm_6b | configs/models/xft/xft_chatglm_6b.py | |
| 54 | +# | xft_gemma_2b_it | configs/models/xft/xft_gemma_2b_it.py | |
| 55 | +# | xft_gemma_7b_it | configs/models/xft/xft_gemma_7b_it.py | |
| 56 | +# | ............... | ..................................... | |
| 57 | +# +------------------------+----------------------------------------------+ |
| 58 | + |
| 59 | +# list the configs of the dataset you want to use, e.g. ceval |
| 60 | +$ python tools/list_configs.py ceval |
| 61 | +# +--------------------------------+------------------------------------------------------------------+ |
| 62 | +# | Dataset | Config Path | |
| 63 | +# |--------------------------------+------------------------------------------------------------------| |
| 64 | +# | ceval_gen | configs/datasets/ceval/ceval_gen.py | |
| 65 | +# | ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py | |
| 66 | +# | ceval_ppl | configs/datasets/ceval/ceval_ppl.py | |
| 67 | +# | ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py | |
| 68 | +# | ............... | ............................................. | |
| 69 | +# +--------------------------------+------------------------------------------------------------------+ |
| 70 | + |
| 71 | +# run the evaluation |
| 72 | +$ python run.py --models xft_chatglm2_6b --datasets ceval_gen |
| 73 | +# 20240416_100621 |
| 74 | +# tabulate format |
| 75 | +# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 76 | +# dataset version metric mode xft_chatglm2_6b-bf16 xft_chatglm2_6b-xft-bf16 |
| 77 | +# ---------------------------------------------- --------- ------------- ------ --------------------- ------------------------- |
| 78 | +# ceval-computer_network db9ce2 accuracy gen 47.37 47.37 |
| 79 | +# ceval-operating_system 1c2571 accuracy gen xx.xx xx.xx |
| 80 | +# ceval-computer_architecture a74dad accuracy gen xx.xx xx.xx |
| 81 | +# ceval-college_programming 4ca32a accuracy gen xx.xx xx.xx |
| 82 | +# ceval-college_physics 963fa8 accuracy gen xx.xx xx.xx |
| 83 | +# ceval-college_chemistry e78857 accuracy gen xx.xx xx.xx |
| 84 | +# ceval-advanced_mathematics ce03e2 accuracy gen xx.xx xx.xx |
| 85 | +# ... |
| 86 | +``` |
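
Since the point of the run is to compare the xFT column against the reference column, a small helper can print the per-dataset difference. The sketch below is illustrative only: `compare_accuracy` is a hypothetical helper, not an OpenCompass tool, and it assumes the summary is available as a plain CSV whose columns match the table above (dataset, version, metric, mode, then the two model columns). The path of that file is also an assumption; check the log of your run for where the summary was written.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of OpenCompass).
# compare_accuracy FILE: print dataset, both scores, and (column 6 - column 5).
# Assumes a simple CSV without quoted commas; rows with non-numeric scores
# (for example "-" placeholders) are skipped.
compare_accuracy() {
    awk -F',' '
        NR > 1 && $5 ~ /^[0-9.]+$/ && $6 ~ /^[0-9.]+$/ {
            printf "%-40s %8s %8s %+8.2f\n", $1, $5, $6, $6 - $5
        }
    ' "$1"
}
```

A difference close to zero on every row suggests the xFT model matches the reference model on that dataset; what counts as an acceptable gap is left to the reader.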