| 1 | +# xFT Accuracy Evaluation with OpenCompass |
| 2 | +OpenCompass is an LLM evaluation platform that supports a wide range of models (InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) across 100+ datasets. For more details, see [https://opencompass.org.cn/](https://opencompass.org.cn/). |
| 3 | + |
| 4 | +## Installation |
| 5 | +The steps below cover a quick installation and dataset preparation. |
| 6 | + |
| 7 | +### Environment Setup |
| 8 | +``` bash |
| 9 | +# setup steps follow https://opencompass.org.cn/doc |
| 10 | +$ conda create -n opencompass python=3.10 pytorch torchvision torchaudio cpuonly -c pytorch -y |
| 11 | +$ conda activate opencompass |
| 12 | + |
| 13 | +$ git clone -b intel/xft https://github.com/marvin-Yu/opencompass.git && cd opencompass |
| 14 | +$ pip install -e . |
| 15 | +``` |
| 16 | + |
| 17 | +### Data Preparation |
| 18 | +``` bash |
| 19 | +# download core dataset |
| 20 | +$ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip |
| 21 | + |
| 22 | +$ unzip OpenCompassData-core-20240207.zip |
| 23 | + |
| 24 | +# # download full dataset |
| 25 | +# $ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-complete-20240207.zip |
| 26 | +# $ unzip OpenCompassData-complete-20240207.zip |
| 27 | +# $ cd ./data |
| 28 | +# $ find . -name "*.zip" -exec unzip "{}" \; |
| 29 | +``` |
| 30 | + |
| 31 | +### Model Preparation |
| 32 | +Download the model weights to the `/data/models` directory. The path is not yet configurable through an option; to use a different location, modify the model configuration file or create symbolic links. For example, to test the model `chatglm2_6b`, the directory should look like this: |
| 33 | +```bash |
| 34 | +/data/models/ |
| 35 | +├── chatglm2-6b-hf |
| 36 | +├── chatglm2-6b-hf-xft |
| 37 | +├── ... |
| 38 | +``` |
| 39 | +For exporting xFT models, please refer to [xFT Models Preparation](https://github.com/intel/xFasterTransformer?tab=readme-ov-file#models-preparation). |
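If the weights already live elsewhere, the symbolic-link route mentioned above can be sketched roughly as follows. This is only an illustration: `link_model` and the `MODEL_ROOT` variable are hypothetical names introduced here (they are not part of xFT or OpenCompass), and the example paths are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper: link an existing model directory into the root that the
# xFT model configs read from. MODEL_ROOT is an assumption of this sketch and
# defaults to /data/models, the directory named in the text above.
link_model() {
    local src="$1"                              # where the weights actually live
    local root="${MODEL_ROOT:-/data/models}"    # directory the configs expect
    local name
    name="$(basename "$src")"

    if [ ! -d "$src" ]; then
        echo "error: $src is not a directory" >&2
        return 1
    fi

    mkdir -p "$root"
    # -s: symbolic link, -f: replace an existing link, -n: do not follow an old link
    ln -sfn "$(cd "$src" && pwd)" "$root/$name"
    echo "$root/$name"
}

# Example (source paths are placeholders):
# link_model /mnt/storage/chatglm2-6b-hf
# link_model /mnt/storage/chatglm2-6b-hf-xft
```

The link keeps the original directory name, so the source directory should already be named the way the config expects (for instance `chatglm2-6b-hf-xft`).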
| 40 | + |
| 41 | +### xFT Evaluation |
| 42 | +``` bash |
| 43 | +# list all models supported by xFT |
| 44 | +$ python tools/list_configs.py xft |
| 45 | +# +------------------------+----------------------------------------------+ |
| 46 | +# | Model | Config Path | |
| 47 | +# |------------------------+----------------------------------------------| |
| 48 | +# | xft_llama2_13b_chat | configs/models/xft/xft_llama2_13b_chat.py | |
| 49 | +# | xft_llama2_70b_chat | configs/models/xft/xft_llama2_70b_chat.py | |
| 50 | +# | xft_llama2_7b_chat | configs/models/xft/xft_llama2_7b_chat.py | |
| 51 | +# | xft_chatglm2_6b | configs/models/xft/xft_chatglm2_6b.py | |
| 52 | +# | xft_chatglm3_6b | configs/models/xft/xft_chatglm3_6b.py | |
| 53 | +# | xft_chatglm_6b | configs/models/xft/xft_chatglm_6b.py | |
| 54 | +# | xft_gemma_2b_it | configs/models/xft/xft_gemma_2b_it.py | |
| 55 | +# | xft_gemma_7b_it | configs/models/xft/xft_gemma_7b_it.py | |
| 56 | +# | ............... | ..................................... | |
| 57 | +# +------------------------+----------------------------------------------+ |
| 58 | + |
| 59 | +# list the configs of the dataset you want |
| 60 | +$ python tools/list_configs.py ceval |
| 61 | +# +--------------------------------+------------------------------------------------------------------+ |
| 62 | +# | Dataset | Config Path | |
| 63 | +# |--------------------------------+------------------------------------------------------------------| |
| 64 | +# | ceval_gen | configs/datasets/ceval/ceval_gen.py | |
| 65 | +# | ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py | |
| 66 | +# | ceval_ppl | configs/datasets/ceval/ceval_ppl.py | |
| 67 | +# | ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py | |
| 68 | +# | ............... | ............................................. | |
| 69 | +# +--------------------------------+------------------------------------------------------------------+ |
| 70 | + |
| 71 | +# run the evaluation |
| 72 | +$ python run.py --max-num-workers 1 --models xft_chatglm2_6b --datasets ceval_gen |
| 73 | +# 20240416_100621 |
| 74 | +# tabulate format |
| 75 | +# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 76 | +# dataset version metric mode xft_chatglm2_6b-bf16 xft_chatglm2_6b-xft-bf16 |
| 77 | +# ---------------------------------------------- --------- ------------- ------ --------------------- ------------------------- |
| 78 | +# ceval-computer_network db9ce2 accuracy gen 47.37 47.37 |
| 79 | +# ceval-operating_system 1c2571 accuracy gen xx.xx xx.xx |
| 80 | +# ceval-computer_architecture a74dad accuracy gen xx.xx xx.xx |
| 81 | +# ceval-college_programming 4ca32a accuracy gen xx.xx xx.xx |
| 82 | +# ceval-college_physics 963fa8 accuracy gen xx.xx xx.xx |
| 83 | +# ceval-college_chemistry e78857 accuracy gen xx.xx xx.xx |
| 84 | +# ceval-advanced_mathematics ce03e2 accuracy gen xx.xx xx.xx |
| 85 | +# ... |
| 86 | +``` |
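To evaluate several of the listed models in turn, one option is to generate one `run.py` command per model. The sketch below only prints the commands (a dry run), so it can be inspected before anything is launched. `build_eval_cmd` is a hypothetical helper name; the flags are the ones used in the command above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (do not execute) the evaluation command for one
# model/dataset pair, using the same flags as the run.py call shown above.
build_eval_cmd() {
    local model="$1" dataset="$2"
    echo "python run.py --max-num-workers 1 --models $model --datasets $dataset"
}

# Dry run over two of the model configs from the listing above.
for model in xft_chatglm2_6b xft_chatglm3_6b; do
    build_eval_cmd "$model" ceval_gen
done
```

Once the printed commands look right, they can be executed from the `opencompass` checkout, for example by piping the output to `bash`.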
| 87 | + |
| 88 | +## FAQ |
| 89 | + |
| 90 | +### AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'? |
| 91 | +`pip install --force-reinstall transformers==4.33.0` |
| 92 | + |
| 93 | +### The `TruthfulQA` dataset requires the `bleurt` dependency |
| 94 | +`pip install git+https://github.com/google-research/bleurt.git` |
| 95 | + |
| 96 | +### Environment variables that control the test cases (`XFT_ONLY_XFT`, `XFT_DTYPE_LIST`, `XFT_KVCACHE_DTYPE_LIST`) |
| 97 | +``` |
| 98 | +# XFT_ONLY_XFT |
| 99 | +# Restricts testing to xFT models only; HF models are not tested. |
| 100 | + |
| 101 | +# XFT_DTYPE_LIST selects the xFT data types to test. Supported values: |
| 102 | +# [ |
| 103 | +# "fp16", |
| 104 | +# "bf16", |
| 105 | +# "int8", |
| 106 | +# "w8a8", |
| 107 | +# "int4", |
| 108 | +# "nf4", |
| 109 | +# "bf16_fp16", |
| 110 | +# "bf16_int8", |
| 111 | +# "bf16_w8a8", |
| 112 | +# "bf16_int4", |
| 113 | +# "bf16_nf4", |
| 114 | +# "w8a8_int8", |
| 115 | +# "w8a8_int4", |
| 116 | +# "w8a8_nf4", |
| 117 | +# ] |
| 118 | +# It accepts a single value, e.g. XFT_DTYPE_LIST=bf16, |
| 119 | +# or several comma-separated values, e.g. XFT_DTYPE_LIST=bf16,fp16,int8. |
| 120 | + |
| 121 | +# XFT_KVCACHE_DTYPE_LIST selects the data types used for the xFT KV cache. Supported values: |
| 122 | +# [ |
| 123 | +# "fp32", |
| 124 | +# "fp16", |
| 125 | +# "int8", |
| 126 | +# ] |
| 127 | +# It accepts a single value, e.g. XFT_KVCACHE_DTYPE_LIST=fp16, |
| 128 | +# or several comma-separated values, e.g. XFT_KVCACHE_DTYPE_LIST=fp32,fp16,int8. |
| 129 | +``` |
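Since both list variables take comma-separated values, a typo is easy to miss. A minimal sketch of a pre-flight check is shown below; `validate_list` and the two `SUPPORTED_*` variables are hypothetical names introduced for this example, with the allowed values copied from the comment block above. How the evaluation code itself reacts to an unknown value is not covered here.

```bash
#!/usr/bin/env bash
# Allowed values, copied from the comment block above.
SUPPORTED_DTYPES="fp16 bf16 int8 w8a8 int4 nf4 bf16_fp16 bf16_int8 bf16_w8a8 bf16_int4 bf16_nf4 w8a8_int8 w8a8_int4 w8a8_nf4"
SUPPORTED_KVCACHE_DTYPES="fp32 fp16 int8"

# Hypothetical helper: succeed only if every comma-separated item in $1
# appears in the space-separated list $2.
validate_list() {
    local items="$1" supported="$2" item
    [ -n "$items" ] || return 1
    local IFS=','                      # split $items on commas only
    for item in $items; do
        case " $supported " in
            *" $item "*) ;;            # exact match against one allowed value
            *) echo "unsupported value: $item" >&2; return 1 ;;
        esac
    done
}

XFT_DTYPE_LIST="bf16,fp16,int8"
XFT_KVCACHE_DTYPE_LIST="fp16"

if validate_list "$XFT_DTYPE_LIST" "$SUPPORTED_DTYPES" &&
   validate_list "$XFT_KVCACHE_DTYPE_LIST" "$SUPPORTED_KVCACHE_DTYPES"; then
    export XFT_DTYPE_LIST XFT_KVCACHE_DTYPE_LIST
    echo "ok: XFT_DTYPE_LIST=$XFT_DTYPE_LIST XFT_KVCACHE_DTYPE_LIST=$XFT_KVCACHE_DTYPE_LIST"
    # The evaluation would then be started as usual, for example:
    # python run.py --max-num-workers 1 --models xft_chatglm2_6b --datasets ceval_gen
fi
```

Exporting the variables only after both checks pass keeps a mistyped data type from reaching a long evaluation run.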