[Eval] Add eval test with opencompass.

marvin-Yu · marvin-Yu · commit f843794f9f73 · 2024-04-28T03:31:27.000-04:00
diff --git a/examples/opencompass/README.md b/examples/opencompass/README.md
@@ -0,0 +1,86 @@
+# xFT Accuracy Evaluation with OpenCompass
+OpenCompass is an LLM evaluation platform that supports a wide range of models (InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets. For more details, see [https://opencompass.org.cn/](https://opencompass.org.cn/).
+
+## Installation
+Below are the steps for quick installation and dataset preparation.
+
+### Environment Setup
+``` bash
+# setup steps follow https://opencompass.org.cn/doc
+$ conda create -n opencompass python=3.10 pytorch torchvision torchaudio cpuonly -c pytorch -y
+$ conda activate opencompass
+
+$ git clone -b intel/xft https://github.com/marvin-Yu/opencompass.git && cd opencompass
+$ pip install -e .
+```
+
+### Data Preparation
+``` bash
+# download core dataset
+$ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
+
+$ unzip OpenCompassData-core-20240207.zip
+
+# # download full dataset
+# $ wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-complete-20240207.zip
+# $ unzip OpenCompassData-complete-20240207.zip
+# $ cd ./data
+# $ find . -name "*.zip" -exec unzip "{}" \;
+```
+
+### Model Preparation
+Download the model weights to the `/data/models` directory (a configurable path is not supported yet; as a workaround, modify the configuration file or create symbolic links). For example, to test the model `chatglm2_6b`:
+```text
+/data/models/
+├── chatglm2-6b-hf
+├── chatglm2-6b-hf-xft
+├── ...
+```
+For exporting xFT models, please refer to [xFT Models Preparation](https://github.com/intel/xFasterTransformer?tab=readme-ov-file#models-preparation).
+
+### xFT Evaluation
+``` bash
+# list all models supported by xFT
+$ python tools/list_configs.py xft
+# +------------------------+----------------------------------------------+
+# | Model                  | Config Path                                  |
+# |------------------------+----------------------------------------------|
+# | xft_llama2_13b_chat    | configs/models/xft/xft_llama2_13b_chat.py    |
+# | xft_llama2_70b_chat    | configs/models/xft/xft_llama2_70b_chat.py    |
+# | xft_llama2_7b_chat     | configs/models/xft/xft_llama2_7b_chat.py     |
+# | xft_chatglm2_6b        | configs/models/xft/xft_chatglm2_6b.py        |
+# | xft_chatglm3_6b        | configs/models/xft/xft_chatglm3_6b.py        |
+# | xft_chatglm_6b         | configs/models/xft/xft_chatglm_6b.py         |
+# | xft_gemma_2b_it        | configs/models/xft/xft_gemma_2b_it.py        |
+# | xft_gemma_7b_it        | configs/models/xft/xft_gemma_7b_it.py        |
+# | ...............        | .....................................        |
+# +------------------------+----------------------------------------------+
+
+# list the datasets you want
+$ python tools/list_configs.py ceval
+# +--------------------------------+------------------------------------------------------------------+
+# | Dataset                        | Config Path                                                      |
+# |--------------------------------+------------------------------------------------------------------|
+# | ceval_gen                      | configs/datasets/ceval/ceval_gen.py                              |
+# | ceval_gen_5f30c7               | configs/datasets/ceval/ceval_gen_5f30c7.py                       |
+# | ceval_ppl                      | configs/datasets/ceval/ceval_ppl.py                              |
+# | ceval_ppl_93e5ce               | configs/datasets/ceval/ceval_ppl_93e5ce.py                       |
+# | ...............                | .............................................                    |
+# +--------------------------------+------------------------------------------------------------------+
+
+# run eval test
+$ python run.py --models xft_chatglm2_6b --datasets ceval_gen
+# 20240416_100621
+# tabulate format
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+# dataset                                         version    metric         mode     xft_chatglm2_6b-bf16   xft_chatglm2_6b-xft-bf16
+# ----------------------------------------------  ---------  -------------  ------  ---------------------  -------------------------
+# ceval-computer_network                          db9ce2     accuracy       gen                     47.37                      47.37
+# ceval-operating_system                          1c2571     accuracy       gen                     xx.xx                      xx.xx
+# ceval-computer_architecture                     a74dad     accuracy       gen                     xx.xx                      xx.xx
+# ceval-college_programming                       4ca32a     accuracy       gen                     xx.xx                      xx.xx
+# ceval-college_physics                           963fa8     accuracy       gen                     xx.xx                      xx.xx
+# ceval-college_chemistry                         e78857     accuracy       gen                     xx.xx                      xx.xx
+# ceval-advanced_mathematics                      ce03e2     accuracy       gen                     xx.xx                      xx.xx
+# ...
+```
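
The summary table above puts the baseline and xFT scores side by side, so a small script can flag datasets where the two diverge. The following is only a sketch and is not part of this commit: the helper names `parse_summary` and `find_mismatches`, the `META_COLUMNS` constant, and the 0.5-point tolerance are all hypothetical, and the parser assumes the whitespace-aligned layout shown in the sample output (four metadata columns followed by one score column per model).

```python
from typing import Dict, List

# Assumption: dataset, version, metric, mode precede the per-model scores,
# as in the sample output in the README.
META_COLUMNS = 4


def parse_summary(text: str) -> Dict[str, List[float]]:
    """Collect per-dataset scores from the printed summary table.

    Rows whose score columns are not numeric (header, separator,
    placeholders such as 'xx.xx') are skipped.
    """
    rows: Dict[str, List[float]] = {}
    for raw in text.splitlines():
        # Drop the leading '# ' used when the output is quoted in a README.
        fields = raw.lstrip("# ").split()
        if len(fields) <= META_COLUMNS:
            continue
        try:
            scores = [float(value) for value in fields[META_COLUMNS:]]
        except ValueError:
            continue
        rows[fields[0]] = scores
    return rows


def find_mismatches(rows: Dict[str, List[float]],
                    tolerance: float = 0.5) -> Dict[str, List[float]]:
    """Return datasets whose scores differ by more than `tolerance` points.

    The default tolerance is an arbitrary illustration, not a project threshold.
    """
    return {name: scores for name, scores in rows.items()
            if max(scores) - min(scores) > tolerance}


SAMPLE = """\
# dataset                  version  metric    mode  xft_chatglm2_6b-bf16  xft_chatglm2_6b-xft-bf16
# -----------------------  -------  --------  ----  --------------------  ------------------------
# ceval-computer_network   db9ce2   accuracy  gen                  47.37                     47.37
# ceval-operating_system   1c2571   accuracy  gen                  xx.xx                     xx.xx
"""

if __name__ == "__main__":
    parsed = parse_summary(SAMPLE)
    print(parsed)
    print(find_mismatches(parsed))
```

Only rows with real numbers are compared, so the `xx.xx` placeholder rows in the README sample are ignored rather than treated as mismatches.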