Commit 9ba929d

Verified baichuan2-7b-chat with GenAI text_generation, added it to Github workflow and README
1 parent 99f9a32 commit 9ba929d

2 files changed, +49 -6 lines changed
.github/workflows/causal_lm_cpp.yml

+42 -1
@@ -244,6 +244,47 @@ jobs:
           source ./ov/setupvars.sh
           convert_tokenizer ./Qwen1.5-7B-Chat/pytorch/dldt/FP16/ --output ./Qwen1.5-7B-Chat/pytorch/dldt/FP16/ --with-detokenizer --trust-remote-code
           timeout 50s ./build/beam_search_causal_lm ./Qwen1.5-7B-Chat/pytorch/dldt/FP16/ "你好!" > ./pred_qwen15.txt
+  cpp-beam_search_causal_lm-Baichuan2-7B-Chat:
+    runs-on: ubuntu-20.04-16-cores
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+      - uses: actions/setup-python@v4
+        with:
+          python-version: 3.8
+      - name: Install OpenVINO
+        run: |
+          mkdir ./ov/
+          curl https://storage.openvinotoolkit.org/repositories/openvino/packages/nightly/2024.1.0-14645-e6dc0865128/l_openvino_toolkit_ubuntu20_2024.1.0.dev20240304_x86_64.tgz | tar --directory ./ov/ --strip-components 1 -xz
+          sudo ./ov/install_dependencies/install_openvino_dependencies.sh
+      - name: Download, convert and build
+        run: |
+          source ./ov/setupvars.sh
+          python -m pip install --upgrade-strategy eager "optimum>=1.14" -r ./llm_bench/python/requirements.txt ./thirdparty/openvino_tokenizers/[transformers] --extra-index-url https://download.pytorch.org/whl/cpu && python ./llm_bench/python/convert.py --model_id baichuan-inc/Baichuan2-7B-Chat --output_dir ./Baichuan2-7B-Chat/ --precision FP16 &
+          cmake -DCMAKE_BUILD_TYPE=Release -S ./text_generation/causal_lm/cpp/ -B ./build/
+          cmake --build ./build/ --config Release -j
+          wait
+      - name: Run and Compare
+        run: |
+          source ./ov/setupvars.sh
+          convert_tokenizer ./Baichuan2-7B-Chat/pytorch/dldt/FP16/ --output ./Baichuan2-7B-Chat/pytorch/dldt/FP16/ --with-detokenizer --trust-remote-code
+          timeout 50s ./build/beam_search_causal_lm ./Baichuan2-7B-Chat/pytorch/dldt/FP16/ "69" > ./pred_baichuan2.txt
+          python -c "
+          import transformers
+          with open('pred_baichuan2.txt', 'r') as file:
+              predictions = file.read()
+
+          tokenizer = transformers.AutoTokenizer.from_pretrained('baichuan-inc/Baichuan2-7B-Chat',trust_remote_code=True)
+          tokenized = tokenizer('69', return_tensors='pt')
+          for beam in transformers.AutoModelForCausalLM.from_pretrained('baichuan-inc/Baichuan2-7B-Chat',trust_remote_code=True).generate(**tokenized, num_beam_groups=3, num_beams=15, num_return_sequences=15, diversity_penalty=1.0, max_new_tokens=20, early_stopping=False, length_penalty=1.0, no_repeat_ngram_size=9**9, do_sample=False):
+              ref = tokenizer.decode(beam[tokenized['input_ids'].numel():], skip_special_tokens=True)
+              idx = predictions.find(ref)
+              if -1 == idx:
+                  raise RuntimeError(f'Missing "{ref=}" from predictions')
+              predictions = predictions[:idx] + predictions[idx + len(ref):]
+          "
+          echo 69 passed
   cpp-beam_search_causal_lm-Phi-2:
     runs-on: ubuntu-20.04-16-cores
     steps:
@@ -376,4 +417,4 @@ jobs:
                   raise RuntimeError(f'Missing "{ref=}" from predictions')
               predictions = predictions[:idx] + predictions[idx + len(ref):]
           "
-          echo Phi-1_5 passed
+          echo Phi-1_5 passed
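
The "Run and Compare" step checks that every reference beam decoded by `transformers` appears somewhere in the C++ sample's output, and it cuts each match out of the text so that a repeated reference has to be found a second time. The following is a minimal standalone sketch of that matching logic only. The function name `consume_references` and the sample strings are hypothetical and chosen for illustration; no model or tokenizer is loaded.

```python
from typing import Iterable


def consume_references(predictions: str, references: Iterable[str]) -> str:
    """Remove one occurrence of each reference from predictions, in order.

    Raises RuntimeError when a reference cannot be found, as the workflow
    script does. Returns the text left over once all references are removed.
    """
    for ref in references:
        idx = predictions.find(ref)
        if idx == -1:
            raise RuntimeError(f"Missing {ref=} from predictions")
        # Cut out the matched span so an identical reference must match elsewhere.
        predictions = predictions[:idx] + predictions[idx + len(ref):]
    return predictions


# Made-up sample data standing in for the C++ output and the reference beams.
sample_output = "0.5: foo bar\n0.4: foo baz\n"
leftover = consume_references(sample_output, ["foo bar", "foo baz"])
print(repr(leftover))
```

Because matches are consumed, passing the same reference twice against text that contains it once raises `RuntimeError`. An empty reference string always matches at index 0, so a check of this shape cannot detect a beam that decodes to nothing.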

text_generation/causal_lm/cpp/README.md

+7 -5
@@ -134,14 +134,16 @@ To enable Unicode characters for Windows cmd open `Region` settings from `Contro
    4. https://huggingface.co/Qwen/Qwen1.5-7B-Chat-GPTQ-Int4
    [Qwen-7B-Chat-Int4 - Torch not compiled with CUDA enabled](../../../llm_bench/python/doc/NOTES.md#qwen-7b-chat-int4---torch-not-compiled-with-cuda-enabled)
    in case of `AssertionError`
-7. Dolly
+7. Baichuan
+   1. https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat
+8. Dolly
    1. https://huggingface.co/databricks/dolly-v2-3b
-8. Phi
+9. Phi
    1. https://huggingface.co/microsoft/phi-2
    2. https://huggingface.co/microsoft/phi-1_5
-9. [notus-7b-v1](https://huggingface.co/argilla/notus-7b-v1)
-10. [zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
-11. [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+10. [notus-7b-v1](https://huggingface.co/argilla/notus-7b-v1)
+11. [zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
+12. [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
 
 
 This pipeline can work with other similar topologies produced by `optimum-intel` with the same model signature.
