@bhardwaj-nakul, last week we added a load_in_4bit feature that enables data-aware weight quantization. Please take a look at this PR in Optimum-Intel.
You can use the Optimum API directly in this case. Please consider installing it from GitHub.
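For reference, here is a minimal sketch of what that looks like through the Optimum-Intel Python API, assuming the load_in_4bit keyword added in the PR above (install from source first, e.g. pip install git+https://github.com/huggingface/optimum-intel.git):

```python
# Sketch: 4-bit weight compression via Optimum-Intel.
# Assumes the load_in_4bit kwarg from the referenced PR is available.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly;
# load_in_4bit=True applies INT4 weight compression during the export.
model = OVModelForCausalLM.from_pretrained(model_id, export=True, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```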
I have converted the llama-7b-chat model to INT4 using the following commands:
```sh
python convert.py --model_id meta-llama/Llama-2-7b-chat-hf --output_dir models/llama-2-7b-chat --precision FP16 --compress_weights INT4_SYM INT4_ASYM 4BIT_DEFAULT
python convert.py --model_id meta-llama/Llama-2-7b-chat-hf --output_dir models/llama-2-7b-chat --precision FP32 --compress_weights 4BIT_DEFAULT
```
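As a sanity check outside the benchmark harness, one of the converted IRs can be loaded directly for generation. A sketch, assuming Optimum-Intel is installed; the ir_path below is a placeholder and should point at whichever subdirectory convert.py actually wrote the compressed IR to:

```python
# Quick generation check on a converted model, independent of the benchmark script.
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

ir_path = "models/llama-2-7b-chat"  # hypothetical; adjust to the real IR folder

# Loads the OpenVINO IR from disk (no re-export needed).
model = OVModelForCausalLM.from_pretrained(ir_path)
tokenizer = AutoTokenizer.from_pretrained(ir_path)

inputs = tokenizer("Tell me about quantization.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```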
I'm running benchmarks with the INT4-converted models. I tried the following variations, and as you can see, all the responses contain German words.
With a different prompt, the answer is partially in German.
Using the following prompt generates the complete response in German.
Am I missing something here? Please provide some guidance.