Could you please run the following command and share the result with us?
python\benchmark_genai>python ./benchmark_genai.py -m Llama-2-7B-Chat-GPTQ -d CPU
@johnysh, could you follow the updated OpenVINO guide and re-measure the performance on the NPU device? We also suggest using the BEST_PERF option, which gives the best possible performance at the cost of slower compilation; check the performance modes section here. Kindly share your results.
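For reference, here is a minimal sketch of how the BEST_PERF hint might be passed when building the pipeline from Python. It assumes the `GENERATE_HINT` config key described in the NPU guide for the 2024.x releases and the model directory produced by the export command in this issue. `build_npu_config` and `run_once` are hypothetical helper names used only for illustration, and the exact option name may differ in other OpenVINO versions.

```python
from typing import Dict


def build_npu_config(hint: str = "BEST_PERF") -> Dict[str, str]:
    """Return a pipeline config dict carrying the generation hint.

    Assumption: the NPU guide for 2024.x documents the "GENERATE_HINT" key
    with the values "FAST_COMPILE" and "BEST_PERF".
    """
    allowed = {"FAST_COMPILE", "BEST_PERF"}
    if hint not in allowed:
        raise ValueError(f"unsupported hint {hint!r}, expected one of {sorted(allowed)}")
    return {"GENERATE_HINT": hint}


def run_once(model_dir: str, prompt: str, max_new_tokens: int = 20):
    """Compile the model for NPU with the chosen hint and generate once."""
    # Imported lazily so the config helper works without OpenVINO installed.
    import openvino_genai as ov_genai

    pipe = ov_genai.LLMPipeline(model_dir, "NPU", build_npu_config())
    return pipe.generate(prompt, max_new_tokens=max_new_tokens)
```

Calling `run_once("Llama-2-7B-Chat-GPTQ", "What is OpenVINO?")` would then compile with BEST_PERF before generating; expect the first call to take longer because of the slower compilation.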
[OS]: Win11
[Platform]: Intel(R) Core(TM) Ultra 7 258V 2.20 GHz
[RAM]: 32GB
[NPU driver]: 32.0.100.3104
ENV:
https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide-npu.html
pip install nncf==2.12 onnx==1.16.1 optimum-intel==1.19.0
pip install openvino==2024.6 openvino-tokenizers==2024.6 openvino-genai==2024.6
PIP LIST:
openvino 2024.6.0
openvino-genai 2024.6.0.0
openvino-telemetry 2024.5.0
openvino-tokenizers 2024.6.0.0
optimum 1.23.3
optimum-intel 1.19.0
Code:
https://github.com/openvinotoolkit/openvino.genai
CMD:
optimum-cli export openvino -m TheBloke/Llama-2-7B-Chat-GPTQ Llama-2-7B-Chat-GPTQ
python\benchmark_genai>python ./benchmark_genai.py -m Llama-2-7B-Chat-GPTQ -d NPU
Result: