yang-ahuan changed the title from "[NPU] NPU is slower than CPU&GPU when running LLM" to "[NPU][Llama] NPU is slower than CPU&GPU when running LLM" on Mar 11, 2025.
I was also able to reproduce this: the NPU is significantly slower than the CPU and iGPU on a Galaxy Book5 Pro with an Intel® Core™ Ultra 5 processor 228V.
Here are the test results:
Description
I followed the official guide (NPU with OpenVINO GenAI) to run an LLM. When testing the model on different hardware, I observed the following:
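For reference, a minimal sketch of the flow that guide describes, in Python. The model directory, prompt, and the specific pipeline option values are illustrative assumptions, not taken from this issue; the option names follow the OpenVINO NPU guide:

```python
def generate_on_npu(model_dir: str, prompt: str) -> str:
    """Run a pre-exported OpenVINO LLM on the NPU (sketch, values illustrative)."""
    import openvino_genai as ov_genai  # pip install openvino-genai

    # NPU pipelines compile with static shapes, so input/output lengths
    # are bounded up front; CACHE_DIR stores compiled blobs so later
    # loads skip recompilation.
    pipe = ov_genai.LLMPipeline(
        model_dir,
        "NPU",
        MAX_PROMPT_LEN=1024,    # maximum prompt length (assumed value)
        MIN_RESPONSE_LEN=256,   # reserved generation length (assumed value)
        CACHE_DIR=".npucache",  # hypothetical cache directory
    )
    return pipe.generate(prompt, max_new_tokens=128)
```

The same pipeline runs on `"CPU"` or `"GPU"` by swapping the device string, which is how the per-device numbers can be compared.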
Is this behavior expected, or could there be an issue with my setup? I would appreciate any insights into this.
Experiment
Environment
Testing model
Testing code
Result
Try to Solve
I attempted to optimize LLM execution on the NPU by trying different group sizes, model types, and cache settings, but the NPU results were still slower than the CPU.
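To make the group-size sweep above concrete, the variants map onto `optimum-cli export openvino` arguments roughly as sketched below. These exact combinations are assumptions for illustration, not copied from my runs; the OpenVINO NPU guide recommends symmetric channel-wise INT4 (`--group-size -1`) as the NPU-friendly layout:

```python
# Hypothetical export variants one might sweep when tuning for the NPU.
export_variants = [
    {"weight_format": "int4", "sym": True, "group_size": -1},   # channel-wise, NPU-recommended
    {"weight_format": "int4", "sym": True, "group_size": 128},  # grouped quantization
    {"weight_format": "int8", "sym": True, "group_size": None}, # no grouping
]

def to_optimum_cli_args(variant: dict) -> list[str]:
    """Translate one variant into optimum-cli export openvino flags."""
    args = ["--weight-format", variant["weight_format"]]
    if variant["sym"]:
        args.append("--sym")
    if variant["group_size"] is not None:
        args += ["--group-size", str(variant["group_size"])]
    return args
```

Each argument list would be appended to `optimum-cli export openvino -m <model>` before re-running the benchmark on the NPU.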