
Falcon-40B converts successfully but output_dir has not been created #355

Closed
JunxiChhen opened this issue Apr 10, 2024 · 9 comments

@JunxiChhen (Contributor)

Convert cmd:

python3 convert.py --model_id tiiuae/falcon-40b --output_dir /root/.cache/huggingface/hub/tiiuae/falcon-40b-ov --stateful --precision FP16

Benchmarking cmd:

numactl -C 0-55 -m 0 python benchmark.py -m /root/.cache/huggingface/hub/falcon-40b-ov/pytorch/dldt/FP16 -p "It is done" -n 3 -bs 1 -d CPU --torch_compile_backend openvino -ic 128 --num_beams 1 -lc bfloat16_config.json 2>&1 | tee -a ./logs/0.log
  1. The model converts successfully to /root/.cache/huggingface/hub/tiiuae/falcon-40b-ov:
     (screenshot: conversion log)
  2. The model cannot be found during benchmarking:
     (screenshot: benchmarking error)
  3. Dir /root/.cache/huggingface/hub/tiiuae/falcon-40b-ov does not exist.
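A quick way to confirm step 3 before benchmarking is to check that the reported output directory exists and actually contains OpenVINO IR files. This is a hypothetical helper, not part of convert.py or benchmark.py; the `*.xml`/`*.bin` pattern assumes the standard OpenVINO IR layout:

```python
from pathlib import Path

def check_converted_model(output_dir: str) -> bool:
    """Return True if output_dir exists and holds OpenVINO IR files (*.xml + *.bin)."""
    out = Path(output_dir)
    if not out.is_dir():
        return False
    # IR files may sit in a subfolder (e.g. pytorch/dldt/FP16), so search recursively.
    return any(out.rglob("*.xml")) and any(out.rglob("*.bin"))

# Example usage with the path from the report above:
# check_converted_model("/root/.cache/huggingface/hub/tiiuae/falcon-40b-ov")
```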
@eaidova (Collaborator) commented Apr 19, 2024

@JunxiChhen I do not see in your logs that the model converted successfully (the logs here show only the beginning). Is it possible that this directory does not have write permissions?

@JunxiChhen (Contributor, Author)

We do have write permission. We can successfully convert other models into the same directory, including llama2, gpt-j, and more. Could you give it a try and check whether you hit the same issue? Thanks. @eaidova

@eaidova (Collaborator) commented Apr 23, 2024

@JunxiChhen Unfortunately, my working machine does not have enough RAM to convert falcon-40b, so I cannot check it myself. If you can share more conversion logs (preferably as text rather than a screenshot), that would be very helpful.

@JunxiChhen (Contributor, Author)

Here is the log file: benchmark_latency__bfloat16___04-23-24-06-25-42.log

@eaidova (Collaborator) commented Apr 23, 2024

@JunxiChhen Possibly it is some logging issue, but I cannot reproduce the same behaviour. On my end, model conversion failed with:
RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 8 but got size 128 for tensor number 1 in the list.

I prepared a fix for that: huggingface/optimum-intel#685

@eaidova (Collaborator) commented Apr 26, 2024

@JunxiChhen Could you please check falcon with the latest openvino.genai? The fix in optimum-intel was merged, and the pinned commit has been updated in the llm_bench requirements.txt.

@JunxiChhen (Contributor, Author)

Thanks. It passed when there was an internet connection.
However, in offline mode we hit an issue (we pre-downloaded the falcon model from huggingface):
(screenshot: offline-mode error)

@eaidova (Collaborator) commented Apr 26, 2024

@JunxiChhen Maybe setting the offline-mode environment variables would help avoid reaching the HF Hub: https://huggingface.co/docs/transformers/main/en/installation#offline-mode
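For reference, the offline mode described in the linked Hugging Face docs boils down to two environment variables. A minimal sketch (set them before importing transformers / huggingface_hub so nothing attempts a network request):

```python
import os

# Offline mode per the Hugging Face docs: with these set, from_pretrained(...)
# resolves models only from the local cache (e.g. a pre-downloaded snapshot
# under ~/.cache/huggingface/hub) and never contacts the Hub.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```

They can equally be exported in the shell before running benchmark.py.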

@JunxiChhen (Contributor, Author)

We downgraded transformers to version 4.39.1 and it passed. Thanks @eaidova
