
Failed to export Llama with past-key-values to ONNX #2204

Open
ahmedehabessa opened this issue Mar 2, 2025 · 4 comments
Labels
bug Something isn't working

ahmedehabessa commented Mar 2, 2025

System Info

Google Colab
Python 3.11
optimum 1.24.0 (from the pip install output):

  Collecting optimum
    Downloading optimum-1.24.0-py3-none-any.whl.metadata (21 kB)

Who can help?

Hello @michaelbenayoun,
I hope you are doing well. Exporting a Llama model with past key values to ONNX fails on Google Colab; could you please help with that?

The exported model cannot be verified successfully.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

Steps to reproduce:

! pip install optimum
! pip install onnx
! optimum-cli export onnx -m meta-llama/Llama-3.2-1B --task text-generation-with-past llama321b

Expected behavior

An exported and successfully verified ONNX model.

Exception: An error occured during validation, but the model was saved nonetheless at llama321b-optimun. Detailed error: Required inputs (['onnx::Gather_35']) are missing from input feed (['input_ids', 'attention_mask', 'position_ids', 'past_key_values.0.key', 'past_key_values.0.value', 'past_key_values.1.key', 'past_key_values.1.value', 'past_key_values.2.key', 'past_key_values.2.value', 'past_key_values.3.key', 'past_key_values.3.value', 'past_key_values.4.key', 'past_key_values.4.value', 'past_key_values.5.key', 'past_key_values.5.value', 'past_key_values.6.key', 'past_key_values.6.value', 'past_key_values.7.key', 'past_key_values.7.value', 'past_key_values.8.key', 'past_key_values.8.value', 'past_key_values.9.key', 'past_key_values.9.value', 'past_key_values.10.key', 'past_key_values.10.value', 'past_key_values.11.key', 'past_key_values.11.value', 'past_key_values.12.key', 'past_key_values.12.value', 'past_key_values.13.key', 'past_key_values.13.value', 'past_key_values.14.key', 'past_key_values.14.value', 'past_key_values.15.key', 'past_key_values.15.value'])..
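For reference, a minimal sketch for listing which runtime inputs the exported graph actually requires, which should surface the stray onnx::Gather_35 reported above. The path llama321b/model.onnx is an assumption about where optimum-cli wrote the decoder graph:

import onnx

# Initializers are weights, not runtime inputs, so exclude them from the check.
model = onnx.load("llama321b/model.onnx", load_external_data=False)
initializer_names = {init.name for init in model.graph.initializer}
runtime_inputs = [i.name for i in model.graph.input if i.name not in initializer_names]

# Expected: input_ids, attention_mask, position_ids and past_key_values.*;
# anything extra (e.g. onnx::Gather_35) is what the validator complains about.
print(runtime_inputs)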

@ahmedehabessa ahmedehabessa added the bug Something isn't working label Mar 2, 2025
@ahmedehabessa ahmedehabessa changed the title Failed to export Llama Failed to export Llama with past-key-values Mar 2, 2025
@ahmedehabessa ahmedehabessa changed the title Failed to export Llama with past-key-values Failed to export Llama with past-key-values to ONNX Mar 2, 2025
xenova (Contributor) commented Mar 5, 2025

Hi there 👋 This was fixed in #2191, so could you install optimum from source with

pip install --upgrade git+https://github.com/huggingface/optimum.git

?

cc @echarlaix I think we should put out a new release for this because I also ran into this error in google colab (link).
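After installing from source, a quick sanity check (using standard pip metadata, nothing optimum-specific) that the upgraded build is the one actually being imported; on Colab the runtime may need a restart first:

from importlib.metadata import version

# A git install typically reports a dev version rather than the 1.24.0 release.
print(version("optimum"))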

ahmedehabessa (Author) commented

Hello @xenova,

Thanks for your response, but it is still not working for me: the model is exported, but the exported model cannot be verified.
Would you please double check that?

The following warning was generated:

The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:

Is that normal?

xenova (Contributor) commented Mar 11, 2025

What are the differences in values? Anything around 1e-4 should be fine.

ahmedehabessa (Author) commented

myenv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:731: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if sequence_length != 1:
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
lm_head.weight: {'onnx::MatMul_5299'}
model.embed_tokens.weight: {'model.embed_tokens.weight'}
-[x] values not close enough, max diff: 0.0002803802490234375 (atol: 1e-05)
-[x] values not close enough, max diff: 3.4332275390625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.291534423828125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.76837158203125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.8650970458984375e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.3882598876953125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 8.678436279296875e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 6.4849853515625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.435943603515625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 3.337860107421875e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.76837158203125e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:

  • logits: max diff = 0.0002803802490234375
  • present.4.key: max diff = 3.4332275390625e-05
  • present.6.key: max diff = 4.291534423828125e-05
  • present.7.key: max diff = 4.76837158203125e-05
  • present.9.key: max diff = 5.8650970458984375e-05
  • present.10.key: max diff = 5.3882598876953125e-05
  • present.11.key: max diff = 8.678436279296875e-05
  • present.12.key: max diff = 6.4849853515625e-05
  • present.13.key: max diff = 5.435943603515625e-05
  • present.14.key: max diff = 3.337860107421875e-05
  • present.15.key: max diff = 4.76837158203125e-05.
    The exported model was saved at: llama321b-optimum
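
All of these differences are in the 1e-5 to 1e-4 range, i.e. ordinary float32 rounding noise from the export, in line with the earlier comment that anything around 1e-4 should be fine. A minimal sketch, assuming the export was saved to llama321b and both models fit in memory, for checking the logits difference independently of the exporter's validation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "meta-llama/Llama-3.2-1B"
onnx_dir = "llama321b"  # directory produced by optimum-cli export onnx

tokenizer = AutoTokenizer.from_pretrained(model_id)
reference = AutoModelForCausalLM.from_pretrained(model_id)
exported = ORTModelForCausalLM.from_pretrained(onnx_dir)

inputs = tokenizer("The capital of France is", return_tensors="pt")

with torch.no_grad():
    ref_logits = reference(**inputs).logits
onnx_logits = exported(**inputs).logits

# Differences around 1e-4 are expected from operator-level float32 rounding.
print("max abs diff:", (ref_logits - onnx_logits).abs().max().item())

If the exporter's check still matters for your pipeline, the validation tolerance can also be relaxed via the --atol option of optimum-cli export onnx.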
