
Failed to export Llama with past-key-values to ONNX #2204

Open
ahmedehabessa opened this issue Mar 2, 2025 · 4 comments
Labels
bug Something isn't working

ahmedehabessa commented Mar 2, 2025

System Info

Google Colab
Python 3.11
optimum 1.24.0 (from the pip install output):

  Collecting optimum
    Downloading optimum-1.24.0-py3-none-any.whl.metadata (21 kB)

Who can help?

Hello @michaelbenayoun,
I hope you are doing well. Exporting a Llama model with past key values to ONNX fails on Google Colab; could you please help with that?

The exported model cannot be verified successfully.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

Steps to reproduce:

! pip install optimum
! pip install onnx
! optimum-cli export onnx -m meta-llama/Llama-3.2-1B --task text-generation-with-past llama321b

Expected behavior

An exported and successfully verified ONNX model.

Exception: An error occured during validation, but the model was saved nonetheless at llama321b-optimun. Detailed error: Required inputs (['onnx::Gather_35']) are missing from input feed (['input_ids', 'attention_mask', 'position_ids', 'past_key_values.0.key', 'past_key_values.0.value', 'past_key_values.1.key', 'past_key_values.1.value', 'past_key_values.2.key', 'past_key_values.2.value', 'past_key_values.3.key', 'past_key_values.3.value', 'past_key_values.4.key', 'past_key_values.4.value', 'past_key_values.5.key', 'past_key_values.5.value', 'past_key_values.6.key', 'past_key_values.6.value', 'past_key_values.7.key', 'past_key_values.7.value', 'past_key_values.8.key', 'past_key_values.8.value', 'past_key_values.9.key', 'past_key_values.9.value', 'past_key_values.10.key', 'past_key_values.10.value', 'past_key_values.11.key', 'past_key_values.11.value', 'past_key_values.12.key', 'past_key_values.12.value', 'past_key_values.13.key', 'past_key_values.13.value', 'past_key_values.14.key', 'past_key_values.14.value', 'past_key_values.15.key', 'past_key_values.15.value'])..
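For reference, a minimal sketch for listing which runtime inputs the exported graph actually requires, which should surface the stray onnx::Gather_35 reported above. The path llama321b/model.onnx is an assumption about where optimum-cli wrote the decoder graph:

import onnx

# Initializers are weights, not runtime inputs, so exclude them from the check.
model = onnx.load("llama321b/model.onnx", load_external_data=False)
initializer_names = {init.name for init in model.graph.initializer}
runtime_inputs = [i.name for i in model.graph.input if i.name not in initializer_names]

# Expected: input_ids, attention_mask, position_ids and past_key_values.*;
# anything extra (e.g. onnx::Gather_35) is what the validator complains about.
print(runtime_inputs)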

@ahmedehabessa ahmedehabessa added the bug Something isn't working label Mar 2, 2025
@ahmedehabessa ahmedehabessa changed the title Failed to export Llama Failed to export Llama with past-key-values Mar 2, 2025
@ahmedehabessa ahmedehabessa changed the title Failed to export Llama with past-key-values Failed to export Llama with past-key-values to ONNX Mar 2, 2025
xenova (Contributor) commented Mar 5, 2025

Hi there 👋 This was fixed in #2191, so could you install optimum from source with

pip install --upgrade git+https://github.com/huggingface/optimum.git

?

cc @echarlaix I think we should put out a new release for this because I also ran into this error in google colab (link).
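After installing from source, a quick sanity check (using standard pip metadata, nothing optimum-specific) that the upgraded build is the one actually being imported; on Colab the runtime may need a restart first:

from importlib.metadata import version

# A git install typically reports a dev version rather than the 1.24.0 release.
print(version("optimum"))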

ahmedehabessa (Author) commented

Hello @xenova,

Thanks for your response, but it is still not working for me: the model is exported, but the exported model cannot be verified.
Would you please double check that?

The following warning was generated:

The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:

Is that normal?

xenova (Contributor) commented Mar 11, 2025

What are the differences in values? Anything around 1e-4 should be fine.

ahmedehabessa (Author) commented

myenv/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:731: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if sequence_length != 1:
Found different candidate ONNX initializers (likely duplicate) for the tied weights:
lm_head.weight: {'onnx::MatMul_5299'}
model.embed_tokens.weight: {'model.embed_tokens.weight'}
-[x] values not close enough, max diff: 0.0002803802490234375 (atol: 1e-05)
-[x] values not close enough, max diff: 3.4332275390625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.291534423828125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.76837158203125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.8650970458984375e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.3882598876953125e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 8.678436279296875e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 6.4849853515625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 5.435943603515625e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 3.337860107421875e-05 (atol: 1e-05)
-[x] values not close enough, max diff: 4.76837158203125e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:

  • logits: max diff = 0.0002803802490234375
  • present.4.key: max diff = 3.4332275390625e-05
  • present.6.key: max diff = 4.291534423828125e-05
  • present.7.key: max diff = 4.76837158203125e-05
  • present.9.key: max diff = 5.8650970458984375e-05
  • present.10.key: max diff = 5.3882598876953125e-05
  • present.11.key: max diff = 8.678436279296875e-05
  • present.12.key: max diff = 6.4849853515625e-05
  • present.13.key: max diff = 5.435943603515625e-05
  • present.14.key: max diff = 3.337860107421875e-05
  • present.15.key: max diff = 4.76837158203125e-05.
    The exported model was saved at: llama321b-optimum
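
All of these differences are in the 1e-5 to 1e-4 range, i.e. ordinary float32 rounding noise from the export, in line with the earlier comment that anything around 1e-4 should be fine. A minimal sketch, assuming the export was saved to llama321b and both models fit in memory, for checking the logits difference independently of the exporter's validation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "meta-llama/Llama-3.2-1B"
onnx_dir = "llama321b"  # directory produced by optimum-cli export onnx

tokenizer = AutoTokenizer.from_pretrained(model_id)
reference = AutoModelForCausalLM.from_pretrained(model_id)
exported = ORTModelForCausalLM.from_pretrained(onnx_dir)

inputs = tokenizer("The capital of France is", return_tensors="pt")

with torch.no_grad():
    ref_logits = reference(**inputs).logits
onnx_logits = exported(**inputs).logits

# Differences around 1e-4 are expected from operator-level float32 rounding.
print("max abs diff:", (ref_logits - onnx_logits).abs().max().item())

If the exporter's check still matters for your pipeline, the validation tolerance can also be relaxed via the --atol option of optimum-cli export onnx.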
