
[VitMatte] Cannot export VitMatte model to ONNX #1795

Closed · 2 of 4 tasks
SysDevHayes opened this issue Apr 1, 2024 · 3 comments · Fixed by huggingface/transformers#30065

Comments

SysDevHayes commented Apr 1, 2024

System Info

transformers 4.39.2
optimum 1.16.0.dev0 # installed from https://github.com/huggingface/optimum/tree/add-vitmatte
torch  2.1.1+cu118
onnxruntime-gpu 1.16.3

Who can help?

@NielsRogge @xenova

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Run command:
optimum-cli export onnx --model hustvl/vitmatte-base-distinctions-646 o --task image-matting
`AnnotionFormat` is deprecated and will be removed in v4.38. Please use `transformers.image_utils.AnnotationFormat` instead.
Framework not specified. Using pt to export to ONNX.
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.1.1+cu118
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:118: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if num_channels != self.num_channels:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:100: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  size = int(math.sqrt(num_position))
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:101: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if size * size != num_position:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:104: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if size != height or size != width:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:411: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_height > 0 or pad_width > 0:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  max_rel_dist = int(2 * max(q_size, k_size) - 1)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  max_rel_dist = int(2 * max(q_size, k_size) - 1)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:155: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if rel_pos.shape[0] != max_rel_dist:
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:167: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  q_coords = torch.arange(q_size)[:, None] * max(k_size / q_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:168: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  k_coords = torch.arange(k_size)[None, :] * max(q_size / k_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:169: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  relative_coords = (q_coords - k_coords) + (k_size - 1) * max(q_size / k_size, 1.0)
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:447: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if patch_height > height or patch_width > width:
Post-processing the exported models...
Weight deduplication check in the ONNX export requires accelerate. Please install accelerate to run it.
Validating ONNX model o/model.onnx...
	-[✓] ONNX model output names match reference model (alphas)
	- Validating ONNX Model output "alphas":
		-[✓] (2, 1, 64, 64) matches (2, 1, 64, 64)
		-[✓] all values close (atol: 1e-05)
The ONNX export succeeded and the exported model was saved at: o

This is exactly the same output as mentioned here. However, running the exported model gave the following errors:

File /venv3.10_onnx/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:220, in Session.run(self, output_names, input_feed, run_options)
    218     output_names = [output.name for output in self._outputs_meta]
    219 try:
--> 220     return self._sess.run(output_names, input_feed, run_options)
    221 except C.EPFail as err:
    222     if self._enable_fallback:

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Gather node. Name:'/backbone/encoder/layer.2/attention/Gather_4' Status Message: indices element out of data bounds, idx=59 must be within the inclusive range [-7,6]
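For context, the exported model was run with a plain onnxruntime session along the lines of the sketch below. This is a minimal sketch, not the exact script from the report: the input name pixel_values, the 4-channel image-plus-trimap layout, and the 512x512 test shape are assumptions based on the VitMatte preprocessor.

import numpy as np
import onnxruntime as ort

# Load the model exported by the optimum-cli command above.
session = ort.InferenceSession("o/model.onnx", providers=["CPUExecutionProvider"])

# VitMatte consumes an RGB image concatenated with a trimap, i.e. 4 channels.
# The input name and the 512x512 resolution are assumptions; the error above
# shows up when the spatial size differs from the one used during tracing.
pixel_values = np.random.rand(1, 4, 512, 512).astype(np.float32)

outputs = session.run(None, {"pixel_values": pixel_values})
alphas = outputs[0]  # predicted alpha matte
print(alphas.shape)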

As suggested by Xenova, I have to look for Python casts such as int(...) and float(...) and change them to .to(torch.int64) and .to(torch.float32). This means the integrated VitMatte code could be improved to make it ONNX exportable.

However, using the VS Code search function, I could not find any int(...) or float(...) cast directly related to VitMatte or VitDet. Can someone please point me to the right place where I should change the cast? Thank you so much!

Expected behavior

The exported ONNX model should be usable for inference, as shown here. Since this is not an issue with Optimum, please take a look and give me some guidance. Thank you so much!

@NielsRogge

cc @xenova

xenova (Contributor) commented Apr 1, 2024

As suggested in #1582 (comment), I have to look for Python casts such as int(...) and float(...) and change them to .to(torch.int64) and .to(torch.float32). This means the integrated VitMatte code could be improved to make it ONNX exportable.

However, using the VS Code search function, I could not find any int(...) or float(...) cast directly related to VitMatte or VitDet. Can someone please point me to the right place where I should change the cast? Thank you so much!

As you can see from the warnings, there are a few casts you have missed:

/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:100: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  size = int(math.sqrt(num_position))
...
/media/my_random_things/python3.10_environment/venv3.10_onnx/lib/python3.10/site-packages/transformers/models/vitdet/modeling_vitdet.py:153: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  max_rel_dist = int(2 * max(q_size, k_size) - 1)

Fixing these casts will fix the issue. cc @fxmarty what is the recommended way to do this?
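For illustration, the kind of change being discussed looks roughly like the sketch below. The helper names are hypothetical and this is not the actual patch that later landed in huggingface/transformers#30065; the point is simply to keep the computation in tensor ops with explicit dtype casts so that torch.jit.trace records the data flow instead of baking in a Python constant.

import torch

# Before: Python-level casts are evaluated eagerly during tracing, so their
# results become constants in the exported graph:
#   size = int(math.sqrt(num_position))
#   max_rel_dist = int(2 * max(q_size, k_size) - 1)

# After (sketch): express the same math with tensor ops and dtype casts.
def sqrt_as_int64(num_position: torch.Tensor) -> torch.Tensor:
    return torch.sqrt(num_position.to(torch.float32)).to(torch.int64)

def max_rel_dist_as_int64(q_size: torch.Tensor, k_size: torch.Tensor) -> torch.Tensor:
    return (2 * torch.maximum(q_size, k_size) - 1).to(torch.int64)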

fxmarty transferred this issue from huggingface/transformers on Apr 5, 2024
fxmarty (Contributor) commented Apr 5, 2024

@xenova Didn't you already fix this in #1582 (comment)? I can have a look though.
