[onnx_importer.py] Fix dim_value None not correctly processed and missing Float8E4M3FNUZType. #4037

Open · wants to merge 1 commit into main
Conversation

giacs-epic (Contributor)

As per title. Changes tested on SHARK-TestSuite's alt_e2eshark.

@@ -707,13 +704,14 @@ def type_proto_to_type(self, tp: onnx.TypeProto) -> IrType:

    tt = tp.tensor_type
    if tt.elem_type:
        if not tt.shape:
giacs-epic (Contributor, Author)

This is never None (protobuf default-initializes), but rather an empty TensorShapeProto, which corresponds to a valid shape.
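
A minimal sketch of that distinction, assuming nothing beyond the onnx protobuf bindings (the variables are illustrative):

    import onnx

    tp = onnx.TypeProto()
    tt = tp.tensor_type

    # Reading a submessage never yields None: protobuf default-initializes,
    # so an unset shape comes back as an empty TensorShapeProto -- which is
    # also the legitimate encoding of a rank-0 (scalar) shape.
    assert tt.shape is not None
    assert len(tt.shape.dim) == 0

    # HasField distinguishes "explicitly set" from "default-initialized".
    assert not tt.HasField("shape")
    tt.shape.dim.add().dim_value = 4  # writing through marks the field set
    assert tt.HasField("shape")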

zjgarvey (Collaborator) left a comment

I'm also going to take a look at the CI failure and let you know what I figure out.

    temp_inferred_file = temp_dir / "inferred.onnx"
    onnx.save(raw_model, temp_raw_file, save_as_external_data=False)
zjgarvey (Collaborator)

This section occurs precisely when the provided model is large, so saving a temp file would be expensive. I'd prefer to only do this if the model had actually been modified, so perhaps add a bool to track whether previous arg specifications modified the model, and only re-save if so.

I'm also concerned about not saving external data in this case, since this is exactly when we would be exceeding the 2 GB protobuf limit.
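
A rough sketch of that suggestion (the function and the model_modified flag are hypothetical names, not the importer's actual API; onnx.save and onnx.shape_inference.infer_shapes_path are the real entry points):

    import onnx
    from pathlib import Path

    def infer_shapes_via_files(raw_model: onnx.ModelProto, original_path: Path,
                               temp_dir: Path, model_modified: bool) -> Path:
        # Shape inference goes through files here precisely because the model
        # is too large for the in-memory (2 GB) protobuf path.
        temp_inferred_file = temp_dir / "inferred.onnx"
        if model_modified:
            # Earlier arg specifications edited the in-memory model, so it
            # must be re-serialized; keep external data, since this branch is
            # taken exactly for >2 GB models.
            source = temp_dir / "raw.onnx"
            onnx.save(raw_model, str(source), save_as_external_data=True)
        else:
            # Unmodified: reuse the file already on disk and skip the
            # expensive serialization round-trip.
            source = original_path
        onnx.shape_inference.infer_shapes_path(str(source), str(temp_inferred_file))
        return temp_inferred_file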

zjgarvey (Collaborator)

Running

projects/pt1/tools/e2e_test.sh -c onnx -f ElementwiseFlattenBroadcastModule_basic -v

With this patch, it crashes with:

projects/pt1/tools/e2e_test.sh: line 13: 38611 Floating point exception(core dumped) python -m e2e_testing.main "$@"

IR dump:

module {
  func.func @main_graph(%arg0: !torch.vtensor<[?],f32>, %arg1: !torch.vtensor<[],f32>) -> !torch.vtensor<[?],f32> attributes {torch.onnx_meta.ir_version = 9 : si64, torch.onnx_meta.opset_version = 20 : si64, torch.onnx_meta.producer_name = "pytorch", torch.onnx_meta.producer_version = "2.7.0"} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.Shape"(%arg1) : (!torch.vtensor<[],f32>) -> !torch.vtensor<[?],si64> 
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__1> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__2> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %4 = torch.operator "onnx.Slice"(%0, %2, %3, %1) : (!torch.vtensor<[?],si64>, !torch.vtensor<[1],si64>, !torch.vtensor<[1],si64>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[?],si64> 
    %5 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__3> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %6 = torch.operator "onnx.Concat"(%4, %5) {torch.onnx.axis = 0 : si64} : (!torch.vtensor<[?],si64>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[1],si64> 
    %7 = torch.operator "onnx.Reshape"(%arg1, %6) {torch.onnx.allowzero = 0 : si64} : (!torch.vtensor<[],f32>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[?],f32> 
    %8 = torch.operator "onnx.Mul"(%arg0, %7) : (!torch.vtensor<[?],f32>, !torch.vtensor<[?],f32>) -> !torch.vtensor<[?],f32> 
    return %8 : !torch.vtensor<[?],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      _: "0x080000000000000000000000",
      __1: "0x080000000000000000000000",
      __2: "0x08000000FFFFFFFFFFFFFFFF",
      __3: "0x08000000FFFFFFFFFFFFFFFF"
    }
  }
#-}

On main (note the onnx.Shape and onnx.Slice results keep their static !torch.vtensor<[0],si64> type here, where the patch produced !torch.vtensor<[?],si64>):

module {
  func.func @main_graph(%arg0: !torch.vtensor<[?],f32>, %arg1: !torch.vtensor<[],f32>) -> !torch.vtensor<[?],f32> attributes {torch.onnx_meta.ir_version = 9 : si64, torch.onnx_meta.opset_version = 20 : si64, torch.onnx_meta.producer_name = "pytorch", torch.onnx_meta.producer_version = "2.7.0"} {
    %none = torch.constant.none
    %0 = torch.operator "onnx.Shape"(%arg1) : (!torch.vtensor<[],f32>) -> !torch.vtensor<[0],si64> 
    %1 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %2 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__1> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %3 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__2> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %4 = torch.operator "onnx.Slice"(%0, %2, %3, %1) : (!torch.vtensor<[0],si64>, !torch.vtensor<[1],si64>, !torch.vtensor<[1],si64>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[0],si64> 
    %5 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<__3> : tensor<1xsi64>} : () -> !torch.vtensor<[1],si64> 
    %6 = torch.operator "onnx.Concat"(%4, %5) {torch.onnx.axis = 0 : si64} : (!torch.vtensor<[0],si64>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[1],si64> 
    %7 = torch.operator "onnx.Reshape"(%arg1, %6) {torch.onnx.allowzero = 0 : si64} : (!torch.vtensor<[],f32>, !torch.vtensor<[1],si64>) -> !torch.vtensor<[?],f32> 
    %8 = torch.operator "onnx.Mul"(%arg0, %7) : (!torch.vtensor<[?],f32>, !torch.vtensor<[?],f32>) -> !torch.vtensor<[?],f32> 
    return %8 : !torch.vtensor<[?],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      _: "0x080000000000000000000000",
      __1: "0x080000000000000000000000",
      __2: "0x08000000FFFFFFFFFFFFFFFF",
      __3: "0x08000000FFFFFFFFFFFFFFFF"
    }
  }
#-}

I'm going to print out the onnx model and take a look at the shape information to see what is going on.
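
One way to do that inspection (the model path is hypothetical; the value_info name matches the node below):

    import onnx

    m = onnx.load("inferred.onnx")
    for vi in m.graph.value_info:
        if vi.name == "/Shape_output_0":
            print(vi)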

zjgarvey (Collaborator)

Looking at a printout of the value info for the first shape node:

  value_info {
    name: "/Shape_output_0"
    type {
      tensor_type {
        elem_type: 7
        shape {
          dim {
            dim_value: 0
          }
        }
      }
    }
  }

So it looks like the dim_value is actually 0 not because it is unspecified, but because the dimension really is zero.

I'm not sure I have a good idea of how to fix this.

Do you know if shape inference will fill the value info fields with dim_params for actual dynamic dims?

Or is there a Python protobuf API for checking whether a field is not explicitly set?
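
For reference: in onnx.proto, dim_value and dim_param live in a oneof named "value" inside TensorShapeProto.Dimension, so protobuf's WhichOneof can distinguish an unset dimension from an explicit 0. A minimal sketch:

    import onnx

    def dim_kind(dim: onnx.TensorShapeProto.Dimension) -> str:
        which = dim.WhichOneof("value")
        if which is None:
            return "unspecified"            # neither dim_value nor dim_param set
        if which == "dim_param":
            return "symbolic:" + dim.dim_param
        return "static:%d" % dim.dim_value  # includes a genuine 0

    d = onnx.TensorShapeProto.Dimension()
    print(dim_kind(d))  # -> unspecified
    d.dim_value = 0     # explicitly setting 0 marks the oneof as set
    print(dim_kind(d))  # -> static:0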

