Enable OCP FP8 for Latest Archs #111

ScXfjiang · 2025-02-19T14:33:43Z

No description provided.

draganmladjenovic · 2025-03-10T01:13:00Z

xla/stream_executor/rocm/hip_blas_lt.cc

+    }
+
+    // hipBlasLt requires setting the a/b scale pointer (even a dummy one),
+    // otherwise no algorithms can be found for "a/b scaling". This is to be


Not true. What about d_scale and c_scale?

draganmladjenovic · 2025-03-10T01:16:09Z

xla/service/gpu/buffer_comparator.cu.cc

  float elem_a =
      __half2float(__nv_cvt_fp8_to_halfraw(buffer_a[idx], __NV_E4M3));
  float elem_b =
      __half2float(__nv_cvt_fp8_to_halfraw(buffer_b[idx], __NV_E4M3));
+#else  // TENSORFLOW_USE_ROCM && TF_ROCM_VERSION >= 60300
+  float elem_a =
+      __half2float(__hip_cvt_fp8_to_halfraw(buffer_a[idx], __HIP_E4M3));


Don't we have an API that converts directly to float?
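
For context on what the conversion in this hunk has to produce, here is a minimal host-side sketch of decoding one OCP E4M3 byte to `float`. It is not the HIP or CUDA API and is not code from this PR. The function name `DecodeOcpE4M3` is hypothetical, and the sketch assumes the OCP E4M3 layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, NaN encoded as all exponent and mantissa bits set).

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, for illustration only: decodes one byte in the assumed
// OCP FP8 E4M3 layout (1 sign, 4 exponent, 3 mantissa bits, exponent bias 7).
float DecodeOcpE4M3(uint8_t bits) {
  const float sign = (bits & 0x80) ? -1.0f : 1.0f;
  const int exponent = (bits >> 3) & 0xF;
  const int mantissa = bits & 0x7;

  // E4M3 has no infinities; S.1111.111 is the only NaN pattern.
  if (exponent == 0xF && mantissa == 0x7) {
    return std::numeric_limits<float>::quiet_NaN();
  }
  // Subnormal: (mantissa / 8) * 2^-6 == mantissa * 2^-9.
  if (exponent == 0) {
    return sign * std::ldexp(static_cast<float>(mantissa), -9);
  }
  // Normal: (1 + mantissa / 8) * 2^(exponent - 7).
  return sign * std::ldexp(1.0f + static_cast<float>(mantissa) / 8.0f,
                           exponent - 7);
}
```

Under these assumptions, `0x38` decodes to 1.0 and `0x7E` to 448, the largest finite E4M3 value. A comparator that goes through half precision first loses nothing here, since every E4M3 value is exactly representable in half.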

ScXfjiang force-pushed the rocm-jaxlib-v0.4.35-qa_enable_ocp_fp8_in_gemm_rewriter branch from 93c9277 to 4d3c1f5 Compare February 20, 2025 19:22

ScXfjiang marked this pull request as draft February 20, 2025 19:33

ScXfjiang force-pushed the rocm-jaxlib-v0.4.35-qa_enable_ocp_fp8_in_gemm_rewriter branch 6 times, most recently from c6320d5 to 61a5683 Compare February 25, 2025 17:47

ScXfjiang force-pushed the rocm-jaxlib-v0.4.35-qa_enable_ocp_fp8_in_gemm_rewriter branch 2 times, most recently from 538aed9 to d20c818 Compare March 5, 2025 19:11

ScXfjiang marked this pull request as ready for review March 5, 2025 23:25

ScXfjiang force-pushed the rocm-jaxlib-v0.4.35-qa_enable_ocp_fp8_in_gemm_rewriter branch from 646ad3f to a3a64ca Compare March 7, 2025 00:31

enable ocp fp8 for latest amd archs in gemm rewriter

187ca2e

ScXfjiang force-pushed the rocm-jaxlib-v0.4.35-qa_enable_ocp_fp8_in_gemm_rewriter branch from 9ed9fef to 187ca2e Compare March 7, 2025 15:07

draganmladjenovic reviewed Mar 10, 2025
