Commit

Fixing issue with pragma unroll in laswp
Signed-off-by: Luc Berger-Vergiat <lberge@sandia.gov>
lucbv committed Mar 6, 2025
1 parent 23efb6c commit 6713c36
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions batched/dense/impl/KokkosBatched_Laswp_Serial_Internal.hpp
@@ -97,10 +97,12 @@ struct SerialLaswpVectorBackwardInternal {
// On H100 with Cuda 12.0.0, the compiler seems to apply
// an aggressive optimization which crashes this function
// Disabling loop unrolling fixes the issue
#if defined(KOKKOS_ENABLE_PRAGMA_UNROLL)
#if defined(KOKKOS_ENABLE_CUDA) && defined(KOKKOS_ARCH_HOPPER90)
#if CUDA_VERSION >= 12000 && CUDA_VERSION < 12100
#pragma unroll 1
#endif
#endif
#endif
for (int i = (plen - 1); i >= 0; --i) {
const int piv = p[i * ps0];
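
The guard nesting in this hunk can be tried outside Kokkos with a small stand-alone sketch. The function name `apply_pivots_backward`, the `double` value type, and the swap body are assumptions made for illustration; only the preprocessor guards and the first two loop lines come from the diff. In a plain host build none of the macros are defined, so the pragma is dropped by the preprocessor and the loop compiles unchanged.

```cpp
// Hypothetical stand-alone analogue of the backward pivot loop in the diff.
// Walks the pivot array from last to first and swaps entry i of the strided
// vector A with entry p[i]. The swap body is assumed, not taken from the hunk.
inline void apply_pivots_backward(int plen, const int* p, int ps0,
                                  double* A, int as0) {
// Same guard nesting as the commit: the outer check makes sure the unroll
// pragma is only emitted when the build enables it, the inner checks restrict
// it to the Hopper + CUDA 12.0.x combination named in the source comment.
#if defined(KOKKOS_ENABLE_PRAGMA_UNROLL)
#if defined(KOKKOS_ENABLE_CUDA) && defined(KOKKOS_ARCH_HOPPER90)
#if CUDA_VERSION >= 12000 && CUDA_VERSION < 12100
#pragma unroll 1
#endif
#endif
#endif
  for (int i = plen - 1; i >= 0; --i) {
    const int piv = p[i * ps0];
    if (piv != i) {
      // Exchange the two entries selected by this pivot.
      const double tmp = A[i * as0];
      A[i * as0]       = A[piv * as0];
      A[piv * as0]     = tmp;
    }
  }
}
```

Because the innermost `#if` sits inside guards that are skipped when their macros are undefined, `CUDA_VERSION` is never evaluated in non-CUDA builds.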
