From 0443e79ce051577f83a0c0b6a82b5bf8a1b420f2 Mon Sep 17 00:00:00 2001
From: Kyle Gerard Felker
Date: Fri, 27 Sep 2024 22:16:18 -0500
Subject: [PATCH] Re-add 2x missing CMake IMPL_ flags to the Wiki (#585)

* Add ccmake recommendation

* Re-add Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC

* Re-add Kokkos_ENABLE_IMPL_HPX_ASYNC_DISPATCH

* Add warning about IMPL flags

* The --> Your

* Update docs/source/keywords.rst

Co-authored-by: Damien L-G

* Update docs/source/keywords.rst

Co-authored-by: Damien L-G

* Update docs/source/keywords.rst

Co-authored-by: Daniel Arndt

* Update docs/source/keywords.rst

Co-authored-by: Daniel Arndt

* Update docs/source/keywords.rst

Co-authored-by: Daniel Arndt

* Elevate note on IMPL keywords to the top

* Change link to Wiki, not github PR

* Fix link

* Improve changes

* typo

---------

Co-authored-by: Damien L-G
Co-authored-by: Daniel Arndt
Co-authored-by: Damien L-G
---
 docs/source/keywords.rst | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/source/keywords.rst b/docs/source/keywords.rst
index 96a1dbc4d..50843e592 100644
--- a/docs/source/keywords.rst
+++ b/docs/source/keywords.rst
@@ -9,6 +9,15 @@ CMake Keywords
 
 Recall that to set a keyword in CMake you used the syntax ``-Dkeyword_name=value``.
 
+.. note::
+   The ``ccmake`` graphical user interface offers a convenient way to explore
+   available CMake options and their current values. It may be more up to date
+   with the Kokkos version that you are using.
+   **A word of warning:** variables with names containing ``IMPL`` are private
+   implementation details. Avoid modifying these unless you have a deep
+   understanding of their implications and are aware that they might change
+   without notice.
+
 This page is organized in four sections:
 
 
@@ -206,6 +215,13 @@ Backend-specific options
       * Use unified memory (UM) by default for CUDA
       * ``OFF``
 
+    * * ``Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC``
+      * Use ``cudaMallocAsync`` (requires CUDA Toolkit version 11.2 or higher). This
+        optimization may improve performance in applications with multiple CUDA streams per device, but it
+        is known to be incompatible with MPI distributions built on older versions of UCX
+        and many Cray MPICH instances. See `known issues `_.
+      * (see below)
+
     * * ``Kokkos_ENABLE_HIP_MULTIPLE_KERNEL_INSTANTIATIONS``
       * Instantiate multiple kernels at compile time - improve performance but increase compile time
       * ``OFF``
@@ -217,10 +233,18 @@ Backend-specific options
     * * ``Kokkos_ENABLE_ATOMICS_BYPASS``
       * Disable atomics when no host parallel nor device backend is enabled for Serial only builds (since Kokkos 4.3)
       * ``OFF``
-    
+
+    * * ``Kokkos_ENABLE_IMPL_HPX_ASYNC_DISPATCH``
+      * Enable asynchronous dispatch for the HPX backend
+      * ``ON``
+
 ``Kokkos_ENABLE_CUDA_LAMBDA`` default value is ``OFF`` until 3.7 and ``ON`` since 4.0
 
+``Kokkos_ENABLE_IMPL_CUDA_MALLOC_ASYNC`` default value is ``OFF`` except in 4.2, 4.3, and 4.4
+
+
+
 Development
 -----------
 These are intended for developers of Kokkos. If you are a user, you probably