Commit 284e104

Run thrust transform benchmarks with more elements (NVIDIA#2764)
1 parent 576500e commit 284e104

1 file changed, +4 -2 lines changed

thrust/benchmarks/bench/transform/basic.cu

@@ -107,8 +107,10 @@ constexpr auto startB = 2; // BabelStream: 0.2
 constexpr auto startC = 3; // BabelStream: 0.1
 constexpr auto startScalar = 4; // BabelStream: 0.4

-using element_types = nvbench::type_list<std::int8_t, std::int16_t, float, double, __int128>;
-auto array_size_powers = std::vector<std::int64_t>{25}; // BabelStream uses 2^25, H200 can fit 2^31
+using element_types = nvbench::type_list<std::int8_t, std::int16_t, float, double, __int128>;
+// Different benchmarks use a different number of buffers. H200/B200 can fit 2^31 elements for all benchmarks and types.
+// Upstream BabelStream uses 2^25. Allocation failure just skips the benchmark
+auto array_size_powers = std::vector<std::int64_t>{25, 31};

 template <typename T>
 static void mul(nvbench::state& state, nvbench::type_list<T>)
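
The comment added in this diff ties the new 2^31 entry to how much device memory each benchmark needs, which depends on the element type and the number of buffers. A rough sketch of that arithmetic is shown here. The helpers footprint_bytes and fits are hypothetical and not part of the Thrust benchmark, three buffers is an assumed worst case, and the capacity passed to fits is a placeholder rather than the specification of any particular GPU.

```cpp
#include <cstdint>

constexpr std::uint64_t GiB = std::uint64_t{1} << 30;
constexpr std::uint64_t MiB = std::uint64_t{1} << 20;

// Hypothetical helper: bytes needed for `num_buffers` arrays,
// each holding 2^power elements of `elem_size` bytes.
constexpr std::uint64_t footprint_bytes(int power, std::uint64_t elem_size, std::uint64_t num_buffers)
{
  return (std::uint64_t{1} << power) * elem_size * num_buffers;
}

// Hypothetical helper: true if that footprint is no larger than `device_bytes`.
// This only compares sizes; a real allocation can still fail for other reasons.
constexpr bool fits(int power, std::uint64_t elem_size, std::uint64_t num_buffers, std::uint64_t device_bytes)
{
  return footprint_bytes(power, elem_size, num_buffers) <= device_bytes;
}

// BabelStream-sized run: 2^25 doubles in three buffers need 768 MiB.
static_assert(footprint_bytes(25, sizeof(double), 3) == 768 * MiB, "2^25 doubles x 3 buffers");

// Largest case in this sketch: 2^31 elements of 16 bytes (the size of __int128)
// in three buffers need 96 GiB.
static_assert(footprint_bytes(31, 16, 3) == 96 * GiB, "2^31 16-byte elements x 3 buffers");

// The same size with 1-byte elements (std::int8_t) needs only 6 GiB.
static_assert(footprint_bytes(31, 1, 3) == 6 * GiB, "2^31 1-byte elements x 3 buffers");
```

Under these assumptions the 2^31 case spans roughly 6 GiB to 96 GiB depending on the element type, so it only runs on devices with enough memory. On smaller devices the allocation fails and, per the comment in the diff, the benchmark is skipped.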
