Unify accumulator type handling across CUB/Thrust #3993

Open
Tracked by #101
bernhardmgruber opened this issue Mar 3, 2025 · 3 comments
@bernhardmgruber
Contributor

bernhardmgruber commented Mar 3, 2025

I vaguely remember that not every reduce or scan algorithm uses ::cuda::std::__accumulator_t to determine the accumulator type to use. We should consolidate this behavior.

@bernhardmgruber
Contributor Author

@gevtushenko said there is divergence between CUB and Thrust. CUB uses __accumulator_t, whereas Thrust uses the type of the initial value.
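To make the divergence concrete, here is a minimal host-side sketch in standard C++17. `accumulator_sketch_t` and `reduce_as` are hypothetical names invented for illustration. The sketch assumes the trait-based approach amounts to "the decayed result of invoking the operator on (init, input)", which is a guess at the general idea and not a copy of the libcu++ definition.

```cpp
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a trait-based accumulator type (assumption):
// the decayed result of invoking the operator on (init, input).
template <class Op, class InputT, class InitT = InputT>
using accumulator_sketch_t = std::decay_t<std::invoke_result_t<Op, InitT, InputT>>;

// Sequential reduction with an explicitly chosen accumulator type.
template <class AccT, class It, class InitT, class Op>
AccT reduce_as(It first, It last, InitT init, Op op)
{
  AccT acc = static_cast<AccT>(init);
  for (; first != last; ++first)
  {
    // Every partial result is converted back to AccT.
    acc = static_cast<AccT>(op(acc, *first));
  }
  return acc;
}

// float input with an int initial value.
using invoke_acc_t = accumulator_sketch_t<std::plus<>, float, int>;
static_assert(std::is_same_v<invoke_acc_t, float>, "int + float yields float");

// Accumulator derived from the operator's result type: no truncation.
inline float sum_with_invoke_result_acc()
{
  const std::vector<float> in{0.5f, 0.5f, 0.5f, 0.5f};
  return reduce_as<invoke_acc_t>(in.begin(), in.end(), 0, std::plus<>{});
}

// Accumulator taken from the initial value type: each step truncates to int.
inline int sum_with_init_type_acc()
{
  const std::vector<float> in{0.5f, 0.5f, 0.5f, 0.5f};
  return reduce_as<int>(in.begin(), in.end(), 0, std::plus<>{});
}
```

With the same input and the same `int` initial value, the first function returns `2.0f` and the second returns `0`, so the choice of accumulator type is observable to users.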

@bernhardmgruber
Contributor Author

I had another look and realized that the C++ standard seems to define the accumulator type as either the iterator value type or the initial value type, depending on the algorithm. So it seems the divergence between CUB and Thrust is fine.
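A small sketch of how this looks with the standard host algorithms, as far as I understand them: `std::accumulate` accumulates in the type of the initial value, while `std::inclusive_scan` without an initial value accumulates in the iterator value type. The helper function names are made up for this example.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// std::accumulate: the accumulator has the type of the initial value.
inline int accumulate_with_int_init()
{
  const std::vector<double> in{0.5, 0.5, 0.5, 0.5};
  return std::accumulate(in.begin(), in.end(), 0); // int accumulator, truncates each step
}

inline double accumulate_with_double_init()
{
  const std::vector<double> in{0.5, 0.5, 0.5, 0.5};
  return std::accumulate(in.begin(), in.end(), 0.0); // double accumulator
}

// std::inclusive_scan without init: the accumulator is the iterator value type
// (std::uint8_t here), so 200 + 100 wraps around even though the output is int.
inline std::vector<int> scan_without_init()
{
  const std::vector<std::uint8_t> in{200, 100};
  std::vector<int> out(in.size());
  std::inclusive_scan(in.begin(), in.end(), out.begin(), std::plus<>{});
  return out;
}

// std::inclusive_scan with an int init: the accumulator is int, no wrap-around.
inline std::vector<int> scan_with_int_init()
{
  const std::vector<std::uint8_t> in{200, 100};
  std::vector<int> out(in.size());
  std::inclusive_scan(in.begin(), in.end(), out.begin(), std::plus<>{}, 0);
  return out;
}
```

The two `accumulate` calls give `0` and `2.0`, and the two scans give `{200, 44}` and `{200, 300}`. In both pairs only the accumulator type differs.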

@fbusato
Contributor

fbusato commented Mar 5, 2025

This is also relevant to SIMD reduction. Using cuda::std::plus<> vs. cuda::std::plus<T> could affect performance. For example, cuda::std::plus<> applied to int16_t operands induces implicit promotion to int, which disables SIMD.
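The promotion can be checked at compile time. This sketch uses `std::plus` on the host and assumes that `cuda::std::plus` has the same result types. It also assumes a platform where `int` is wider than `int16_t`. The alias and function names are illustrative only.

```cpp
#include <cstdint>
#include <functional>
#include <type_traits>

// Transparent functor: the operands undergo integral promotion, so the result is int.
using transparent_result_t = decltype(std::plus<>{}(std::int16_t{}, std::int16_t{}));

// Typed functor: the result is converted back to the element type.
using typed_result_t = decltype(std::plus<std::int16_t>{}(std::int16_t{}, std::int16_t{}));

static_assert(std::is_same_v<transparent_result_t, int>, "plus<> promotes int16_t to int");
static_assert(std::is_same_v<typed_result_t, std::int16_t>, "plus<int16_t> stays 16-bit");

// Runtime-queryable helpers mirroring the static checks above.
constexpr bool transparent_plus_widens()
{
  return sizeof(transparent_result_t) > sizeof(std::int16_t);
}

constexpr bool typed_plus_keeps_width()
{
  return sizeof(typed_result_t) == sizeof(std::int16_t);
}
```

If the accumulator type is derived from the operator's result type, the transparent functor therefore yields a 32-bit accumulator for 16-bit inputs, while the typed functor keeps a 16-bit one.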
