Recover `warp_shuffle` original behavior (revert #8210) by fbusato · Pull Request #8254 · NVIDIA/cccl

fbusato · 2026-04-01T01:08:30Z

Description

#8210 enforces additional constraints to the type allowed in cuda::device::warp_shuffle_*, namely default contractible and trivially copyable.

While this is conceptually correct, the new constraints prevent using warp shuffle instructions with types that practically satisfy these property but where the type traits fail, e.g. __half, __nv_bfloat16, other reduced precision floating-point types, composition of them like array and structures. Enforcing such constraints prevent using them in many context, e.g. CUB.

This PR reverts the original behavior until we don't find a reliable way to prevent the problem.

jrhemstad · 2026-04-01T01:20:37Z

libcudacxx/include/cuda/__warp/warp_shuffle.h

 [[nodiscard]] _CCCL_DEVICE_API warp_shuffle_result<_Up> warp_shuffle_idx(
  const _Tp& __data, int __src_lane, uint32_t __lane_mask = 0xFFFFFFFF, ::cuda::std::integral_constant<int, _Width> = {})
 {
-  static_assert(::cuda::std::is_default_constructible_v<_Tp>, "_Tp must be default constructible");


question: Instead of wholesale removing these checks, should we just add explicit exceptions for known types? With the ability to allow people to proclaim types as valid for use with these APIs?

As @fbusato explained to me, the problem is that if there is a struct containing __half, it won't be trivially copyable.. However, we do the same think for cuda::std::bit_cast and noone has complained yet.

But I think we should keep at least the requirement on default constructibility.

I would second default_constructability, because that is a much clearer error message than what a C++ compiler generates 5 lines below

I'd recommend taking a look at what we did in cuCollections by offering a is_bitwise_comparable custom trait. By default, we use has_unique_object_representation<T>, but that is false for floating-point values due to NaNs. However, for the majority of use cases that doesn't matter, and so we allow an escape hatch of specializing is_bitwise_comparable to opt-in. We emit a helpful diagnostic when this situation arises pointing people towards specializing is_bitwise_comparable.

We could do something similar here.

https://github.com/NVIDIA/cuCollections/blob/6477be2182668015f9a91e3a0bb7e248eceecd09/include/cuco/utility/traits.hpp#L24-L60

the idea is a bit invasive but nice. The problem affects other warp instructions as well, so this solution applies to all of them. We can specialize the new type traits for reduced precision floating points + array.

I opened an RFE for the compiler nvbug 5497120 a while ago. We can rely on the proposed solution until we don't get an official workaround.

the funny aspect is that has_unique_object_representation<T> recognizes __half, __nv_bfloat16 as unique object representation, while this is not the case

github-actions · 2026-04-02T23:48:00Z

🥳 CI Workflow Results

🟩 Finished in 1h 49m: Pass: 100%/99 | Total: 2d 03h | Max: 1h 30m | Hits: 94%/257973

See results here.

recover warp_shuffle original behavior

1fb728d

fbusato added this to CCCL Apr 1, 2026

fbusato requested a review from a team as a code owner April 1, 2026 01:08

fbusato added the libcu++ For all items related to libcu++ label Apr 1, 2026

fbusato requested a review from davebayer April 1, 2026 01:08

github-project-automation bot moved this to Todo in CCCL Apr 1, 2026

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Apr 1, 2026

jrhemstad reviewed Apr 1, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

This was referenced Apr 1, 2026

cuda::is_trivially_copyable_relaxed #8265

Open

cuda::is_bitwise_comparable #8270

Open

revert is_default_constructible_v

47ccd21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Recover `warp_shuffle` original behavior (revert #8210)#8254

Recover `warp_shuffle` original behavior (revert #8210)#8254
fbusato wants to merge 2 commits intoNVIDIA:mainfrom
fbusato:revert-warp-shuffle-constraints

fbusato commented Apr 1, 2026

Uh oh!

jrhemstad Apr 1, 2026

Uh oh!

davebayer Apr 1, 2026

Uh oh!

miscco Apr 1, 2026

Uh oh!

jrhemstad Apr 1, 2026 •

edited

Loading

Uh oh!

fbusato Apr 1, 2026

Uh oh!

fbusato Apr 1, 2026

Uh oh!

This comment has been minimized.

github-actions bot commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

fbusato commented Apr 1, 2026

Description

Uh oh!

jrhemstad Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

davebayer Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

miscco Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

jrhemstad Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fbusato Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

fbusato Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

This comment has been minimized.

github-actions bot commented Apr 2, 2026

🥳 CI Workflow Results

🟩 Finished in 1h 49m: Pass: 100%/99 | Total: 2d 03h | Max: 1h 30m | Hits: 94%/257973

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jrhemstad Apr 1, 2026 •

edited

Loading