Add nonstable env DeviceSegmentedSort::Pairs 2/2 #8286
gonidelis wants to merge 3 commits into NVIDIA:main
🥳 CI Workflow Results: 🟩 Finished in 1h 33m: Pass: 100%/249 | Total: 3d 12h | Max: 1h 14m | Hits: 95%/160650
DECLARE_LAUNCH_WRAPPER(cub::DeviceSegmentedSort::SortPairs, sort_pairs);
DECLARE_LAUNCH_WRAPPER(cub::DeviceSegmentedSort::SortPairsDescending, sort_pairs_descending);

// %PARAM% TEST_LAUNCH lid 0:1:2
Important: we should keep this at lid 0:1 for now. The current lid_2 coverage here uses only 2 segments, so it stays below partitioning_threshold and exercises only the fallback path, which is graph-capturable. For larger segment counts, DeviceSegmentedSort still switches to a partitioning path, and that path still does a host MemcpyAsync followed by SyncStream(stream), so this test does not yet demonstrate general graph-capture support. The existing segmented sort tests in cub/test/catch2_test_device_segmented_sort_pairs.cu:10 and cub/test/catch2_test_device_segmented_sort_keys.cu:17 still keep lid_2 disabled for that reason, so this file should stay consistent with them unless we also prove that the partitioning path is capture-safe.
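To make the concern concrete, here is a toy model in plain C++ (no CUDA required) of why a 2-segment test can pass under graph capture while a larger input would not. Everything in it is hypothetical: `ToyStream`, `toy_segmented_sort`, `capture_succeeds`, the recorded operation names, and the exact comparison against the threshold are illustrative assumptions, not CUB's actual implementation. The threshold is a parameter because its real value is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a stream under graph capture: it records
// enqueued work and refuses to synchronize while capturing.
struct ToyStream {
  bool capturing = false;
  bool capture_invalidated = false;
  std::vector<std::string> recorded;

  void enqueue(const std::string& op) { recorded.push_back(op); }

  // Models the rule that a stream cannot be synchronized during capture.
  bool synchronize() {
    if (capturing) {
      capture_invalidated = true;
      return false;
    }
    return true;
  }
};

// Hypothetical two-path dispatch, following the review comment:
// few segments -> fallback path (async work only);
// many segments -> partitioning path (host copy, then a stream sync).
bool toy_segmented_sort(ToyStream& s, std::size_t num_segments,
                        std::size_t partitioning_threshold) {
  if (num_segments <= partitioning_threshold) {  // assumed comparison
    s.enqueue("fallback_sort");
    return true;
  }
  s.enqueue("partition_segments");
  s.enqueue("copy_partition_sizes_to_host");
  if (!s.synchronize()) {
    return false;  // the sync is rejected while the stream is capturing
  }
  s.enqueue("sort_partitions");
  return true;
}

// Runs the toy sort on a capturing stream and reports whether the
// capture survived.
bool capture_succeeds(std::size_t num_segments,
                      std::size_t partitioning_threshold) {
  ToyStream s;
  s.capturing = true;
  const bool ok = toy_segmented_sort(s, num_segments, partitioning_threshold);
  return ok && !s.capture_invalidated;
}

// Small self-check of the model with an arbitrary threshold of 100.
inline void toy_demo() {
  assert(capture_succeeds(2, 100));      // fallback path only
  assert(!capture_succeeds(1000, 100));  // partitioning path hits the sync
}
```

In this model a lid_2 run with 2 segments never reaches the synchronizing branch, so a passing test says nothing about the partitioning path. A capture test that is meant to show general support would need a segment count above the threshold.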
Split 2/2 from #8003 for easier review.