@connortsui20
Contributor

@connortsui20 connortsui20 commented Dec 12, 2025

Brings back the portable_simd take implementation. Instead of constraining by NativePType, it is bounded by T: Copy and casts to one of u8 through u64 depending on the size of the type.
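As a rough illustration of the size-based cast, here is a minimal scalar sketch in stable Rust. It is not the code in this PR: the names `take_by_width`, `take_as_width`, and `same_layout` are hypothetical, the SIMD gather is replaced by a plain loop, and it assumes that moving the raw bytes of a `T: Copy` through a same-sized, same-aligned unsigned integer word is an acceptable way to copy it. Types that do not match any of the four widths fall back to an ordinary indexed copy.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Hypothetical check: can `T` be moved around as raw words of type `U`?
fn same_layout<T, U>() -> bool {
    size_of::<T>() == size_of::<U>() && align_of::<T>() == align_of::<U>()
}

/// Hypothetical helper: copies `values[indices[i]]` as opaque `U`-sized words.
fn take_as_width<T: Copy, U: Copy>(values: &[T], indices: &[usize]) -> Vec<T> {
    assert!(same_layout::<T, U>());

    // SAFETY: `T` and `U` have the same size and alignment, and
    // `MaybeUninit<U>` places no validity requirement on the bytes it holds.
    let words: &[MaybeUninit<U>] = unsafe {
        std::slice::from_raw_parts(values.as_ptr().cast::<MaybeUninit<U>>(), values.len())
    };

    let mut out: Vec<T> = Vec::with_capacity(indices.len());
    let dst = out.as_mut_ptr().cast::<MaybeUninit<U>>();
    for (i, &idx) in indices.iter().enumerate() {
        // Ordinary bounds-checked indexing: panics on an out-of-bounds index.
        let word = words[idx];
        // SAFETY: `i < indices.len()`, which is within the reserved capacity.
        unsafe { dst.add(i).write(word) };
    }
    // SAFETY: slots `0..indices.len()` each hold the bytes of a valid `T`.
    unsafe { out.set_len(indices.len()) };
    out
}

/// Dispatches on the layout of `T`, so any `T: Copy` of width 1, 2, 4, or 8
/// bytes shares one of four word-sized code paths.
pub fn take_by_width<T: Copy>(values: &[T], indices: &[usize]) -> Vec<T> {
    if same_layout::<T, u8>() {
        take_as_width::<T, u8>(values, indices)
    } else if same_layout::<T, u16>() {
        take_as_width::<T, u16>(values, indices)
    } else if same_layout::<T, u32>() {
        take_as_width::<T, u32>(values, indices)
    } else if same_layout::<T, u64>() {
        take_as_width::<T, u64>(values, indices)
    } else {
        // Any other layout: plain scalar take.
        indices.iter().map(|&i| values[i]).collect()
    }
}
```

With this shape, `take_by_width(&[1.5f32, 2.5, 3.5], &[2, 0])` goes through the `u32` path, while a `[u8; 3]` element type takes the fallback branch.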

This also adds a check for out-of-bounds indices. The check adds a single SIMD comparison and one bitwise instruction to the hot loop, so that we correctly panic at the end if any index was out of bounds.
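The deferred check can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `std::simd` needs a nightly toolchain, so the lanes are emulated with fixed-size arrays on stable Rust. The function name, the `LANES` value, and the choice to redirect out-of-bounds lanes to index 0 are all hypothetical. The per-lane `<` stands in for `simd_lt`, and the `&=` folds each comparison into a running validity mask that is inspected only once, after the loop.

```rust
/// Hypothetical lane count; a real SIMD kernel would pick this per target.
const LANES: usize = 8;

/// Takes `values[indices[i]]`, deferring the out-of-bounds panic to the end.
pub fn take_with_deferred_check(values: &[u32], indices: &[u32]) -> Vec<u32> {
    if indices.is_empty() {
        return Vec::new();
    }
    // With no values every index is out of bounds, so fail up front.
    assert!(!values.is_empty(), "take index out of bounds: values is empty");

    let len = values.len();
    let mut out = Vec::with_capacity(indices.len());
    // Running per-lane validity mask, updated with `&=` on every iteration.
    let mut all_valid = [true; LANES];

    let mut chunks = indices.chunks_exact(LANES);
    for chunk in &mut chunks {
        for lane in 0..LANES {
            // Lane-wise comparison (stand-in for `simd_lt`).
            let in_bounds = (chunk[lane] as usize) < len;
            // The single bitwise op that accumulates the check.
            all_valid[lane] &= in_bounds;
            // Masked gather: out-of-bounds lanes read index 0 instead, so the
            // load itself is always safe and the loop never branches to panic.
            let idx = if in_bounds { chunk[lane] as usize } else { 0 };
            out.push(values[idx]);
        }
    }

    // Scalar tail: ordinary bounds-checked indexing.
    for &i in chunks.remainder() {
        out.push(values[i as usize]);
    }

    // One check at the very end reports any out-of-bounds index seen above.
    assert!(all_valid.iter().all(|&ok| ok), "take index out of bounds");
    out
}
```

The trade-off is that an invalid index is only reported after the whole input has been processed, in exchange for a hot loop without a panic branch.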

It might be better to separate the &= from the simd_lt so that the gather does not depend on the &=. On the other hand, if register pressure is high, we do not want to evict any data from the SIMD registers. I should probably benchmark that...

@connortsui20 connortsui20 requested review from a10y and gatesn December 12, 2025 21:57
@connortsui20 connortsui20 added the performance label (Release label indicating an improvement to performance) Dec 12, 2025
@connortsui20 connortsui20 marked this pull request as draft December 12, 2025 22:04
@codecov

codecov bot commented Dec 12, 2025

Codecov Report

❌ Patch coverage is 54.32099% with 37 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.68%. Comparing base (9a76489) to head (078ca7c).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| vortex-compute/src/take/slice/asm_stubs.rs | 0.00% | 18 Missing ⚠️ |
| vortex-compute/src/take/slice/portable.rs | 44.82% | 16 Missing ⚠️ |
| vortex-compute/src/take/slice/mod.rs | 0.00% | 2 Missing ⚠️ |
| vortex-compute/src/take/slice/avx2.rs | 96.87% | 1 Missing ⚠️ |


@connortsui20 connortsui20 changed the title from "Perf: bring portable simd back and generalize by Copy" to "Perf: bring SIMD take back and generalize by Copy" Dec 12, 2025
@codspeed-hq

codspeed-hq bot commented Dec 12, 2025

CodSpeed Performance Report

Merging #5722 will improve performances by 22.72%

Comparing ct/simd-portable-take (078ca7c) with develop (9a76489)

Summary

⚡ 20 improvements
✅ 1236 untouched
⏩ 621 skipped (see footnote 1)

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| pvector_take_zipfian[16, 1000] | 10.4 µs | 9.3 µs | +11.81% |
| pvector_take_zipfian[256, 100000] | 674.8 µs | 550 µs | +22.7% |
| pvector_take_zipfian[2048, 10000] | 74.6 µs | 62.2 µs | +19.88% |
| pvector_take_zipfian[2048, 100000] | 678.3 µs | 553.5 µs | +22.56% |
| pvector_take_zipfian[8192, 10000] | 85.8 µs | 73.4 µs | +16.82% |
| pvector_take_zipfian[256, 10000] | 71.3 µs | 59 µs | +20.92% |
| pvector_take_zipfian[256, 1000] | 10.8 µs | 9.7 µs | +11.33% |
| pvector_take_uniform[16, 100000] | 674.4 µs | 549.6 µs | +22.71% |
| pvector_take_uniform[16, 10000] | 70.5 µs | 58.2 µs | +21.21% |
| pvector_take_zipfian[8192, 100000] | 700.7 µs | 575.8 µs | +21.68% |
| pvector_take_uniform[2048, 100000] | 678.1 µs | 553.3 µs | +22.56% |
| pvector_take_uniform[256, 100000] | 674.9 µs | 550 µs | +22.69% |
| pvector_take_uniform[2048, 10000] | 74.9 µs | 62.6 µs | +19.67% |
| pvector_take_zipfian[16, 100000] | 674.4 µs | 549.5 µs | +22.72% |
| pvector_take_uniform[16, 1000] | 11.1 µs | 10.1 µs | +10.21% |
| pvector_take_uniform[256, 10000] | 71.3 µs | 59 µs | +20.93% |
| pvector_take_uniform[256, 1000] | 11.6 µs | 10.5 µs | +10.46% |
| pvector_take_zipfian[16, 10000] | 70.9 µs | 58.5 µs | +21.08% |
| pvector_take_uniform[8192, 10000] | 87.9 µs | 75.6 µs | +16.27% |
| pvector_take_uniform[8192, 100000] | 717.4 µs | 592.9 µs | +21.01% |

Footnotes

  1. 621 benchmarks were skipped, so the baseline results were used instead.

@connortsui20
Contributor Author

It might be the case that the portable SIMD implementation is actually faster than the AVX2 implementation. I think we need to run some more targeted benchmarks...

@connortsui20 connortsui20 force-pushed the ct/simd-portable-take branch 3 times, most recently from cc8671c to 5312d71 December 13, 2025 17:49
add OOB check + safety comments

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>