Commit 0d00318
feat(simd): Phase 1 — explicit cargo configs + AVX2 dispatch hardening
Implements Phase 1 of the integration plan in `.claude/knowledge/
simd-dispatch-architecture.md` (PR #171).
Changes
-------
1. `.cargo/config.toml` — set `target-cpu = "x86-64-v3"` for x86_64.
Previously the file declared "no global target-cpu", which compiled
binaries for generic x86-64 (SSE2 baseline). `simd_avx2::F32x16` and
friends wrap `__m256` / `__m256i` intrinsics, which that baseline does
not enable, producing the SIGILL failure mode seen in CI on PR #170
(38 tests timing out uniformly at ~19s in `simd_avx2::*` /
`simd_ops::*` / `simd_soa::*`).
2. `.cargo/config-avx512.toml` (new) — explicit `x86-64-v4` for AVX-512
builds. Triggered by `cargo --config .cargo/config-avx512.toml`.
3. `.cargo/config-native.toml` (new) — `target-cpu = "native"` for
build-host-tuned binaries (developer machines). Non-portable.
4. `src/simd.rs` — tighten the AVX2 dispatch arm predicate from
`not(target_feature = "avx512f")` to
`all(target_feature = "avx2", not(target_feature = "avx512f"))`.
Belts-and-braces: under v3 the predicates are equivalent, but the
explicit `avx2` requirement means a future "build me without v3"
invocation lands on a compile error rather than a SIGILL at run
time. Stale "target-cpu=x86-64-v4 → AVX-512" comment refreshed to
describe the new three-config dispatch model.
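
To make the predicate change in item 4 concrete, here is a minimal sketch
of the old and tightened AVX2 arm predicates side by side. `Backend`,
`old_avx2_arm`, `new_avx2_arm` and `compiled_backend` are hypothetical
names for illustration, not the identifiers used in `src/simd.rs`, and the
sketch reports the unmatched case as a value (rather than a compile error)
so that it builds on any host.

```rust
/// Which SIMD arm the build-time predicates select.
/// Hypothetical enum, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Avx512,
    Avx2,
    /// Neither arm matched. The commit message describes this case as a
    /// compile error; the sketch returns a value so it builds everywhere.
    Unmatched,
}

/// Old AVX2 arm predicate: true for any build without AVX-512F,
/// including generic SSE2-only builds.
pub const fn old_avx2_arm() -> bool {
    cfg!(not(target_feature = "avx512f"))
}

/// Tightened AVX2 arm predicate: additionally requires AVX2 at build time.
pub const fn new_avx2_arm() -> bool {
    cfg!(all(target_feature = "avx2", not(target_feature = "avx512f")))
}

/// Resolve the arm the current build configuration would pick.
pub const fn compiled_backend() -> Backend {
    if cfg!(target_feature = "avx512f") {
        Backend::Avx512
    } else if new_avx2_arm() {
        Backend::Avx2
    } else {
        Backend::Unmatched
    }
}
```

Under the default `x86-64-v3` config, `compiled_backend()` should resolve
to `Backend::Avx2`. A build with no `target-cpu` set resolves to
`Backend::Unmatched`, which is the case the old predicate silently routed
into the AVX2 arm.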
Out of scope for this PR
------------------------
The architecture doc (PR #171) claimed Phase 1 also needed to "add
~10 missing AVX2 two-half wrappers". On survey those wrappers already
exist in `src/simd_avx2.rs`:
- `F32x16` / `F64x8` — true two-half AVX wrappers
- `U8x32` — native AVX2 `__m256i`
- `U8x64` / `I8x64` / `I16x32` / `I32x16` / `I64x8` / `U16x32` /
`U32x16` / `U64x8` — scalar polyfill via the
`avx2_int_type!` macro (storage =
`[$elem; $lanes]` align 64).
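
For readers unfamiliar with the scalar-polyfill pattern, a rough sketch of
what such a macro might look like follows. `scalar_polyfill_type!` is a
hypothetical stand-in for `avx2_int_type!`. Only the storage shape
(`[$elem; $lanes]`, 64-byte alignment) comes from the description above;
the methods shown are assumptions, and the real macro in
`src/simd_avx2.rs` may differ.

```rust
// Hypothetical stand-in for `avx2_int_type!`: a fixed-size array with
// 64-byte alignment and lane-wise operations written as scalar loops.
macro_rules! scalar_polyfill_type {
    ($name:ident, $elem:ty, $lanes:expr) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        #[repr(C, align(64))]
        pub struct $name(pub [$elem; $lanes]);

        impl $name {
            pub const LANES: usize = $lanes;

            /// Broadcast one value to every lane.
            pub fn splat(v: $elem) -> Self {
                Self([v; $lanes])
            }

            /// Lane-wise wrapping addition, one scalar op per lane.
            /// No intrinsics are used, so nothing here is explicitly vectorized.
            pub fn wrapping_add(self, rhs: Self) -> Self {
                let mut out = self.0;
                for (o, r) in out.iter_mut().zip(rhs.0.iter()) {
                    *o = o.wrapping_add(*r);
                }
                Self(out)
            }
        }
    };
}

scalar_polyfill_type!(I32x16, i32, 16);
scalar_polyfill_type!(U8x64, u8, 64);
```

Because the lane loop is plain scalar code, any vectorization is left to
the compiler's auto-vectorizer, which is the parity gap noted below.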
The matrix in the architecture doc will be corrected as a follow-up.
The parity gap that does exist (scalar-polyfill ints are not vectorized
under AVX2) is its own piece of tech debt, tracked separately.

1 parent 207fc20
4 files changed
Lines changed: 73 additions & 9 deletions