Commit e3ad707

fix(simd): revert avx2 predicate tightening — env RUSTFLAGS overrides v3 config
The previous commit tightened the x86_64 dispatch arm to `target_feature = "avx2"` plus `not(target_feature = "avx512f")`. The intent was to make "x86-64 baseline + AVX2 wrappers" a compile error rather than a SIGILL.

CI disagreed: `.github/workflows/ci.yaml` sets a global `RUSTFLAGS="-D warnings"` env var, which replaces the rustflags from `.cargo/config.toml` entirely (cargo doesn't merge env and config rustflags; env wins). So in CI the v3 baseline never takes effect, the build targets generic x86-64 (SSE2), `target_feature = "avx2"` is not set, and the tightened arm leaves no matching dispatch path, so consumer references to `crate::simd::F32x16` fail to compile.

The pre-existing wider `not(avx512f)` predicate works at x86-64 baseline for two reasons. First, the inner intrinsics in `simd_avx2.rs` use per-function `#[target_feature(enable = "avx,avx2,fma")]` annotations, so the operations gate themselves at the symbol level. Second, struct fields like `__m256` / `__m256i` are `core::arch` type declarations that don't require AVX/AVX2 at the type level, only at execution.

This commit reverts the predicate. The cargo configs added in the previous commit stay; they're the documented opt-in affordances. A local `cargo build` without an env override gets v3, CI runs at baseline with per-function `target_feature`, and AVX-512 is an explicit opt-in via `--config .cargo/config-avx512.toml`.
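The per-function gating described above can be sketched as follows. This is a minimal illustration, not code from `simd_avx2.rs`: the names `add8_scalar`, `add8_avx`, `add8`, and `AVX2_IN_BASELINE` are hypothetical, and the run-time `is_x86_feature_detected!` guard is an assumption, since the commit message does not say how callers ensure the CPU actually supports AVX2.

```rust
/// True only when the build baseline itself enables AVX2 (e.g. `x86-64-v3`).
/// Under a `RUSTFLAGS` env override that drops the v3 flags this is `false`.
pub const AVX2_IN_BASELINE: bool = cfg!(target_feature = "avx2");

/// Portable fallback, compiled for whatever baseline the build uses.
pub fn add8_scalar(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    let mut out = [0.0f32; 8];
    for i in 0..8 {
        out[i] = a[i] + b[i];
    }
    out
}

/// Only this function body is compiled with AVX/AVX2/FMA enabled; the rest
/// of the crate stays at the build baseline.
///
/// # Safety
/// The caller must ensure the CPU supports the enabled features.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,avx2,fma")]
pub unsafe fn add8_avx(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    use core::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};
    let mut out = [0.0f32; 8];
    // SAFETY: both inputs and `out` hold exactly 8 `f32` values, and the
    // unaligned load/store intrinsics have no alignment requirement.
    unsafe {
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
    }
    out
}

/// Hypothetical dispatcher: assumed run-time check, then the gated function.
pub fn add8(a: &[f32; 8], b: &[f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx")
            && is_x86_feature_detected!("avx2")
            && is_x86_feature_detected!("fma")
        {
            // SAFETY: the required CPU features were verified just above.
            return unsafe { add8_avx(a, b) };
        }
    }
    add8_scalar(a, b)
}
```

The sketch compiles at the generic x86-64 baseline because the AVX instructions are confined to `add8_avx`. Without a guard such as the one in `add8`, calling the gated function on a CPU that lacks AVX2 would still fault at execution time.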
1 parent 0d00318 commit e3ad707

1 file changed

Lines changed: 13 additions & 8 deletions

src/simd.rs

@@ -279,16 +279,21 @@ pub use crate::simd_avx512::{f32_to_bf16_batch_rne, f32_to_bf16_scalar_rne};
 #[cfg(all(target_arch = "x86_64", target_feature = "avx512bf16"))]
 pub use crate::simd_avx512::{BF16x16, BF16x8};

-// AVX2 baseline arm — selected by the `x86-64-v3` cargo default. Requires
-// `target_feature = "avx2"` explicitly: building x86_64-without-AVX2 (the
-// generic `x86-64` baseline = SSE2) would otherwise pick this arm and
-// then SIGILL on the `__m256` / `__m256i` intrinsics inside the wrappers.
-// Whoever wants no-AVX2 must pick the scalar fallback path (currently
-// non-x86 only — see TD-SIMD-7 in the architecture doc).
-#[cfg(all(target_arch = "x86_64", target_feature = "avx2", not(target_feature = "avx512f")))]
+// AVX2 baseline arm — selected by the `x86-64-v3` cargo default. The
+// predicate is `not(avx512f)` rather than `avx2 + not(avx512f)`: the
+// inner intrinsics in `simd_avx2.rs` use per-function `#[target_feature
+// (enable = "avx,avx2,fma")]` annotations, so the OPERATIONS gate
+// themselves at the symbol level even when the consumer build target
+// is x86-64 baseline. The struct-field types (`__m256` / `__m256i`)
+// are core::arch declarations and don't require AVX/AVX2 at the type
+// level — only execution does. Keeps GitHub CI green (it runs with
+// `RUSTFLAGS="-D warnings"` env, which overrides our v3 config.toml,
+// landing on x86-64 baseline → the previous tighter `avx2` predicate
+// left no matching arm).
+#[cfg(all(target_arch = "x86_64", not(target_feature = "avx512f")))]
 pub use crate::simd_avx512::{f32x8, f64x4, i16x16, i8x32, F32x8, F64x4, I16x16, I8x32};

-#[cfg(all(target_arch = "x86_64", target_feature = "avx2", not(target_feature = "avx512f")))]
+#[cfg(all(target_arch = "x86_64", not(target_feature = "avx512f")))]
 pub use crate::simd_avx2::{
     f32x16, f64x8, i16x32, i32x16, i64x8, i8x64, u32x16, u64x8, u8x64, F32Mask16, F32x16, F64Mask8, F64x8, I16x32,
     I32x16, I64x8, I8x64, U16x32, U32x16, U64x8, U8x64,
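
The effect of the two predicates in this hunk can be sketched in isolation. Only the `cfg` expressions below mirror the diff; the modules `avx512_backend`, `avx2_backend`, and `scalar_backend`, the `BACKEND` constant, and `tightened_leaves_gap` are hypothetical stand-ins, and the shape of the AVX-512 and non-x86 arms is an assumption, since the rest of `src/simd.rs` is not shown here.

```rust
// Hypothetical stand-ins for the real backend modules.
#[allow(dead_code)]
mod avx512_backend {
    pub const NAME: &str = "avx512";
}
#[allow(dead_code)]
mod avx2_backend {
    pub const NAME: &str = "avx2-wrappers";
}
#[allow(dead_code)]
mod scalar_backend {
    pub const NAME: &str = "scalar";
}

// Assumed AVX-512 arm: taken only when the build baseline enables avx512f.
#[cfg(all(target_arch = "x86_64", target_feature = "avx512f"))]
pub use avx512_backend::NAME as BACKEND;

// The reverted (wider) predicate: every x86_64 build without avx512f lands
// here, including a generic x86-64 (SSE2) baseline.
#[cfg(all(target_arch = "x86_64", not(target_feature = "avx512f")))]
pub use avx2_backend::NAME as BACKEND;

// Assumed fallback arm for non-x86 targets.
#[cfg(not(target_arch = "x86_64"))]
pub use scalar_backend::NAME as BACKEND;

/// True when the tightened predicate (`avx2` plus `not(avx512f)`) would
/// have matched no arm at all: x86_64 with neither avx512f nor avx2 in
/// the build baseline. That was the CI configuration.
pub fn tightened_leaves_gap() -> bool {
    cfg!(target_arch = "x86_64")
        && !cfg!(target_feature = "avx512f")
        && !cfg!(target_feature = "avx2")
}
```

With the wider predicate exactly one `BACKEND` re-export exists for every build configuration, so consumers always resolve the name. Under the tightened predicate, any build where `tightened_leaves_gap()` is `true` would have had no re-export, which matches the compile failure described in the commit message.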
