Commit 7caefe9
feat(simd): re-export f32_to_bf16_batch_rne / f32_to_bf16_scalar_rne
Makes the pure AVX-512-F RNE routines from commit c489d31 reachable
as `ndarray::simd::f32_to_bf16_batch_rne` and
`ndarray::simd::f32_to_bf16_scalar_rne` for consumer code in
lance-graph. Without this re-export, callers have no path to them:
the routines live in the `simd_avx512` module, which is not declared
`pub mod` in `lib.rs`.
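
The re-export pattern described above can be sketched in a single file. This is a minimal illustration, not the crate's source: the module names follow the commit message, but the function body is a generic round-to-nearest-even routine written for this example, and the real signature in `ndarray` may differ.

```rust
// Hypothetical miniature of the layout, collapsed into one file.
// The body below is a generic RNE sketch, not the crate's AVX-512 code.
mod simd_avx512 {
    /// Scalar round-to-nearest-even F32 -> BF16 (assumed signature).
    pub fn f32_to_bf16_scalar_rne(x: f32) -> u16 {
        let bits = x.to_bits();
        if x.is_nan() {
            // Keep the upper bits and force a quiet NaN.
            return ((bits >> 16) as u16) | 0x0040;
        }
        // Add 0x7FFF plus the lowest kept bit so ties round to even.
        let bias = 0x7FFF + ((bits >> 16) & 1);
        ((bits + bias) >> 16) as u16
    }
}

pub mod simd {
    // `simd_avx512` stays private; only this name becomes reachable.
    pub use super::simd_avx512::f32_to_bf16_scalar_rne;
}

fn main() {
    let h = simd::f32_to_bf16_scalar_rne(1.0);
    println!("{h:#06x}"); // prints 0x3f80
}
```

The private module keeps its internal layout free to change, while consumers depend only on the `simd::` path.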
The doc comment on the re-export explicitly pins the workspace-wide
"never scalar ever" rule for F32→BF16: consumer hot loops use
`f32_to_bf16_batch_rne` exclusively (500-20,000× faster than scalar
via AMX/AVX-512-BF16 tiles), and `f32_to_bf16_scalar_rne` is exposed
only as a unit-test reference implementation. It also cross-references
the Certification Process section in `lance-graph/CLAUDE.md`.
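
One way to use a scalar routine purely as a test oracle is sketched below. Everything here is an assumption made for illustration: `scalar_rne_reference`, `matches_reference`, `stand_in_batch`, and the `(&[f32], &mut [u16])` batch shape are hypothetical and are not the crate's confirmed API. The stand-in batch function only exists so the sketch runs without AVX-512 hardware.

```rust
/// Scalar round-to-nearest-even F32 -> BF16, used only as an oracle.
fn scalar_rne_reference(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        return ((bits >> 16) as u16) | 0x0040; // force a quiet NaN
    }
    let bias = 0x7FFF + ((bits >> 16) & 1); // ties go to the even result
    ((bits + bias) >> 16) as u16
}

/// Runs a batch converter and compares every output with the oracle.
/// The batch signature is an assumed shape, not the real one.
fn matches_reference(batch: fn(&[f32], &mut [u16]), input: &[f32]) -> bool {
    let mut out = vec![0u16; input.len()];
    batch(input, &mut out);
    input
        .iter()
        .zip(&out)
        .all(|(&x, &got)| got == scalar_rne_reference(x))
}

/// Stand-in for a vectorized routine so the sketch is runnable anywhere.
fn stand_in_batch(src: &[f32], dst: &mut [u16]) {
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = scalar_rne_reference(s);
    }
}

/// A truncating converter, which the oracle should reject.
fn truncating_batch(src: &[f32], dst: &mut [u16]) {
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = (s.to_bits() >> 16) as u16;
    }
}

fn main() {
    let input = [
        0.0,
        1.0,
        -2.5,
        f32::from_bits(0x3F80_8000), // exact tie, rounds to even
        f32::from_bits(0x3F80_8001), // just above the tie, rounds up
        f32::MAX,                    // rounds to infinity under RNE
        f32::INFINITY,
    ];
    assert!(matches_reference(stand_in_batch, &input));
    assert!(!matches_reference(truncating_batch, &input));
    println!("batch output agrees with the scalar oracle");
}
```

Keeping the scalar path confined to tests like this is consistent with the rule stated above: the hot loop never calls it, but any batch implementation can be validated against it.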
A companion commit in lance-graph updates Lane 6 of
`seven_lane_encoder.rs` to call the batch primitive instead of its
previous element-wise truncation loop.
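
To illustrate why replacing truncation with round-to-nearest-even matters numerically, the sketch below compares the worst-case error of both conversions over a sweep of values in [1, 2). The helper names are hypothetical and the code is independent of both repositories; it only demonstrates the general property that truncation can lose almost a full BF16 ULP while RNE loses at most half of one.

```rust
/// Truncation: drop the low 16 bits (rounds toward zero).
fn truncate_to_bf16(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Round-to-nearest-even, ties resolved toward an even mantissa.
fn rne_to_bf16(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        return ((bits >> 16) as u16) | 0x0040; // force a quiet NaN
    }
    let bias = 0x7FFF + ((bits >> 16) & 1);
    ((bits + bias) >> 16) as u16
}

/// Widen BF16 back to F32 by restoring 16 zero bits.
fn bf16_to_f32(h: u16) -> f32 {
    f32::from_bits((h as u32) << 16)
}

/// Largest round-trip error over a strided sweep of [1.0, 2.0).
fn max_abs_error(convert: fn(f32) -> u16) -> f32 {
    (0x3F80_0000u32..0x4000_0000)
        .step_by(257)
        .map(|bits| {
            let x = f32::from_bits(bits);
            (x - bf16_to_f32(convert(x))).abs()
        })
        .fold(0.0_f32, f32::max)
}

fn main() {
    let trunc = max_abs_error(truncate_to_bf16);
    let rne = max_abs_error(rne_to_bf16);
    // In [1, 2) one BF16 ULP is 2^-7, so RNE is bounded by 2^-8.
    assert!(rne <= 0.00390625);
    assert!(trunc > rne);
    println!("max error: truncation = {trunc}, RNE = {rne}");
}
```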
https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A1
Parent: c489d31
1 file changed
Lines changed: 14 additions & 0 deletions