Skip to content

Commit bfb356c

Browse files
authored
Merge pull request #156 from AdaWorldAPI/claude/w3-w6-soa-aos-helpers
W3-W6: SoA/AoS layout helpers — SoaVec + soa_struct! + aos_to_soa + soa_to_aos + bulk_apply (scalar; SIMD deferred)
2 parents f1d3303 + 7ba653c commit bfb356c

7 files changed

Lines changed: 2053 additions & 0 deletions

File tree

Lines changed: 166 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,166 @@
1+
# KNOWLEDGE: Cognitive Distance Metrics Are Typed — No Generic Umbrella, No Roundtrips
2+
3+
## READ BY
4+
- Any agent designing or modifying distance APIs in `src/hpc/{plane,vsa,distance,cascade,causal_diff,seal,merkle_tree,bnn,clam,fingerprint}.rs`
5+
- Any agent proposing a generic `fn distance<T>(a: &T, b: &T) -> f32` umbrella
6+
- W7 implementers (when W7 — cognitive bulk ops — moves from deferred to active)
7+
- The W3-W6 plan-review savant currently auditing the SoA/AoS design (this doc constrains W7's scope and bounds what the SoA helpers should NOT grow into)
8+
9+
## P0 TRIGGERS
10+
- About to design a generic `distance(a, b) -> f32` that picks the metric internally → STOP, distance is typed; build one fn per metric and name it
11+
- About to chain `palette 256 → fisher z → cosine → hamming → popcount → palette 256` in one call path → STOP, that's the canonical worst-case roundtrip and erases the typing
12+
- About to drop the `buckets` / Euler-gamma-offset arguments from a palette-256 distance call → STOP, those are PART of the typed distance, not optional context
13+
- About to silently convert between metric types in an intermediate step → STOP, conversions must be EXPLICIT (named fns) and documented at the call site as escape hatches
14+
15+
---
16+
17+
## The taxonomy
18+
19+
Each row is a distinct typed primitive. Mixing them at runtime is the bug this doc exists to prevent.
20+
21+
| Metric | Computes | Input | Output | Cascade role | Notes |
22+
|---|---|---|---|---|---|
23+
| **Palette 256 distance** | precomputed 256×256 table lookup with bucket + Euler-gamma-offset context | two `PaletteIdx` (u8) + `Buckets` + `EulerGammaOffset` | `PaletteDistance` (typed f32 newtype) | **Level 3** (finalist scoring, ~200 candidates) | The buckets and Euler offset are INTEGRAL to the metric. Dropping them changes the answer; passing zero offset is NOT the same as omitting. |
24+
| **HDR popcount early-exit** | Hamming on 256-bit bitpacked fingerprints with under-threshold short-circuit | two `&Fingerprint256` + `u16` threshold | `Option<HammingDistance>` (None = exceeded threshold, early exit) | **Level 1** (cosine REPLACEMENT, ~1M → ~20K) | This IS the cosine replacement on the cascade — NOT a derivative or approximation of cosine. The popcount metric directly substitutes for FP cosine in the search topology. |
25+
| **Base17 L1** | L1 (Manhattan) distance on 17-dim i16 vectors | two `&[i16; 17]` | i32 | **Level 2** (~20K → ~200) | Fits in one AVX-512 load or two NEON loads. The 17-dimension shape is specific; don't pad to 16 or 18. |
26+
| **Fisher-z transform** | variance-stabilizing transform of correlation → z-score | f32 correlation | f32 z-score | **NOT a distance** — a normalization applied to palette 256 OUTPUT when distance distributions need comparison across heterogeneous buckets | Calling Fisher-z on a non-correlation value is a category error. |
27+
| **BF16 mantissa exact transformation** | direct palette-256 → palette-256 mapping using BF16 mantissa context | `PaletteIdx` + `EulerGammaOffset` (+ mantissa context) | new `PaletteIdx` | **bypasses the cascade entirely** when the direct mapping is known | The fast path: when you already have a palette index and need the transformed palette index under a known offset, this is one typed hop in palette space. No metric translation, no cascade levels. |
28+
29+
---
30+
31+
## The roundtrip anti-pattern (worst case — do not write this code path)
32+
33+
```
34+
palette 256 distance [Level 3 typed]
35+
36+
37+
fisher z normalize [valid: variance-stabilize a Level-3 result]
38+
39+
40+
"treated as cosine" ←── BUG: popcount IS cosine replacement,
41+
│ fisher-z of palette ≠ cosine input
42+
43+
hamming distance [Level 1 typed — wrong scale, wrong topology]
44+
45+
46+
HDR popcount preheat [Level 1 detail — re-derived from wrong source]
47+
48+
49+
early exit [decisions made on un-rounded round-trip]
50+
51+
52+
palette 256 distance [back to start, BUT: buckets + Euler offset lost
53+
somewhere in the chain — answer differs from
54+
the original Level-3 result by quantization noise
55+
+ loss of bucket assignment]
56+
```
57+
58+
Each arrow:
59+
- pays an arithmetic / cache cost
60+
- loses the typed-distance identity (the type system stops protecting the call site)
61+
- introduces conversion error that compounds along the chain
62+
- can converge back to "approximately the same number" — which makes the bug invisible in unit tests but wrong in deployment
63+
64+
## The direct path (preferred)
65+
66+
```
67+
palette 256 distance ──[Euler gamma offset known]──▶ palette 256 BF16 mantissa exact transformation
68+
69+
70+
new PaletteIdx
71+
(stays in palette space)
72+
```
73+
74+
ONE typed step. Stays in palette-256 type space. Preserves bucket + offset throughout. No cascade traversal, no metric translation, no conversion noise.
75+
76+
When the BF16-mantissa direct path is applicable (caller has a `PaletteIdx` and an `EulerGammaOffset`), use it. The cascade exists for the case where you don't yet know which palette band the target lives in.
77+
78+
---
79+
80+
## API design rule (binding)
81+
82+
1. **One fn per metric. Named.**
83+
```rust
84+
pub fn palette256_distance(
85+
a: PaletteIdx, b: PaletteIdx,
86+
buckets: &Buckets, offset: EulerGammaOffset,
87+
) -> PaletteDistance;
88+
89+
pub fn hdr_popcount_early_exit(
90+
a: &Fingerprint256, b: &Fingerprint256, threshold: u16,
91+
) -> Option<HammingDistance>;
92+
93+
pub fn base17_l1(a: &[i16; 17], b: &[i16; 17]) -> i32;
94+
95+
pub fn palette256_bf16_mantissa_transform(
96+
p: PaletteIdx, offset: EulerGammaOffset, mantissa: BF16MantissaCtx,
97+
) -> PaletteIdx;
98+
```
99+
100+
2. **Conversions are explicit and named.** When a caller must cross metric boundaries (escape-hatch case):
101+
```rust
102+
pub fn hamming_distance_to_palette_index_estimate(d: HammingDistance) -> PaletteIdx;
103+
// ^-- name says "estimate" so callers can't mistake it for exact
104+
```
105+
The call site MUST carry a comment naming WHY the conversion is happening (not the default path).
106+
107+
3. **No `Box<dyn Distance>` / no `enum DistanceMetric { Palette, Hamming, Base17, … }` / no `fn distance<T: HasMetric>(a, b) -> f32` umbrella.**
108+
The type system distinguishes the metrics for a reason.
109+
110+
4. **Newtype the output.** `PaletteDistance(f32)`, `HammingDistance(u16)`, `Base17L1(i32)` — different output types prevent accidental cross-metric arithmetic. `let d = palette256_distance(...) + hamming_dist;` should not compile.
111+
112+
5. **The buckets and Euler-gamma-offset arguments to palette-256 fns are REQUIRED, not optional.** Default values for those parameters are domain-meaningful (changing them changes the metric); a `Default::default()` impl on `EulerGammaOffset` is acceptable only with strong documentation that the default is a deliberate calibration constant, not a sentinel.
113+
114+
---
115+
116+
## What this means for the SoA/AoS sprint (W3-W6) and beyond
117+
118+
### W3-W6 (currently in-flight on `claude/w3-w6-soa-aos-helpers`)
119+
- `SoaVec<T, N>`, `soa_struct!`, `aos_to_soa`, `soa_to_aos`, `bulk_apply`, `bulk_scan` are **generic over T**. They do NOT bake in any distance metric.
120+
- **DO NOT** during the sprint or in post-review extend any of these primitives toward distance computation. Distance stays out of the helper layer.
121+
- If a worker is tempted to add `fn bulk_distance<T>(...)` to `bulk.rs` → STOP, that's the umbrella anti-pattern.
122+
123+
### W7 (deferred — cognitive bulk ops)
124+
- When W7 lands, EACH metric gets its own bulk primitive named for the metric:
125+
```rust
126+
pub fn bulk_hdr_popcount_early_exit(
127+
query: &Fingerprint256, db: &[Fingerprint256], threshold: u16,
128+
) -> Vec<Option<HammingDistance>>;
129+
130+
pub fn bulk_palette256_distance(
131+
query: PaletteIdx, db: &[PaletteIdx],
132+
buckets: &Buckets, offset: EulerGammaOffset,
133+
) -> Vec<PaletteDistance>;
134+
135+
pub fn bulk_palette256_mantissa_transform(
136+
palettes: &[PaletteIdx], offset: EulerGammaOffset, mantissa: BF16MantissaCtx,
137+
) -> Vec<PaletteIdx>;
138+
```
139+
- Underneath, the bulk fns MAY use `SoaVec` / `bulk_apply` from W3/W4 for layout staging. That's fine — those are layout helpers, not distance helpers.
140+
- The cascade orchestrator in `hpc/cascade.rs` calls each Level's typed bulk primitive directly. It does NOT internally translate Level-1 outputs to Level-3 inputs by passing through Fisher-z.
141+
142+
### Bench harness (prereq for W7's SIMD acceleration)
143+
- Per-metric benches: one per typed primitive, no umbrella `bench_distance` macro.
144+
- The bench output should report the typed primitive's name in the column header so regressions are attributable to the specific metric.
145+
146+
---
147+
148+
## Cross-references
149+
150+
- `CLAUDE.md` § "Three-Level Cascade: How the Search Actually Works" — describes L1 Hamming sweep, L2 Base17 L1, L3 Palette lookup. The Levels are typed distance bands, NOT tiers of a generic distance abstraction.
151+
- `src/hpc/cascade.rs` — the orchestrator. Inspect before adding any new distance code; the call chain there is the canonical correct example.
152+
- `src/hpc/distance.rs` — distance utilities. Audit for any `fn distance<T>` umbrella that may have crept in; refactor to typed fns if found.
153+
- `src/hpc/plane.rs`, `src/hpc/vsa.rs` — produce values that feed the cascade. Their output types must be the typed distance newtypes, not raw `f32`.
154+
- `.claude/knowledge/w3-w6-soa-aos-design.md` — the W3-W6 helper design (this doc constrains what those helpers must NOT grow into).
155+
- `.claude/knowledge/vertical-simd-consumer-contract.md` — the layering rule (user → crate::simd → simd_{type}). Same family: layering and typing are both about preserving identity across abstraction levels.
156+
157+
---
158+
159+
## TL;DR for an agent reading this in 30 seconds
160+
161+
1. Palette-256 distance ≠ Hamming popcount ≠ Base17 L1. Don't put them under one API.
162+
2. Palette-256 distance carries `buckets + EulerGammaOffset` always. Don't drop them.
163+
3. The fast path inside palette space is the BF16-mantissa direct transform — one hop, no cascade.
164+
4. The cascade is THREE typed levels in sequence, NOT a generic distance pipeline with intermediate conversions.
165+
5. Conversions between metric types must be explicit, named, documented per call site.
166+
6. No `Box<dyn Distance>`, no umbrella `fn distance<T>(...)`, no `enum DistanceMetric`.
Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
# W3-W6 Codex Audit
2+
3+
Auditor: codex P0 review agent
4+
Branch: `claude/w3-w6-soa-aos-helpers` (rebased onto master at `3f35170d`)
5+
Commits under review:
6+
- `5095853c` feat(hpc/soa): `SoaVec` + `soa_struct!` + `aos_to_soa` + `soa_to_aos` (W3+W5+W6)
7+
- `5845ab2d` feat(hpc/bulk): `bulk_apply` + `bulk_scan` (W4)
8+
9+
## Verdict
10+
11+
**READY-FOR-PR.** Zero P0 findings. One P1 cosmetic docstring gap (patched on this branch). Three P2 items intentionally deferred per the design contract.
12+
13+
## Verification exit codes (all 0)
14+
15+
| Command | Exit | Notes |
16+
|---|---|---|
17+
| `cargo check -p ndarray --no-default-features --features std` | 0 | |
18+
| `cargo test -p ndarray --lib --no-default-features --features std hpc::soa` | 0 | 29 passed, 0 failed |
19+
| `cargo test -p ndarray --lib --no-default-features --features std hpc::bulk` | 0 | 16 passed, 0 failed |
20+
| `cargo test --doc -p ndarray --no-default-features --features std hpc::soa` | 0 | 10 passed, 1 intentionally ignored |
21+
| `cargo test --doc -p ndarray --no-default-features --features std hpc::bulk` | 0 | 2 passed, 1 intentionally ignored |
22+
| `cargo fmt --all -- --check` | 0 | |
23+
| `cargo clippy -p ndarray --no-default-features --features std -- -D warnings` | 0 | |
24+
25+
## P0 findings
26+
27+
None.
28+
29+
## P1 findings
30+
31+
- **F4**`usize::MAX` chunk-size behavior is tested at `src/hpc/bulk.rs:182-194` but NOT documented in the public docstring of `bulk_apply` (`src/hpc/bulk.rs:46-66`) or `bulk_scan` (`src/hpc/bulk.rs:80-97`). One-line addition: "`chunk_size == usize::MAX` yields the entire slice as a single chunk." **Patched on this branch** before PR opens.
32+
33+
## P2 findings (deferred per design contract)
34+
35+
- **G1** (`bulk_scan` naming): the savant flagged that "scan" conventionally means fold-with-state. Kept as `bulk_scan` for symmetry with `bulk_apply`. Rename to `bulk_for_each` / `bulk_inspect` is a follow-up if downstream consumers find the name misleading.
36+
- **G2** (`SoaVec::iter_rows`): row iterator yielding `[&T; N]` per row is absent. Use `soa.chunks(1)` for the same effect. Deferred to a follow-up once a real use case exists.
37+
- **G3** (`SoaVec` lacks `#[derive(Clone, Debug)]`): macro-generated structs DO support derive passthrough (verified by test at `src/hpc/soa.rs:733-742`), but the generic container does not. Deliberately deferred — adding derives would require `where T: Clone + Debug` bounds that callers don't always want.
38+
39+
## D4 — integration test gate
40+
41+
The `bulk_apply` × `aos_to_soa` integration test at `src/hpc/bulk.rs:295` correctly uses `#[cfg(any())]` (canonical never-compile sentinel). The test body at `src/hpc/bulk.rs:297-324` is sound. Now that `hpc/soa.rs` and `hpc/bulk.rs` are landing in the same PR, the gate could be removed as a follow-up — but worker B's `cfg(any())` gate preserves the safe deferral if the PR review wants to keep them independently mergeable.
42+
43+
## Compliance summary
44+
45+
| Concern | Status | Notes |
46+
|---|---|---|
47+
| Layering rule (no `#[target_feature]`, no per-arch imports, no raw intrinsics) | ✅ clean | Only doc-prose mentions in module headers; zero actual attributes |
48+
| Distance typing (no umbrella `fn distance<T>`, no `enum DistanceMetric`, no `Box<dyn Distance>`) | ✅ clean | Both module headers cite `cognitive-distance-typing.md` and warn against extension toward distance |
49+
| Spec API match (per design doc v2 §C1-C7, §D1-D4) | ✅ exact | `field_n::<const I>()` uses `const { assert!(I < N) }`; all method signatures verbatim |
50+
| Doc coverage (every `pub fn` has `///` doc with working `# Example`) | ✅ complete | After F4 patch |
51+
| Test coverage (per design doc §"Tests" per fn) | ✅ complete | 29 + 16 = 45 unit tests + 12 doctests |
52+
53+
## Recommended next step
54+
55+
Open the W3-W6 PR. F4 fix landed on the same branch as a follow-up commit; integration test stays `cfg(any())`-gated for the PR; P2 items deferred.

0 commit comments

Comments
 (0)