AdaWorldAPI · AdaWorldAPI · May 22, 2026 · May 22, 2026 · May 22, 2026 · May 22, 2026
diff --git a/.claude/knowledge/pr-x12-anti-neural-lookup-inversion.md b/.claude/knowledge/pr-x12-anti-neural-lookup-inversion.md
diff --git a/.claude/knowledge/pr-x12-bgz-jc-substrate-synergies.md b/.claude/knowledge/pr-x12-bgz-jc-substrate-synergies.md
diff --git a/.claude/knowledge/pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md b/.claude/knowledge/pr-x12-cam-pq-sigker-dn-tree-substrate-bindings.md
diff --git a/.claude/knowledge/pr-x12-canon-resolutions-delta.md b/.claude/knowledge/pr-x12-canon-resolutions-delta.md
diff --git a/.claude/knowledge/pr-x12-codec-cognitive-substrate-mapping.md b/.claude/knowledge/pr-x12-codec-cognitive-substrate-mapping.md
@@ -5,6 +5,16 @@
 > Scope: ndarray codec ↔ Gaussian splat ↔ cognitive shaders ↔ blasgraph/MKL ↔ gradient optimization  
 > Status: **survives compaction** — load-bearing claim mapping + integration plan + debt inventory  
 > Companion to: `pr-x12-codec-x265-design.md` (the as-shipped HEVC-analog spec) — this doc is the *generalisation* of that spec across the rest of the stack
+>
+> **Post-merge formalisation (2026-05-22):** the bench / cost / dep-direction claims below have been numbered and pinned in `pr-x12-canon-resolutions-delta.md`:
+> - §4.1 (4096-entry basin codebook) → **R-10** (sub-1-bit commitment), **R-13** (federated codebook policy)
+> - §5.3 (DCT-II / GEMM crossover) → **R-5** (per-arch crossover constants, bench-tuned)
+> - §13.1 (block-matched ME → batched i8 GEMM) → **R-6** (ME via SSD identity, VNNI path)
+> - §13.3 (CTU partition as tropical-GEMM) → **R-7** (kernel home in `lance-graph::blasgraph`, dep direction allowed)
+> - Plan G (bench harness) → **R-4** (architecture-conditional gate), **R-11** (latency assertions per stage)
+>
+> Perspective lenses written 2026-05-22 (sibling docs):
+> `pr-x12-x265-blasgraph-gemm.md` · `pr-x12-x266-3dgs-spacetime-upscaling.md` · `pr-x12-woa-multiarch-orchestration.md` · `pr-x12-anti-neural-lookup-inversion.md` · `pr-x12-gguf-llm-weights-encoding.md` · **`pr-x12-bgz-jc-substrate-synergies.md`** (grounds PR-X12 in already-implemented `bgz17`/`bgz-tensor`/`bgz-hhtl-d`/`jc` crates)
 
 ---
 
@@ -120,6 +130,8 @@ This is **what DeepSpeed-ZeRO does informally** with `bf16_compress`, `int8_comp
 
 ## 4. Palette / basin codebook — what HEVC SCC tried and missed
 
+> [Codebook lifecycle pinned post-merge as **R-13**: the codec exposes the basin codebook as a swappable handle (LocalEphemeral | SharedClusterWide | SharedRegional | PretrainedStatic). The 4096-entry capacity claim below is unchanged; what's new is that the codebook is *not baked* into the codec — orchestration (q2 / woa-rs) picks the right one per request.]
+
 ### 4.1 The 12-bit basin = 4096-entry vocabulary
 
 `MAX_BASIN_IDX = (1 << 12) - 1 = 4095` (`mode.rs:79`). The full 12-bit range addresses 4096 real basins — every `LeafCu` carries an index into a fully-populated per-Heel codebook. No slot is reserved as a sentinel: the HHTL ontology (`Heel > Hip > Twig > Leaf`, see `src/hpc/ogit_bridge/assets/cognitive/entities/Leaf.ttl`) defines the codebook as `16 Hips × 16 Twigs × 16 Leaves = 4096 Leaves per Heel`, every Leaf carrying a real `basinSignature`. Authoring-time uncertainty ("not yet decided") stays in the encoder's `Option<u16>` scratch state and never leaks onto the wire. For:
@@ -171,6 +183,8 @@ This is **the most underrated** of the four mappings. Optimizer research treats
 
 ### 5.3 The DCT-II / GEMM tradeoff (for downstream batched encode)
 
+> [Resolved post-merge as **R-5**: per-arch crossover constants, calibrated by Plan G's `codec-bench`. Concrete defaults landed in canon-resolutions-delta §R-5 — SPR=64, ICX=32, Zen4=96, Apple M=256, Graviton=128. See `pr-x12-x265-blasgraph-gemm.md` §2.2 for the full GEMM-form derivation.]
+
 Single 32×32 DCT-II via butterflies: ~80 ops. Same via GEMM (`C = A @ DCT_BASIS`): ~32K ops. **Per-block, butterfly wins by 400×**. But:
 
 - For a 4K frame with ~1024 CUs, batched GEMM amortises hardware fusion
@@ -496,6 +510,8 @@ Six places where blasgraph + MKL change the algorithmic complexity, not just con
 
 ### 13.1 Block-matched ME → batched i8gemm (E-7)
 
+> [Pinned as **R-6**: SSD-via-GEMM identity is the canonical ME path; the API lives at `ndarray::hpc::blas_level2::batched_ssd_search`. The 50× win is reproduced in the GEMM-lens companion doc; the bench is asserted by Plan G video lane (R-4).]
+
 Classical ME: SAD over 32×32 window. Reformulate as SSD via `||A||² - 2A·B + ||B||²` — middle term is a GEMM. AVX-512 VNNI `i8gemm_i32` does a whole CTU's motion candidates in one call. **~50× over hand-tuned NEON/AVX2 SAD.**
 
 ### 13.2 Batched DCT-II via MKL sgemm (E-7-variant)
@@ -504,6 +520,8 @@ Per-block butterfly wins for single 32×32. Per-frame batched `C = A_batch @ DCT
 
 ### 13.3 CTU partition mode-decision as tropical-GEMM (E-8)
 
+> [Pinned as **R-7**: tropical-GEMM kernel lives in `lance-graph::blasgraph::tropical_gemm`; the codec calls into it. The `ndarray-codec → lance-graph` dep direction was confirmed *allowed* post-merge (both are sibling crates above `ndarray::hpc` and below `woa-rs`). See R-7 in the delta doc for the dep-graph audit.]
+
 x265 spends ~30% CPU on recursive partition RDO. Reformulate: each partition is a node in an 85-node DAG, edges = split/merge transitions, weights = ΔRDO. Optimal partition = shortest path. blasgraph's tropical-semiring GEMM (`D ← min(D, D + W)`) solves all partitions in **one batched matrix-relax**. `O(4^d)` → `O(d²)` per CTU.
 
 ### 13.4 CABAC context modeling → tiny transformer (E-9)

diff --git a/.claude/knowledge/pr-x12-cross-domain-synergies.md b/.claude/knowledge/pr-x12-cross-domain-synergies.md
@@ -10,6 +10,23 @@
 > Companion to `.claude/knowledge/pr-x12-codec-x265-design.md` (the
 > mechanical design). This doc captures the **why-it-generalizes**
 > that the design doc deliberately scopes out.
+>
+> **Post-merge resolutions (2026-05-22):** the load-bearing claims below
+> are now numbered in `pr-x12-canon-resolutions-delta.md`:
+> - §E1 (topology-free `MergeDir`) → **R-9** (4-way alphabet stays canonical;
+>   wider topologies layered, not swapped — `Topology` trait deferred)
+> - §HG2 (sub-1-bit-per-Gaussian) → **R-10** (sub-1-bit-per-token via
+>   Gaussian-tail rANS where source supports it; falsified by Plan G entropy bench)
+> - §E9 (splat3d × codec = same pipeline) → **R-1** (`LinearReduce<T>` +
+>   `Basis<T>` trait surface; codec body never imports a specific basis impl)
+> - §Plan A (A7 rANS critical) → **R-3** (codec-body LoC envelope ≤ 1500,
+>   A7 must fit) + **R-4** (Plan G arch-conditional bench gates the claim)
+>
+> Perspective lenses landed 2026-05-22:
+> `pr-x12-x265-blasgraph-gemm.md` · `pr-x12-x266-3dgs-spacetime-upscaling.md`
+> · `pr-x12-woa-multiarch-orchestration.md` · `pr-x12-anti-neural-lookup-inversion.md`
+> · `pr-x12-gguf-llm-weights-encoding.md` (the fifth load — static LLM weight tensors)
+> · **`pr-x12-bgz-jc-substrate-synergies.md`** (PR-X12 grounded: bgz17/bgz-tensor/bgz-hhtl-d/jc already implement most of the substrate)
 
 ## TL;DR
 
@@ -186,6 +203,8 @@ literature snapshot I'm working from; **claim** is the right word, not
 
 ### E1. **`MergeDir` is a topology, not a direction.**
 
+> [Resolved post-merge as **R-9**: the 4-way alphabet *stays* canonical on the wire — `{N, E, W, S}` discriminant is pinned for HEVC compatibility. Wider topologies (6-way 3D, 8-way diagonal-aware) layer *above* the codec via a `Topology<Mode>` trait, but the wire format does not extend. See `pr-x12-canon-resolutions-delta.md` §R-9 for the rationale: extending the wire alphabet to 6/8 ways would invalidate HEVC's 2-bit `header_kind` field and break the goal of being decodable by spec-conformant HEVC tooling.]
+
 `{North, East, West, South}` happens to be a 2D Cartesian raster
 mental model. The codec doesn't care. The discriminant alphabet just
 needs to be a 4-way categorical over "which of 4 neighbours did I
@@ -271,6 +290,8 @@ The user's "Pertuberationslernen" instinct lands here.
 
 ### E9. **The `splat3d` PRs 1-7 (May sprint) and the `codec` PRs are the SAME pipeline shifted 90°.**
 
+> [Formalised post-merge as **R-1**: the unified pipeline lives in `ndarray::hpc::LinearReduce<T>`, decomposing into `Basis<T>` (basis-as-data; DCT, EWA splat, wavelet, k-means prototype all are `Basis<T>` impls) and `Reducer<T>` (the reduction: rANS-encode, alpha-composite, sum-reduce, softmax). The codec body dispatches via the trait and *never imports a specific basis impl* — this is what makes the "same pipeline shifted 90°" claim mechanically real.]
+
 The splat3d forward pipeline is: project → tile-bin → mode-decide
 (which Gaussian contributes at which pixel) → alpha-composite. The
 codec pipeline is: build codebook → block-partition → mode-decide
@@ -468,6 +489,8 @@ codec for the manifold of predictable codebook-coded signals."*
 
 ### HG2. **Sub-1-bit-per-Gaussian 3DGS compression.**
 
+> [Committed post-merge as **R-10**: sub-1-bit-per-token where the source distribution supports it (heavy-tailed residual after basin lookup). The mechanism is basin codebook (12-bit fingerprint → 4096 entries) + Gaussian-tail rANS, both already in scope. Falsifier: Plan G entropy bench at < 1.0 bit-per-token on the held-out Bbb/3DGS test corpus. See R-10 in the delta doc and `pr-x12-anti-neural-lookup-inversion.md` §3.1 for why this lookup-table substrate hits the Shannon bound within ε ≤ 0.2 dB.]
+
 Stock 3DGS: ~250 bytes/Gaussian raw, ~50 bytes after PLY-trim.
 PR-X12 mode-coded + A7 rANS: ~3-8 bits/Gaussian for the dominant
 modes. **30-60× over current state of the art.** A 1M-Gaussian