From 230bf164bf58695ef550970a88a6a23016f01362 Mon Sep 17 00:00:00 2001
From: Lior Cohen <lior1cc@gmail.com>
Date: Fri, 10 Apr 2026 05:20:21 +0300
Subject: [PATCH 1/3] KS78: Sync docs -- ROADMAP v0.7.5, CHANGELOG, MCP tool
 count

- Rewrite ROADMAP.md from stale v0.5.0 to current v0.7.0 state
- Add CHANGELOG [0.7.5] section covering KS67-KS77 changes
- Fix MCP tool count: 9 -> 12 in CONTRIBUTING.md, CHANGELOG.md, ARCHITECTURE.md
- Update SECURITY.md supported version: 0.5.x -> 0.7.x
- List all 12 MCP tools where tool names are enumerated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md         |  36 ++++-
 CONTRIBUTING.md      |   2 +-
 SECURITY.md          |   4 +-
 docs/ARCHITECTURE.md |   2 +-
 docs/ROADMAP.md      | 356 ++++++++++++-------------------------------
 5 files changed, 138 insertions(+), 262 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index c4647fa..284b379 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,38 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).
 
 ## [Unreleased]
 
+## [0.7.5] -- 2026-04-10
+
+### Added
+- **Schema-driven fact extraction** (KS67): structured extraction pipeline replacing free-form LLM output
+- **Entity unification** (KS73): EntityFrame, EntityId resolution, alias tracking, supersession rewrite
+- **Configurable embedding** (KS75): EmbeddingProvider trait, 10 fastembed models, OpenAI API support
+- **Universal prompt** (KS76): single consolidation prompt for all reader models (no per-model tuning)
+- **Temporal boost** (KS76): recency-weighted scoring for time-sensitive queries
+- **Importance scoring** (KS76): 5-signal importance scoring (entity density, temporal salience, novelty, info density, user signal)
+- **Design system foundation** (KS77): design tokens, component spec for viz app
+- **Negative recall benchmark**: 3/3 baseline for "I don't know" scenarios
+- **Abstention benchmark**: 5/5 -- engine correctly abstains when no relevant memory exists
+
+### Changed
+- **Consolidation redesign** (KS69): child memory pipeline rewrite with quality gates, dedup, soft invalidation
+- **Consolidation Tier 2** (KS71): subject fix, quality gate, dedup, soft invalidation
+- **Child keyword labels** (KS72): labels assigned at child creation time
+- **Default enrichment model**: switched to `qwen2.5:1.5b`
+- **MCP server**: now exposes 12 tools (was 9) -- added `memory_graph`, `memory_related`, `memory_get`
+
+### Fixed
+- **KU-3 recall** (KS77): knowledge update scenario now passes in seeded benchmark
+- **IE-3, TR-3, ME-4, PT-3 recall** (KS68): multiple recall fixes across LME categories
+- **Temporal label dedup trap** (KS77): avoid adding temporal labels to children when parent has temporal content
+- **Persistence format version**: format mismatch fix for MCP store/echo
+
+### Performance
+- Seeded micro-benchmark: 19/20 (95%, up from a 55% baseline)
+- Abstention: 5/5
+- Negative recall: 3/3
+- LME-S baseline (GPT-4o judge): 24.2% overall
+
 ## [0.7.0] — 2026-04-02
 
 ### Added
@@ -153,8 +185,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).
 - Competitive scan update (MuninnDB, memU, Hindsight, NeuralMemory)
 
 ### Added — KS7: MCP Server
-- **MCP Server** (`shrimpk-mcp`): 9 tools over JSON-RPC 2.0 stdio
-  - store, echo, stats, forget, dump, config_show, config_set, persist, status
+- **MCP Server** (`shrimpk-mcp`): 12 tools over JSON-RPC 2.0 stdio
+  - store, echo, memory_graph, memory_related, memory_get, stats, forget, dump, config_show, config_set, persist, status
   - Lazy engine init (fastembed loads on first tool call, not on handshake)
   - Auto-persist after store/forget, stdout sacred (logs to stderr)
   - Registered globally via `claude mcp add --scope user`
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4d61ee5..7c3c8dd 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -30,7 +30,7 @@ Unit tests run entirely in-memory and complete in seconds. Integration tests dow
 | `shrimpk-security` | Sandbox, permissions | Planned (stub) |
 | `shrimpk-kernel` | Integration facade | Stable |
 | `shrimpk-python` | PyO3 bindings | Exists (untested in CI) |
-| `shrimpk-mcp` | MCP server (9 tools) | Stable |
+| `shrimpk-mcp` | MCP server (12 tools) | Stable |
 | `shrimpk-daemon` | HTTP daemon + proxy | Stable |
 | `shrimpk-tray` | System tray app | Stable |
 
diff --git a/SECURITY.md b/SECURITY.md
index 402e1a4..437fc8f 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -4,8 +4,8 @@
 
 | Version | Supported |
 |---------|-----------|
-| 0.5.x (latest) | Yes |
-| < 0.5.0 | No |
+| 0.7.x (latest) | Yes |
+| < 0.7.0 | No |
 
 Only the latest released version receives security fixes. If you are running an older version, please upgrade before reporting.
 
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index c283c75..8b4ffee 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -662,7 +662,7 @@ Integration layer that wires together `shrimpk-memory`, `shrimpk-context`, and `
 
 ### shrimpk-mcp
 
-Model Context Protocol server. Exposes Echo Memory as MCP tools (`store`, `echo`, `stats`, `forget`, `status`, `config_show`, `dump`) via JSON-RPC 2.0 over stdio. Compatible with any MCP-aware AI client.
+Model Context Protocol server. Exposes Echo Memory as 12 MCP tools (`store`, `echo`, `memory_graph`, `memory_related`, `memory_get`, `stats`, `forget`, `status`, `config_show`, `config_set`, `dump`, `persist`) via JSON-RPC 2.0 over stdio. Compatible with any MCP-aware AI client.
 
 Key design: the `EchoEngine` is lazily initialized on first tool call. The server starts in milliseconds; fastembed model loading (a few seconds) is deferred until a memory operation is actually requested.
 
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
index 0468bf2..9aa8736 100644
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -1,331 +1,175 @@
 # ShrimPK Roadmap
 
-This roadmap reflects the current state of the kernel and planned directions for future releases.
-Dates are aspirational. Contributions are welcome at any stage — see the Contribution Opportunities
-section for specific items you can pick up today.
+This roadmap reflects the current state of the kernel and planned directions.
+Dates are aspirational. Contributions welcome -- see Contribution Opportunities below.
 
 ---
 
-## Current State — v0.5.0
+## Current State -- v0.7.0
 
-Released March 2026. The core pipeline is stable and benchmarked.
+Released April 2026. The core pipeline is stable with 11 crates + CLI, hybrid GraphRAG retrieval, and entity unification.
 
 ### What is shipped and working
 
-**Echo pipeline**
+**Echo pipeline (hybrid GraphRAG)**
 
-The full retrieval chain is operational: Bloom filter pre-screening (O(1) topic elimination),
-LSH candidate retrieval (sub-linear at scale), cosine reranking, Hebbian co-activation boosting,
-and recency decay. Optional HyDE (hypothetical document expansion) and LLM reranking are
-available via config flags.
+Full retrieval chain: Bloom filter pre-screening, LSH candidate retrieval, cosine reranking,
+Hebbian co-activation boosting, FSRS decay, ACT-R activation, label-based pre-filtering,
+schema-driven fact extraction, entity unification with supersession, and temporal boosting.
+Optional HyDE and LLM reranking via config flags.
 
-**Text memory — BGE-small-EN-v1.5**
+**Configurable embedding**
 
-Primary embedding model: `BAAI/bge-small-en-v1.5` via fastembed. The pipeline achieves 84%
-top-3 recall (combined HyDE + LLM reranker config) on a realistic 41-memory, 25-query benchmark
-spanning five LongMemEval categories: information extraction, multi-session reasoning, temporal
-reasoning, knowledge update, and preference tracking. Temporal queries hit 100% (5/5) across
-all pipeline configs.
+EmbeddingProvider trait with 10 fastembed models (BGE-small-EN-v1.5 default, 384-dim) and
+OpenAI API support. Runtime-switchable via config without restart.
 
-**Vision memory — CLIP ViT-B/32**
+**Multimodal SHRM v2 format**
 
-Image memories are embedded using CLIP ViT-B/32 (512-dim) via fastembed's `ClipVitB32` variant.
-Cross-modal retrieval (text queries retrieving image memories) works in the same embedding space.
-The vision feature is gated behind `--features vision`.
+Memory-mapped binary format with 3 channels: text (384-dim), vision (512-dim), speech (640-dim).
+32-bit CRC per entry, atomic flush, crash recovery. Per-channel LSH indices.
 
-**Sleep consolidation**
-
-A background consolidation pass runs during idle periods (configurable schedule). It uses a local
-LLM via Ollama to extract atomic facts from raw memories, de-duplicate, and merge related entries.
-In benchmarks, consolidation lifted top-3 recall from 72% to 76% over the baseline (no
-consolidation) configuration.
-
-**SHRM v2 storage format**
-
-Memory-mapped binary format with 32-bit CRC per entry, atomic flush, and crash recovery. Stores
-text embeddings (384-dim), optional vision embeddings (512-dim), optional speech embeddings
-(640-dim field, populated from v0.6.0 onward), metadata, and sensitivity labels.
-
-**Speech architecture (structure only)**
-
-`shrimpk-memory/src/speech.rs` defines the full `SpeechEmbedder` struct with dimension constants
-(`SPEAKER_DIM=256`, `PROSODY_DIM=384`, `SPEECH_DIM=640`), Whisper log-Mel preprocessing, and
-ONNX sessions wired in v0.6.0. The 16 kHz resampler uses linear interpolation.
-
-**MCP server**
-
-`shrimpk-mcp` exposes nine tools over stdio: `store`, `echo`, `forget`, `stats`, `status`,
-`config_show`, `config_set`, `dump`, `persist` (plus `store_image` and `store_audio` when
-multimodal features are enabled). Compatible with Claude Desktop and any MCP client.
-
-**Daemon + tray**
-
-`shrimpk-daemon` runs as a background HTTP service on `localhost:11435`. `shrimpk-tray` provides
-a system tray icon and launch/stop controls on Windows.
-
-**Performance (release build, i7-1165G7)**
-
-| Metric | Result |
-|--------|--------|
-| P50 echo latency at 10K memories | 3.50ms |
-| P50 echo latency at 100K memories | 23.79ms (regression — see Known Issues) |
-| Store throughput | ~128 memories/sec |
-| RAM (10K text memories) | ~85 MB |
-
----
-
-## v0.6.0 — Speech and Vision Upgrade
+**Speech pipeline (640-dim)**
 
-Target: Q2 2026. Focus: wire the speech ONNX models and upgrade the vision model.
+ECAPA-TDNN (256-dim speaker) + Whisper-tiny encoder (384-dim prosody). ONNX inference via ort,
+auto-download from HuggingFace Hub. Silero VAD gating. Feature-gated behind `--features speech`.
 
-### Speech: ONNX models wired (640-dim — DONE in KS51)
+**Vision pipeline (512-dim)**
 
-The speech pipeline is **640-dim** (ECAPA-TDNN 256 + Whisper-tiny encoder 384). The emotion
-channel (Wav2Small, CC-BY-NC-SA-4.0) was dropped as license-incompatible. Both wired models
-carry permissive licenses: ECAPA-TDNN (Apache-2.0) and Whisper-tiny (MIT).
+CLIP ViT-B/32 via fastembed. Cross-modal text-to-image retrieval. Feature-gated behind `--features vision`.
 
-#### ECAPA-TDNN 256-dim — speaker identification
+**Entity unification**
 
-Model: `Wespeaker/wespeaker-cnceleb-resnet34-LM` (`cnceleb_resnet34_LM.onnx`, ~24 MB,
-Apache 2.0). Loaded via `ort` (ONNX Runtime Rust crate). Auto-downloads from HuggingFace Hub.
+EntityFrame with EntityId resolution, alias tracking, supersession rewrite. Entities unify
+across memories for consistent knowledge updates.
 
-Input: 80-bin FBank features, shape `(1, frames, 80)`, 25ms frame, 10ms hop, 16 kHz.
-Output: 256-dim L2-normalized speaker embedding (output name: `embs`).
-
-#### Whisper-tiny encoder 384-dim — prosody
-
-Model: `onnx-community/whisper-tiny` (`onnx/encoder_model.onnx`, 32.9 MB, MIT). The encoder
-takes 80-bin Whisper log-Mel spectrogram, shape `(batch, 80, 3000)`, padded to 30 seconds.
-Mean-pooling over the sequence dimension produces a 384-dim prosody vector.
-
-#### Spectrogram preprocessing
-
-Two spectrogram pipelines run in parallel:
-
-- **Kaldi fbank** for ECAPA-TDNN: 80 Mel bins, 25ms frame, 10ms hop, 16 kHz. Implementation via
-  the `mel-spec` crate (v0.3.4, MIT).
-- **Whisper log-Mel** for the encoder: 80 Mel bins, N_FFT=400, hop=160 samples, normalized as
-  `(log_spec + 4.0) / 4.0`. Also handled by `mel-spec`.
-
-#### Band-limited resampling
+**Sleep consolidation**
 
-The current `resample_linear()` stub in `speech.rs` introduces aliasing at high downsample ratios
-(e.g., 48 kHz → 16 kHz). v0.6.0 replaces it with the `rubato` crate (v1.0.1), which provides
-sinc-interpolation and FFT-based resamplers that are alias-free.
+Background LLM-driven fact extraction via Ollama. Schema-driven extraction with quality gates,
+dedup, soft invalidation. Universal prompt works across all reader models.
 
-#### VAD gate — Silero VAD
+**Importance scoring**
 
-A Voice Activity Detection pass runs before the ECAPA and Whisper sessions. Silent frames
-(below a configurable threshold) are skipped entirely to avoid embedding noise as speech.
-Silero VAD is loaded as a small ONNX model (~2 MB, MIT license) via a direct `ort::Session`.
-The `silero-vad` crate on crates.io is GPL-2.0 and is explicitly avoided — the ONNX model
-is loaded directly.
+5-signal importance scoring: entity density, temporal salience, novelty, information density,
+and user-signal weighting.
 
-#### ort version pinning
+**MCP server**
 
-fastembed v5.x pins `ort = "=2.0.0-rc.11"`. The speech code must use the exact same version
-to avoid Cargo dependency conflicts. Do not add `ort` as a direct workspace dependency with a
-different version specifier.
+`shrimpk-mcp` exposes 12 tools over stdio: `store`, `echo`, `memory_graph`, `memory_related`,
+`memory_get`, `stats`, `forget`, `status`, `config_show`, `config_set`, `dump`, `persist`.
+Compatible with Claude Desktop and any MCP client.
 
-#### Model download on first use
+**Daemon + proxy**
 
-Models are downloaded on first `SpeechEmbedder::from_config()` call if not already cached,
-following the fastembed pattern: `hf-hub` crate + `dirs::cache_dir()/shrimpk/models/speech/`.
-Total first-use download: ~60 MB (ECAPA 25 MB + Whisper encoder 33 MB + Silero VAD 2 MB).
+`shrimpk-daemon` on `localhost:11435`. OpenAI-compatible proxy (`/v1/chat/completions`) with
+transparent memory injection. Health, debug, and stats endpoints.
 
-### Vision: CLIP ViT-B/32 → Nomic Embed Vision v1.5 (512 → 768-dim)
+**System tray**
 
-`NomicEmbedVisionV15` is already a first-class variant in fastembed v5 (`ImageEmbeddingModel`
-enum). The swap is a single-line change in `embedder.rs`. The quality improvement is substantial:
-+7.8 percentage points on ImageNet zero-shot (71.0% vs 63.2%) and dramatically better cross-modal
-MTEB quality (62.28 vs 43.82 for the paired text model). The q4-quantized ONNX is 62 MB vs
-CLIP's unquantized 352 MB — a 6x size reduction.
+`shrimpk-tray` provides Windows system tray controls.
 
-The 512 → 768 dimension change is a **breaking migration** for stored vision embeddings. The
-SHRM v2 format header records embedding dimensions per modality. On first launch after upgrade,
-the kernel will detect the dimension mismatch, re-embed all stored vision memories, and rewrite
-the store. For the v0.5.0 → v0.6.0 transition the user base is small and a hard-cut re-embed
-is the correct strategy. A migration guide will be included in the release notes.
+**CLI**
 
-Cross-modal text queries against vision memories must use Nomic Text v1.5 with the mandatory
-`search_query:` prefix. This is handled internally by the embedder — callers do not need to
-add the prefix manually.
+`store`, `echo`, `status`, `explore` (ratatui TUI), `detect`, `dump`, `bench`, `config`.
 
-### Fix: 100K latency regression
+### Benchmarks
 
-The P50 latency at 100K memories is 23.79ms against a 4.0ms target. Investigation is required
-before v0.6.0 ships. See Known Issues for details.
+| Metric | Result |
+|--------|--------|
+| Seeded micro-benchmark | 19/20 |
+| Abstention | 5/5 |
+| Negative recall | 3/3 |
+| LME-S baseline (GPT-4o judge) | 24.2% overall, 25.3% task-avg |
+| P50 echo latency (10K) | 3.50ms |
+| Test count | ~481 |
+
+### Workspace (11 crates + CLI)
+
+| Crate | Purpose |
+|-------|---------|
+| `shrimpk-core` | Types: MemoryEntry, EchoResult, EchoConfig, Modality |
+| `shrimpk-memory` | Engine: EchoEngine, embedding, LSH, Bloom, Hebbian, labels, FSRS, ACT-R |
+| `shrimpk-daemon` | HTTP server: axum, proxy, routes |
+| `shrimpk-mcp` | MCP server (stdio): 12 tools |
+| `shrimpk-context` | ContextAssembler: token-budgeted prompt compilation |
+| `shrimpk-router` | CascadeRouter: provider routing |
+| `shrimpk-security` | PII masking (6 categories, 14 regex patterns) |
+| `shrimpk-kernel` | Facade crate re-exporting core + memory + context |
+| `shrimpk-python` | PyO3 bindings (maturin) |
+| `shrimpk-ros2` | ROS2 bridge (stub) |
+| `shrimpk-tray` | Windows system tray (win32) |
+| `cli/` | CLI binary |
 
 ---
 
-## v0.7.0 — Robotics, Speaker Upgrade, and Quantization
+## Upcoming
 
-Target: Q3 2026. Focus: ROS2 integration, model quality improvements, and memory footprint.
+### KS78 -- Critical Fixes (April 2026)
 
-### ROS2 bridge — `shrimpk-ros2` crate
+- Persistence format version mismatch fix (Issue #16)
+- Documentation sync (ROADMAP, CHANGELOG, MCP tool count)
+- Design system v2 implementation
 
-A new workspace crate `crates/shrimpk-ros2` will provide a ROS2 node that exposes ShrimPK
-memory over standard ROS2 topics and services.
+### KS79 -- Multi-Resolution Retrieval
 
-The node subscribes to:
-- `/shrimpk/store/text` (`std_msgs/String`) — text memories
-- `/shrimpk/store/image` (`sensor_msgs/CompressedImage`) — visual memories via CLIP
-- `/shrimpk/store/audio` (`audio_common_msgs/AudioStamped`) — speech memories
+- Hierarchical retrieval across raw memories, extracted facts, and entity summaries
+- Adaptive context window based on query complexity
 
-The node publishes to:
-- `/shrimpk/echo` (`shrimpk_msgs/EchoResults`) — push-activated memories
-- `/shrimpk/context` (`std_msgs/String`, latched) — current context string for downstream LLMs
-- `/shrimpk/status` (`std_msgs/String`, JSON) — health and latency stats
+### KS80 -- Memory Lifecycle Improvements
 
-A `/shrimpk/query` service (`shrimpk_msgs/EchoQuery`) supports pull-based querying for nodes
-that prefer request/response semantics over the push model.
+- Smarter consolidation scheduling based on memory age and access patterns
+- Improved supersession confidence scoring
 
-Primary integration path: `rclrs` 0.7+ with colcon on ROS2 Jazzy (Ubuntu 24.04).
-Alternative: `r2r` for simpler `cargo build` integration without colcon.
-Optional feature flag: `ros2-native` using `ros2-client` (pure Rust DDS, no ROS2 install needed)
-for distribution to users who do not have a full ROS2 environment.
+---
 
-The echo latency budget is feasible: 3.50ms ShrimPK echo is well within a 30 Hz camera frame
-(33ms). The full pipeline including embedding and topic publish should stay under 15–20ms.
+## Future -- No Fixed Timeline
 
-No other push-based memory system has a ROS2 bridge. ReMEmbR (NVIDIA) is pull-based and
-Python-only. `shrimpk-ros2` would be the first native-Rust, push-activated memory layer for ROS2.
+### Vision model upgrade (CLIP -> Nomic Embed Vision v1.5)
 
-### Speaker upgrade: ECAPA-TDNN → CAM++
+512 -> 768-dim. +7.8pp ImageNet zero-shot (71.0% vs 63.2%). ~6x smaller model (62 MB q4-quantized vs 352 MB unquantized CLIP).
+Breaking migration for stored vision embeddings.
 
-CAM++ (Context-Aware Masking) achieves lower equal error rate than ECAPA-TDNN on VoxCeleb1/2
-at comparable model size. The upgrade is a drop-in replacement at the 512-dim output level
-provided an Apache 2.0-compatible ONNX export is available. If no suitable pre-built ONNX exists,
-the ECAPA-TDNN model ships in v0.7.0 and CAM++ is deferred to v0.8.0.
+### ROS2 bridge -- full implementation
 
-### f16 quantization for vision and speech embeddings
+`shrimpk-ros2` topics for text/image/audio store, echo publish, query service.
+Target: ROS2 Jazzy via rclrs.
 
-Stored vision and speech embeddings currently use f32 (4 bytes/dimension). A v0.7.0 storage
-format revision (SHRM v3) will store these as f16 (2 bytes/dimension) with promotion to f32
-at query time. Impact: ~50% reduction in disk and memory footprint for vision/speech memories,
-no measurable quality loss for cosine similarity.
+### Speaker upgrade (ECAPA-TDNN -> CAM++)
 
-SHRM v3 will include automatic migration from v2 on first launch.
+Lower EER at comparable model size. Blocked on Apache 2.0 ONNX availability.
 
----
+### f16 quantization (SHRM v3)
 
-## Future — No Fixed Timeline
-
-These items are research directions or require dependencies that are not yet settled.
+~50% disk/memory reduction for vision and speech embeddings.
 
 ### Custom fine-tuned embedding model
 
-The text embedding model (BGE-small) is a general-purpose model trained on web text. A model
-fine-tuned specifically on personal memory data (short episodic sentences, user preferences,
-recurring entities) could improve recall quality without increasing model size. This requires
-a labeled dataset and an ML training pipeline — it is a research item, not an implementation task.
+BGE-small fine-tuned on personal memory data for improved recall.
 
 ### crates.io publish
 
-Publishing `shrimpk-core`, `shrimpk-memory`, and (eventually) `shrimpk-ros2` to crates.io
-is planned once the API stabilizes beyond v0.6.0. The current pre-1.0 semver signals that
-breaking changes are expected.
+`shrimpk-core`, `shrimpk-memory` once API stabilizes past v1.0.
 
 ### Cloud sync
 
-Optional encrypted sync of the memory store across devices. End-to-end encrypted, the server
-sees only ciphertext. The key design question is key management — the server must never hold
-decryption keys. This is a future research and design item.
-
-### Emotion channel
-
-The 3-dim arousal/dominance/valence emotion channel is architecturally present in `speech.rs`
-(`EMOTION_DIM=3`) but has no available ONNX model under a permissive license. If a suitable
-Apache 2.0 or MIT model emerges, the emotion channel can be re-enabled without a breaking change
-to the storage format (the slot is reserved). Alternatively, a categorical speech emotion
-recognition model (4-class: angry, happy, sad, neutral) under a permissive license could
-replace the dimensional approach.
+Optional E2E encrypted memory sync across devices.
 
 ---
 
 ## Contribution Opportunities
 
-All issues below are open for contribution. The project uses Apache 2.0. Opening a discussion
-issue before starting significant work is encouraged to avoid duplication.
-
 ### Good first issue
 
-**Fix vision feature flag propagation** (difficulty: low, Rust knowledge required)
-Vision benchmarks (`echo_multimodal_bench.rs`) are blocked because
-`#[cfg(feature = "vision")]` checks the root test crate's features, not `shrimpk-memory`'s.
-The fix is adding a forwarding `vision` feature to the root `Cargo.toml` that enables
-`shrimpk-memory/vision`. Estimated: 1–2 hours.
-
-**Add `search_query:` prefix for cross-modal text queries** (difficulty: low, Rust)
-When Nomic Embed Vision v1.5 is the active vision model (v0.6.0), text queries used in
-cross-modal retrieval must be prefixed with `"search_query: "`. This should be applied
-automatically in `MultiEmbedder` when the Nomic vision model is active, not pushed to callers.
-Requires reading the fastembed API and adding a model-variant check.
-
-**Extend the Tier 2 benchmark with a CrossEncoder config** (difficulty: low, Rust)
-The realistic Tier 2 benchmark tests four pipeline configs (Baseline, HyDE, Reranker-LLM,
-Combined). A CrossEncoder-only config was benchmarked separately and showed strong results
-(2,823ms average at 100% recall on 6 regression cases). Adding it to the standard Tier 2
-suite would complete the comparison matrix.
+- **Fix vision feature flag propagation** -- forwarding `vision` feature to root `Cargo.toml`
+- **CrossEncoder config in Tier 2 benchmark** -- add to standard benchmark suite
 
 ### Help wanted
 
-**Investigate 100K latency regression** (difficulty: medium, Rust + profiling)
-P50 at 100K memories is 23.79ms against a 4.0ms target. Likely causes: LSH bucket saturation
-with BGE-small embedding distribution, brute-force fallback frequency, or Windows I/O interference
-during the benchmark. The investigation should profile LSH hit rate, Bloom false-positive rate,
-and brute-force fallback frequency at scale. Tools: `perf`, `cargo flamegraph`, or the
-`tracing` spans already in the echo path. A fix might involve tuning LSH parameters
-(hash count, bucket width) for the BGE-small distribution.
-
-**~~Wire ECAPA-TDNN ONNX session~~** — DONE (KS51). Wespeaker ResNet34 256-dim, FBank
-preprocessing implemented in pure Rust (`compute_fbank_flat()`), `ort` version matches
-fastembed's pinned `=2.0.0-rc.11`.
-
-**~~Wire Whisper-tiny encoder ONNX session~~** — DONE (KS51). Whisper-tiny encoder takes
-`(1, 80, 3000)` log-Mel spectrogram, outputs `(1, 1500, 384)` hidden states, mean-pooled
-to 384-dim.
-Preprocessing uses the Whisper log-Mel formula implemented in `mel-spec`. Can be done in
-parallel with the ECAPA item by a different contributor.
-
-**Implement band-limited resampling with `rubato`** (difficulty: medium, Rust + DSP)
-Replace `resample_linear()` in `speech.rs` with sinc or FFT-based resampling from the `rubato`
-crate (v1.0.1). The current linear resampler causes aliasing at high downsample ratios and is
-documented as a placeholder. The replacement should pass the existing `resample_*` unit tests
-and add a new test verifying that a 1 kHz sine wave downsampled from 48 kHz to 16 kHz does not
-contain aliasing artifacts above 8 kHz.
-
-**Linux CI hardening** (difficulty: medium, DevOps + Rust)
-The kernel builds and tests pass on CI for Linux and macOS, but the test coverage is lower than
-on the primary Windows development machine. Specifically: daemon startup tests, tray icon tests,
-and file locking tests need Linux-specific validation. Contributions improving Linux CI coverage
-are welcome.
+- **100K latency regression** -- P50 23.79ms vs 4.0ms target. Needs LSH profiling.
+- **Band-limited resampling** -- replace `resample_linear()` with `rubato` sinc resampling
+- **Linux CI hardening** -- daemon startup, file locking, tray icon tests
 
 ### Research needed
 
-**Emotion model under permissive license** (difficulty: high, ML research)
-The 3-dim arousal/dominance/valence emotion slot in the speech pipeline is reserved but empty
-because all mature dimensional emotion models (Wav2Small, wav2vec2-large-robust) carry
-CC-BY-NC-SA-4.0 licenses. Options: (1) identify an existing Apache 2.0 / MIT categorical
-speech emotion model that can be exported to ONNX and mapped to a valence proxy, (2) train a
-small distillation model on CC0 or public-domain audio corpora, or (3) propose an alternative
-paralinguistic dimension that has available permissive models.
-
-**LSH parameter tuning for BGE-small distribution** (difficulty: high, information retrieval)
-The LSH index was tuned for `all-MiniLM-L6-v2` embeddings. The upgrade to `BGE-small-EN-v1.5`
-changed the embedding distribution in ways that may require different hash count, bucket width,
-or candidate list size to maintain sub-10ms P50 at 100K scale. This is an empirical research
-task: vary LSH parameters, run the 100K latency benchmark, and identify the configuration that
-recovers the 4.0ms target.
-
-**CAM++ Apache 2.0 ONNX availability** (difficulty: medium, ML research)
-The v0.7.0 speaker upgrade to CAM++ depends on finding or producing an Apache 2.0-compatible
-ONNX export. WeSpeaker provides CAM++ checkpoints but the license status of any pre-built
-ONNX exports needs verification. This research item should produce a clear verdict: model ID,
-license, ONNX file location, and input/output specification.
-
-**SigLIP 2 fastembed support** (difficulty: high, ML + Rust)
-SigLIP 2 ViT-B/16 achieves 78.2% ImageNet zero-shot (vs Nomic Vision v1.5 at 71.0%) but has
-no official ONNX model and no fastembed support as of March 2026. If an Apache 2.0 ONNX export
-emerges, contributing a `SigLIP2VitB16` variant to fastembed and then updating ShrimPK's
-vision channel would be a meaningful quality improvement.
+- **Emotion model under permissive license** -- 3-dim A/D/V slot reserved, no Apache 2.0 model
+- **LSH parameter tuning for BGE-small** -- hash count, bucket width optimization at 100K scale
+- **SigLIP 2 fastembed support** -- 78.2% ImageNet zero-shot, no ONNX export yet

From 312515e3ee1234f0744c82b7e6dd084237e62c94 Mon Sep 17 00:00:00 2001
From: Lior Cohen <lior1cc@gmail.com>
Date: Fri, 10 Apr 2026 05:22:42 +0300
Subject: [PATCH 2/3] fix: revert accidental hebbian_boosts changes, keep only
 recency epsilon
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove merged supersession-demotion changes (hebbian_boosts type change,
  demotion→superseded_count rename, multiplicative demotion) that belong
  on a separate branch
- Keep only the recency tie-breaker epsilon (step 7c7) and google_result
  test fix from PR #13

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
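The recency tie-breaker in the hunk below can be sanity-checked in isolation. This is a minimal sketch, not the engine code: `ScoredResult` and `rank_with_recency` are hypothetical names that do not exist in `echo.rs`, and timestamps are passed as plain `i64` microseconds in place of the `created_at.timestamp_micros()` call the patch uses. Only the `* 1e-18` arithmetic is taken from the patch.

```rust
/// Same arithmetic as the patch: microseconds since the Unix epoch times 1e-18.
fn recency_epsilon(created_at_micros: i64) -> f64 {
    (created_at_micros as f64) * 1e-18
}

/// Hypothetical, minimal stand-in for the engine's result type.
#[derive(Debug, Clone, PartialEq)]
struct ScoredResult {
    id: u32,
    final_score: f64,
    created_at_micros: i64,
}

/// Add the epsilon to every result, then sort by descending score.
fn rank_with_recency(mut results: Vec<ScoredResult>) -> Vec<ScoredResult> {
    for r in results.iter_mut() {
        r.final_score += recency_epsilon(r.created_at_micros);
    }
    results.sort_by(|a, b| b.final_score.total_cmp(&a.final_score));
    results
}

fn main() {
    let older = 1_775_000_000_000_000_i64; // roughly April 2026, in microseconds
    let newer = older + 1_000_000; // one second later

    // Equal base scores: the newer memory should come first.
    let tied = rank_with_recency(vec![
        ScoredResult { id: 1, final_score: 0.8, created_at_micros: older },
        ScoredResult { id: 2, final_score: 0.8, created_at_micros: newer },
    ]);
    println!("tie winner: id {}", tied[0].id);

    // A real score gap is far larger than the epsilon difference and is preserved.
    let gap = rank_with_recency(vec![
        ScoredResult { id: 1, final_score: 0.81, created_at_micros: older },
        ScoredResult { id: 2, final_score: 0.80, created_at_micros: newer },
    ]);
    println!("gap winner: id {}", gap[0].id);

    println!("absolute offset: {:e}", recency_epsilon(older));
}
```

Two things follow from the arithmetic. At 2026 timestamps the offset is about 1.8e-3 in absolute terms and only about 1e-12 per second of age difference. Because the patch adds it only when `store.get` returns an entry, a result missing from the store would presumably rank about 1.8e-3 lower than its peers. And once the offset is added to an f64 score near 1.0, memories created within roughly 100 microseconds of each other may still tie.
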
 crates/shrimpk-memory/src/echo.rs | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/crates/shrimpk-memory/src/echo.rs b/crates/shrimpk-memory/src/echo.rs
index 3164c77..25bc401 100644
--- a/crates/shrimpk-memory/src/echo.rs
+++ b/crates/shrimpk-memory/src/echo.rs
@@ -1624,6 +1624,15 @@ impl EchoEngine {
             }
         }
 
+        // 7c7. KS78: Recency tie-breaker (#13): after all boosts and caps, add
+        // created_at (µs) * 1e-18 (~1e-12 per second of age) so newer memories win ties.
+        for result in &mut results {
+            if let Some(entry) = store.get(&result.memory_id) {
+                let recency_epsilon = (entry.created_at.timestamp_micros() as f64) * 1e-18;
+                result.final_score += recency_epsilon;
+            }
+        }
+
         // 7d. Re-sort by final_score (similarity + hebbian boost)
         results.sort_by(|a, b| {
             b.final_score
@@ -3616,9 +3625,12 @@ mod tests {
 
         assert!(results.len() >= 2, "Should have at least 2 results");
 
-        // Find both memories in results
+        // Find both memories in results (the Meta memory also mentions "Google",
+        // so match the Google-only memory by excluding results that mention "Meta")
         let meta_result = results.iter().find(|r| r.content.contains("Meta"));
-        let google_result = results.iter().find(|r| r.content.contains("Google"));
+        let google_result = results
+            .iter()
+            .find(|r| r.content.contains("Google") && !r.content.contains("Meta"));
 
         assert!(meta_result.is_some(), "Meta memory should surface");
         assert!(google_result.is_some(), "Google memory should surface");

From e6297078cd4747943e5c08a038ac6aae63312c86 Mon Sep 17 00:00:00 2001
From: Lior Cohen <lior1cc@gmail.com>
Date: Fri, 10 Apr 2026 16:15:14 +0300
Subject: [PATCH 3/3] =?UTF-8?q?fix:=20ROADMAP=20header=20v0.7.0=E2=86=92v0?=
 =?UTF-8?q?.7.5=20+=20CHANGELOG=20heading=20style=20(Greptile)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- docs/ROADMAP.md: "Current State" header said v0.7.0 but content
  describes v0.7.5 features (12 MCP tools, entity unification, etc.)
- CHANGELOG.md: v0.7.5 entry used "--" instead of em dash "—",
  inconsistent with all other version headings in the file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md    | 2 +-
 docs/ROADMAP.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 284b379..ad5f20d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,7 +6,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).
 
 ## [Unreleased]
 
-## [0.7.5] -- 2026-04-10
+## [0.7.5] — 2026-04-10
 
 ### Added
 - **Schema-driven fact extraction** (KS67): structured extraction pipeline replacing free-form LLM output
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
index 9aa8736..988af38 100644
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -5,7 +5,7 @@ Dates are aspirational. Contributions welcome -- see Contribution Opportunities
 
 ---
 
-## Current State -- v0.7.0
+## Current State -- v0.7.5
 
 Released April 2026. The core pipeline is stable with 11 crates + CLI, hybrid GraphRAG retrieval, and entity unification.