
frankensearch


Two-tier hybrid local search for Rust, plus the fsfs standalone CLI: fast first-pass results, then quality refinement.


Install In One Line

curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/frankensearch/main/install.sh | bash -s -- --easy-mode

Installer goals:

  • zero-friction first run
  • auto-configured model cache path
  • sane defaults for interactive usage

Cargo Install (Developer Path)

fsfs currently builds from this workspace and uses the pinned nightly toolchain (rust-toolchain.toml):

cargo +nightly install --path crates/frankensearch-fsfs
fsfs version
fsfs status --no-watch-mode --format json

Quick Start (60 Seconds)

# 1) Install
curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/frankensearch/main/install.sh | bash -s -- --easy-mode

# 2) Index a directory
fsfs index ./my-project

# 3) Search
fsfs search "how does retry backoff work" --limit 5

Example output:

PHASE 0 (fast): 5 hits in 12ms
  1. src/retry.rs      score=0.812
  2. docs/failures.md  score=0.774

PHASE 1 (refined): 5 hits in 151ms
  1. src/retry.rs      score=0.923
  2. src/http/client.rs score=0.901

What It Does

frankensearch combines lexical and semantic retrieval with progressive delivery:

  • lexical BM25 via Tantivy for exact keyword precision
  • fast semantic tier for immediate relevant hits
  • quality semantic tier for reranked refinement
  • reciprocal rank fusion (RRF) to combine sources robustly

Result: responsive first answers plus better final ranking without blocking the UI.

Core Features

  • Bundled default semantic models (potion-multilingual-128M + all-MiniLM-L6-v2) with optional alternate-model downloads
  • Progressive search phases (Initial, Refined, RefinementFailed)
  • Agent-friendly streaming (--stream) with machine-readable output
  • Result explanation surfaces (fsfs explain <result-id>)
  • Multiple output formats: table, json, jsonl, toon, csv
  • Watch/incremental indexing mode for local corpus updates
  • Portable SIMD vector search + quantized FSVI storage
  • Optional reranking and ANN paths via feature flags

CLI At A Glance

# Basic search
fsfs search "structured concurrency" --limit 10

# Stream for agents/pipelines
fsfs search "query" --stream --format jsonl

# TOON mode
fsfs search "query" --stream --format toon

# Explain one result
fsfs explain result-123

# Keep index fresh
fsfs index ~/projects --watch

# Health checks
fsfs doctor

Reproducible Showcase Suite

Use this sequence to reproduce the core demo + benchmark evidence bundle:

# Progressive CLI behavior and machine-output surfaces
fsfs search "structured concurrency" --limit 10 --format table
fsfs search "structured concurrency" --limit 10 --stream --format jsonl

# Statistical benchmark regression lane (Tier-3 reproducibility anchor)
cargo test -p frankensearch-fsfs --test benchmark_baseline_matrix -- --nocapture

# Graceful-degradation/fault demonstration lane
cargo test -p frankensearch-fsfs --test pressure_simulation_harness \
  scenario_long_run_soak_fault_injection_stays_within_drift_thresholds -- --nocapture

The benchmark lane validates deterministic artifact identity fields (dataset_sha256, matrix_sha256, samples_sha256) plus a fixed replay command contract.

Configuration

Configuration precedence:

  1. CLI flags
  2. project config file
  3. user config file
  4. environment variables
  5. built-in defaults
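
As a quick illustration of this precedence, each layer only fills in values the layers above it left unset. This is a hypothetical sketch, not the actual loader (which lives in crates/frankensearch-core/src/config.rs):

// Hypothetical precedence sketch: earlier sources win.
fn resolve_quality_weight(
    cli_flag: Option<f32>,
    project_cfg: Option<f32>,
    user_cfg: Option<f32>,
    env_var: Option<f32>,
) -> f32 {
    cli_flag
        .or(project_cfg)
        .or(user_cfg)
        .or(env_var)
        .unwrap_or(0.7) // built-in default
}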

Common environment variables:

| Variable | Purpose | Example |
|---|---|---|
| FRANKENSEARCH_INDEX_DIR | Override index/data directory | ~/.local/share/frankensearch |
| FRANKENSEARCH_MODEL_DIR | Override model location | ~/.cache/frankensearch/models |
| FRANKENSEARCH_FAST_ONLY | Skip quality refinement | true |
| FRANKENSEARCH_QUALITY_WEIGHT | Blend quality vs fast tier | 0.7 |
| FRANKENSEARCH_RRF_K | RRF constant | 60 |
| FRANKENSEARCH_LOG | Tracing filter | info |

For full contracts and knobs:

  • docs/fsfs-config-contract.md
  • docs/fsfs-dual-mode-contract.md
  • docs/fsfs-packaging-release-install-contract.md
  • docs/fsfs-packaging-release-install-contract.md#host-migration-playbooks-priority-projects
  • docs/fsfs-packaging-release-install-contract.md#staged-rollout-and-deterministic-fallback-protocol
  • docs/fsfs-packaging-release-install-contract.md#upgrade-and-migration-compatibility-verification-strategy
  • docs/architecture/

How It Works

Pipeline summary:

Query
  -> canonicalize
  -> classify
  -> fast embed + lexical BM25
  -> RRF fusion (initial)
  -> quality embed (top candidates)
  -> blend (and optional rerank)
  -> refined results

Model path used in the default quality lane:

  • fast tier: potion-128M (or fallback)
  • fusion: RRF over lexical + semantic ranks
  • quality tier: MiniLM
  • optional final rerank: FlashRank cross-encoder

Architecture Breakdown

frankensearch is split into focused crates so each concern can evolve independently:

| Crate | Responsibility |
|---|---|
| frankensearch-core | Shared types/traits/errors/config, query canonicalization/classification, metrics/eval helpers |
| frankensearch-embed | Embedding backends and fallback stack (hash, model2vec, fastembed) |
| frankensearch-index | FSVI vector storage, SIMD dot products, top-k search, optional ANN support |
| frankensearch-lexical | Tantivy schema/index/search for BM25 lexical retrieval |
| frankensearch-fusion | RRF fusion, two-tier orchestration, blending, optional rerank integration |
| frankensearch-rerank | Cross-encoder reranking integration |
| frankensearch-storage | FrankenSQLite metadata persistence, dedup/content-hash tracking, embedding queue |
| frankensearch-durability | Repair/protection primitives for index artifacts and segment health |
| crates/frankensearch-fsfs | Standalone CLI product around the library stack |
| crates/frankensearch-tui | Shared TUI shell/input/theme/replay framework used by fsfs/ops |
| crates/frankensearch-ops | Fleet observability/control-plane TUI and telemetry materialization |

This separation gives you two options:

  • use the top-level library as a drop-in engine in your own app
  • run the full standalone fsfs binary with progressive CLI/TUI workflows

Query Lifecycle (Detailed)

At execution time, the system follows this shape:

  1. Canonicalize and parse the query
     • Normalize text, strip noise, preserve user intent.
     • Classify query type (identifier, short keyword, natural language) for adaptive budgets.
  2. Run Phase 1 retrieval
     • Generate fast embedding (or fallback).
     • Run lexical BM25 search (if configured).
     • Run vector top-k on the fast index.
     • Fuse lexical + semantic candidates with RRF.
     • Emit Initial results quickly.
  3. Run Phase 2 refinement (unless fast_only)
     • Generate quality embedding.
     • Re-score or refine top candidates with quality tier.
     • Blend fast and quality semantic scores.
     • Optionally rerank with cross-encoder.
     • Emit Refined or RefinementFailed (graceful degradation path).

Algorithms Used

1. Reciprocal Rank Fusion (RRF)

RRF is rank-based and model-agnostic. It does not require score calibration across systems:

RRF(doc) = Σ_sources 1 / (K + rank(doc, source) + 1)

Default K is 60 (configurable with FRANKENSEARCH_RRF_K / rrf_k).

Why RRF:

  • robust to score-scale mismatch between BM25 and vector similarity
  • simple, stable, and strong in practice
  • rewards documents that appear in multiple retrieval channels
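
Below is a minimal, self-contained sketch of this fusion rule (illustrative only; the real implementation lives in crates/frankensearch-fusion/src/rrf.rs and operates on richer candidate types):

use std::collections::HashMap;

// Illustrative RRF over ranked doc-id lists. Rank is 0-based here, so
// the +1 matches RRF(doc) = sum over sources of 1 / (K + rank + 1).
fn rrf_fuse(sources: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranked in sources {
        for (rank, doc) in ranked.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_default() +=
                1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    // Score descending, then doc id ascending for deterministic ties.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // "a" and "c" appear in both lists, so they outrank "b" and "d".
    let fused = rrf_fuse(&[vec!["a", "b", "c"], vec!["c", "a", "d"]], 60.0);
    println!("{fused:?}");
}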

2. Two-Tier Score Blending

During refinement, fast and quality semantic scores are normalized then blended:

blended_score = alpha * quality_score + (1 - alpha) * fast_score

alpha is controlled by quality_weight (default target 0.7).
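
In code, the blend is a plain convex combination (sketch; the normalization step that precedes it is simplified away here):

// Assumes both scores were already normalized to [0, 1].
fn blend(fast_score: f32, quality_score: f32, quality_weight: f32) -> f32 {
    quality_weight * quality_score + (1.0 - quality_weight) * fast_score
}

// With the default quality_weight of 0.7, quality 0.9 and fast 0.6
// blend to 0.7 * 0.9 + 0.3 * 0.6 = 0.81.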

3. Deterministic Ordering

When ties happen, ranking remains deterministic through stable tie-break logic (including lexical comparison and doc_id ordering), which helps replayability and makes diff-based evaluation much cleaner.
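
The comparator this implies looks roughly like the following sketch (the real tie-break lives in the fusion crate and also performs the lexical comparison mentioned above):

use std::cmp::Ordering;

// NaN-safe, fully deterministic: score descending via f32::total_cmp,
// then doc_id ascending as the stable tie-break.
fn compare_hits(a: &(f32, &str), b: &(f32, &str)) -> Ordering {
    b.0.total_cmp(&a.0).then_with(|| a.1.cmp(b.1))
}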

Index Format and Retrieval Internals

FSVI Storage

Vector data is stored in FSVI files with memory-mapped access:

  • primary default quantization: f16 (good memory/quality tradeoff)
  • optional f32 paths where needed
  • explicit metadata and format checks to catch corruption early

Why this matters:

  • lower memory footprint than full f32 storage in common workloads
  • fast startup and read paths via mmap
  • predictable, portable on-disk format for local search use
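
For intuition about the f16 tradeoff, here is a toy round-trip using the half crate (an assumed dependency for this sketch; the actual FSVI encoding is in crates/frankensearch-index/src/format.rs):

use half::f16; // sketch-only dependency: half = "2"

// Quantize f32 vectors to f16 for storage (2 bytes per dimension
// instead of 4), then widen back to f32 at dot-product time.
fn quantize(v: &[f32]) -> Vec<f16> {
    v.iter().copied().map(f16::from_f32).collect()
}

fn dot_widened(stored: &[f16], query: &[f32]) -> f32 {
    stored.iter().zip(query).map(|(a, b)| a.to_f32() * b).sum()
}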

Vector Top-K Implementation

The brute-force search path is optimized around:

  • SIMD dot products
  • bounded heap selection for top-k
  • NaN-safe total ordering
  • two-phase work: score first, materialize doc ids for winners

This gives strong baseline behavior while ANN remains optional for larger corpora.
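
The bounded-heap pattern is worth seeing concretely. This sketch uses a scalar scan and a small total-order wrapper; the real path in crates/frankensearch-index/src/search.rs adds SIMD scoring and the two-phase materialization described above:

use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

// Gives f32 a total order (NaN-safe) so scores can live in a heap.
struct TotalF32(f32);
impl PartialEq for TotalF32 {
    fn eq(&self, other: &Self) -> bool {
        self.0.total_cmp(&other.0) == Ordering::Equal
    }
}
impl Eq for TotalF32 {}
impl PartialOrd for TotalF32 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for TotalF32 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

// Min-heap bounded at k: the weakest current winner sits on top and
// is evicted whenever a better-scoring candidate arrives.
fn top_k(scores: impl Iterator<Item = (usize, f32)>, k: usize) -> Vec<(usize, f32)> {
    let mut heap: BinaryHeap<Reverse<(TotalF32, usize)>> = BinaryHeap::new();
    for (doc, score) in scores {
        heap.push(Reverse((TotalF32(score), doc)));
        if heap.len() > k {
            heap.pop();
        }
    }
    let mut out: Vec<_> = heap
        .into_iter()
        .map(|Reverse((TotalF32(s), doc))| (doc, s))
        .collect();
    out.sort_by(|a, b| b.1.total_cmp(&a.1)); // best first
    out
}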

Runtime and Concurrency Model

The async model uses asupersync and capability context (Cx), not Tokio.

Important implications:

  • host-controlled runtime ownership: your app provides runtime/cancellation context
  • cancellation-aware search phases and timeouts
  • no hard Tokio coupling in public contracts

This is useful if you need to embed search inside existing non-Tokio runtimes or strictly controlled execution environments.

Design Principles

Core engineering principles in this project:

  • Progressive delivery first: fast initial answer, then quality refinement, instead of blocking on best possible ranking.
  • Graceful degradation: if quality tier/reranker/model loading fails, search still returns useful initial results.
  • Determinism and reproducibility: stable ordering and artifact-driven evaluation support regression tracking and CI gates.
  • Explicit tradeoffs over hidden magic: key knobs (rrf_k, blend weight, fast-only mode, candidate multipliers) are visible and tunable.
  • Practical hybrid retrieval: BM25 and embeddings are treated as complementary signals, not mutually exclusive choices.

Why This Is Useful

frankensearch is especially strong when you need:

  • intent-level recall beyond exact grep matching
  • low-latency first results for interactive workflows
  • local-first operation with optional model download only
  • agent-friendly output contracts (jsonl, toon) and explainability hooks
  • one reusable engine across multiple products instead of bespoke search stacks

In short: it closes the gap between exact text lookup and semantic retrieval without forcing you into remote services or heavyweight distributed systems.

Tuning Playbook

Common tuning patterns:

  • Need lower tail latency:
    • enable FRANKENSEARCH_FAST_ONLY=true
    • reduce candidate budget and rerank depth
    • keep lexical enabled for exact-match recovery
  • Need higher relevance quality:
    • keep quality tier enabled
    • increase candidate multiplier
    • enable reranking for top-N only
  • Need memory efficiency:
    • stay on f16 index quantization
    • use ANN selectively above practical corpus thresholds
  • Need operational clarity:
    • emit JSONL/TOON for automation
    • monitor phase timings and drift through test harness artifacts

Quality and Evaluation Discipline

The repository includes explicit quality harnesses and statistical checks:

  • IR metrics: nDCG@K, MRR, Recall@K, plus bootstrap confidence intervals
  • profile comparisons with deterministic manifests
  • benchmark/e2e artifact contracts for repeatable regression analysis

This keeps tuning decisions evidence-driven rather than anecdotal.
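
For reference, the core of nDCG@K is compact enough to sketch inline (illustrative; the repository's metrics_eval.rs also covers MRR, MAP, Recall@K, and bootstrap confidence intervals):

// nDCG@K over graded relevance labels listed in system-ranked order:
// DCG of that ranking divided by DCG of the ideal (sorted) ranking.
fn ndcg_at_k(rels_in_rank_order: &[f64], k: usize) -> f64 {
    fn dcg(rels: &[f64], k: usize) -> f64 {
        rels.iter()
            .take(k)
            .enumerate()
            .map(|(i, rel)| (2f64.powf(*rel) - 1.0) / (i as f64 + 2.0).log2())
            .sum()
    }
    let mut ideal = rels_in_rank_order.to_vec();
    ideal.sort_by(|a, b| b.total_cmp(a));
    let idcg = dcg(&ideal, k);
    if idcg == 0.0 { 0.0 } else { dcg(rels_in_rank_order, k) / idcg }
}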

Limits and Tradeoffs

Being explicit about scope helps set expectations:

  • Hybrid search is more complex than plain rg and has model/runtime overhead.
  • Quality-tier refinement improves ranking but increases latency.
  • ANN helps at larger scale but adds index lifecycle complexity.
  • Semantic quality depends on corpus characteristics and embedding model fit.

Use rg/grep for strict exact matching and frankensearch when ranking by intent and contextual relevance matters.

Library Integration Quickstart (Rust)

If you want to embed frankensearch directly in your Rust app, this is the minimum end-to-end flow:

use std::sync::Arc;

use frankensearch::{
    Embedder, EmbedderStack, HashEmbedder, IndexBuilder, TwoTierConfig, TwoTierIndex,
    TwoTierSearcher,
};

asupersync::test_utils::run_test_with_cx(|cx| async move {
    // 1) Choose embedders (hash embedders shown for zero-dependency quickstart)
    let fast = Arc::new(HashEmbedder::default_256()) as Arc<dyn Embedder>;
    let quality = Arc::new(HashEmbedder::default_384()) as Arc<dyn Embedder>;
    let stack = EmbedderStack::from_parts(fast, Some(quality));

    // 2) Build an index from documents
    IndexBuilder::new("./my_index")
        .with_embedder_stack(stack)
        .add_document("doc-1", "Rust ownership and borrowing")
        .add_document("doc-2", "Structured concurrency with asupersync")
        .build(&cx)
        .await
        .expect("index build should succeed");

    // 3) Open and search
    let index = Arc::new(TwoTierIndex::open("./my_index", TwoTierConfig::default()).unwrap());
    let searcher = TwoTierSearcher::new(
        index,
        Arc::new(HashEmbedder::default_256()) as Arc<dyn Embedder>,
        TwoTierConfig::default(),
    );

    let (results, metrics) = searcher
        .search_collect(&cx, "ownership rules", 10)
        .await
        .expect("search should succeed");

    println!("results={} phase1_ms={:.2}", results.len(), metrics.phase1_total_ms);
});

Notes:

  • For production semantic quality, use model2vec + fastembed (or EmbedderStack::auto_detect_with).
  • Use search_collect_with_text or full search(...) when you need negation filtering (-term) and rerank text access.
  • Keep TwoTierConfig explicit in code for reproducible behavior across environments.

Baseline Performance Envelope (Reference)

These are practical CPU-only reference numbers for a healthy local setup. Treat them as orientation targets, not hard SLAs:

| Operation | Typical Envelope |
|---|---|
| Fast hash embedding | ~11 μs |
| Fast model embedding (potion-128M) | ~0.57 ms |
| Quality model embedding (MiniLM) | ~128 ms |
| Vector search (10K docs, top-10) | ~2 ms |
| RRF fusion (500 + 500 candidates) | ~1 ms |
| Phase 1 initial delivery | usually < 15 ms target |
| Phase 2 refined delivery | usually ~150 ms target |

What changes the envelope the most:

  • query class and candidate budget
  • corpus size and document length distribution
  • enabled features (lexical, rerank, ann)
  • model tier selection and cache state

Failure Modes and Degradation Behavior

The engine is intentionally designed to degrade gracefully:

| Condition | Behavior | What Caller Sees |
|---|---|---|
| Quality refinement timeout | Phase 2 aborts safely | SearchPhase::RefinementFailed { error: SearchTimeout, ... } |
| Quality embedder errors | Initial results preserved | SearchPhase::RefinementFailed { ... } |
| fast_only=true | Skip quality phase by design | Only Initial phase + skip_reason="fast_only" |
| No quality embedder configured | Skip quality phase | Only Initial phase + skip_reason="no_quality_embedder" |
| Fast embedder fails but lexical succeeds | Lexical-only fallback | Valid Initial results from lexical path |
| Fast embedder fails and no lexical fallback | Hard failure | Search returns error |
| Lexical backend failure | Semantic continues | Search continues without lexical contribution |

Practical implication: phase-1 UX can still stay responsive even when higher-cost quality paths fail.
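
A sketch of how a caller might branch on these outcomes. The enum below is an illustrative mirror of the documented phases, not the real type; see SearchPhase in crates/frankensearch-core/src/types.rs for the actual payloads:

// Hypothetical, simplified mirror of the documented phase values.
enum Phase {
    Initial { hits: Vec<String> },
    Refined { hits: Vec<String> },
    RefinementFailed { error: String },
}

// Paint fast results immediately, upgrade if refinement lands, and
// keep the initial set when refinement fails (graceful degradation).
fn handle(phase: Phase, shown: &mut Vec<String>) {
    match phase {
        Phase::Initial { hits } => *shown = hits,
        Phase::Refined { hits } => *shown = hits,
        Phase::RefinementFailed { error } => {
            eprintln!("refinement failed, keeping initial results: {error}");
        }
    }
}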

Production Deployment Checklist

Use this as a pragmatic hardening pass before rollout:

  1. Pin explicit feature set (semantic, hybrid, full, etc.) and toolchain.
  2. Decide runtime mode per environment (fast_only in latency-critical paths, full two-tier where quality matters).
  3. Set FRANKENSEARCH_MODEL_DIR to a stable writable path with enough disk.
  4. Enable structured logs/metrics (FRANKENSEARCH_LOG) and capture phase timings.
  5. Run quality gates:
    • cargo fmt --check
    • cargo check --workspace --all-targets
    • cargo clippy --workspace --all-targets -- -D warnings
    • cargo test --workspace
  6. Run benchmark and quality harnesses on representative corpora before release.
  7. Validate degradation behavior by intentionally forcing quality timeout/failure (see the sketch after this list).
  8. For large corpora, evaluate ANN thresholding and memory budget explicitly.
  9. Keep reproducible artifacts for before/after tuning comparisons.
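
For item 7, one minimal shape such a check can take, reusing the quickstart API from earlier. This is a sketch: it assumes TwoTierConfig is Clone and exposes quality_timeout_ms as a public field, as the TOML recipe later in this README suggests:

use std::sync::Arc;

use frankensearch::{Embedder, HashEmbedder, TwoTierConfig, TwoTierIndex, TwoTierSearcher};

asupersync::test_utils::run_test_with_cx(|cx| async move {
    // A 1 ms budget all but guarantees the quality phase times out,
    // exercising the RefinementFailed path end to end.
    let mut config = TwoTierConfig::default();
    config.quality_timeout_ms = 1;

    let index = Arc::new(TwoTierIndex::open("./my_index", config.clone()).unwrap());
    let searcher = TwoTierSearcher::new(
        index,
        Arc::new(HashEmbedder::default_256()) as Arc<dyn Embedder>,
        config,
    );

    // Initial results must still arrive despite the forced timeout.
    let (results, _metrics) = searcher
        .search_collect(&cx, "ownership rules", 10)
        .await
        .expect("search should degrade gracefully, not fail");
    assert!(!results.is_empty());
});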

Crates.io Publishing (CI)

The publish-crates lane in .github/workflows/ci.yml is intentionally gated and opt-in.

Required setup:

  • Repository variable: ENABLE_CRATES_PUBLISH=true
  • Repository variable: CRATES_PUBLISH_SEQUENCE (space-separated crate names in dependency order)
  • Repository secret: CARGO_REGISTRY_TOKEN (crates.io publish token)

Behavior:

  • Runs only on stable v* tags (skips prerelease tags containing -).
  • Verifies tag/version alignment for each crate in CRATES_PUBLISH_SEQUENCE.
  • Runs per-crate cargo publish --dry-run checks before real publish.
  • Publishes crates sequentially to reduce crates.io index race failures.

Current boundary:

  • Crates with local-only fsqlite* path dependencies are not publishable to crates.io until those dependencies are available as versioned crates. Start CRATES_PUBLISH_SEQUENCE with publishable crates only.

Troubleshooting by Symptom

| Symptom | Likely Cause | What To Check |
|---|---|---|
| Initial results are slow | Candidate budget too high, cold cache, oversized corpus | Candidate multipliers, model cache warm-up, corpus and index size |
| No refined phase arrives | fast_only enabled, no quality embedder, timeout | FRANKENSEARCH_FAST_ONLY, quality model availability, quality_timeout_ms |
| RefinementFailed appears often | Quality model unstable/slow, timeout budget too tight | Model path/logs, timeout config, CPU contention |
| Results feel exact-match heavy | Lexical overweight via candidate mix, weak semantic model tier | Embedder stack selection, query class budgets, quality tier availability |
| Results feel semantically off | Embedding model mismatch for corpus domain | Switch model tier, tune blend weight, add rerank |
| Negation queries (-term) behave unexpectedly | Missing text provider in convenience path | Use search_collect_with_text/search(...) with text callback |
| Output parsing issues in automation | Wrong format for downstream parser | Use --format jsonl or --format toon consistently |
| High memory usage | Large index + quality/rerank/ANN enabled | Feature set, f16 defaults, ANN config, corpus scope |

Sequence Diagram (Mermaid)

sequenceDiagram
    autonumber
    participant U as User/Caller
    participant S as TwoTierSearcher
    participant C as Canonicalizer+Parser
    participant F as Fast Embedder
    participant L as Lexical Backend
    participant V as Vector Index
    participant R as RRF Fusion
    participant Q as Quality Embedder
    participant B as Blend Stage
    participant X as Optional Reranker

    U->>S: search(query, k)
    S->>C: canonicalize + classify + parse
    par Fast semantic path
        S->>F: embed(query)
        F-->>S: fast query vector
        S->>V: search_fast(top_k * multiplier)
        V-->>S: semantic candidates
    and Lexical path (if enabled)
        S->>L: BM25 search(top_k * multiplier)
        L-->>S: lexical candidates
    end

    S->>R: fuse(lexical, semantic, rrf_k)
    R-->>S: initial ranked results
    S-->>U: SearchPhase::Initial

    alt fast_only OR no quality embedder
        S-->>U: done (initial only)
    else quality refinement enabled
        S->>Q: embed(query) with timeout
        alt quality success
            Q-->>S: quality query vector
            S->>V: quality_scores_for_indices(...)
            V-->>S: quality scores
            S->>B: blend(quality_weight)
            opt rerank feature enabled
                S->>X: rerank(top_n)
                X-->>S: reranked results
            end
            S-->>U: SearchPhase::Refined
        else quality timeout/failure
            S-->>U: SearchPhase::RefinementFailed
        end
    end

Feature-Flag Decision Table

These are crate feature flags from frankensearch/Cargo.toml:

| Goal | Recommended Feature Set | Why |
|---|---|---|
| Fastest dev loop / CI smoke checks | default (hash) | Zero model downloads, minimal deps |
| Better semantic quality without lexical | semantic | Enables hash + model2vec + fastembed |
| Hybrid retrieval (semantic + BM25) | hybrid | Adds lexical precision on top of semantic recall |
| Persistent local indexing | persistent | hybrid + storage for durable metadata/queues |
| Durable + self-healing stack | durable | persistent + durability |
| Full capability surface | full | durable + rerank + ann + download |
| Full stack + FTS5 storage backend | full-fts5 | full + fts5 for advanced local SQL FTS paths |

Quick examples:

# Hybrid local search library build
cargo build -p frankensearch --features hybrid

# Full stack with ANN + rerank + download
cargo build -p frankensearch --features full

# Full stack plus FTS5
cargo build -p frankensearch --features full-fts5

Config Recipes

1) Latency-First Profile

Best for interactive UX where fast first answer matters most.

export FRANKENSEARCH_FAST_ONLY=true
export FRANKENSEARCH_QUALITY_WEIGHT=0.7
export FRANKENSEARCH_RRF_K=60
export FRANKENSEARCH_QUALITY_TIMEOUT=250

Operational effect:

  • returns Initial quickly and skips/limits expensive refinement behavior
  • predictable low-latency tail for chat/assistant loops

2) Quality-First Profile

Best for offline analysis, report generation, or high-precision ranking.

export FRANKENSEARCH_FAST_ONLY=false
export FRANKENSEARCH_QUALITY_WEIGHT=0.85
export FRANKENSEARCH_RRF_K=40
export FRANKENSEARCH_QUALITY_TIMEOUT=1200

Operational effect:

  • gives quality tier and rerank more room to improve ordering
  • increases median and tail latency

3) Memory/CPU-Conservative Profile

Best for constrained laptops or multi-tenant CI hosts.

export FRANKENSEARCH_FAST_ONLY=true
export FRANKENSEARCH_QUALITY_TIMEOUT=200
export FRANKENSEARCH_HNSW_THRESHOLD=200000

Operational effect:

  • reduces expensive quality-phase work
  • avoids ANN build/search overhead for smaller corpora

Optional: TwoTierConfig::optimized() TOML Recipe

For library consumers using TwoTierConfig::optimized(), place a file at data/optimized_params.toml:

quality_weight = 0.8
rrf_k = 50.0
candidate_multiplier = 4
quality_timeout_ms = 800
fast_only = false
explain = false
hnsw_ef_search = 100
hnsw_ef_construction = 200
hnsw_m = 16
hnsw_threshold = 50000
mrl_search_dims = 0
mrl_rescore_top_k = 30

Use this when you want deterministic, checked-in tuning presets instead of host-specific env var drift.
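
Loading the preset is then a single call (sketch; confirm the exact signature and missing-file fallback semantics of optimized() in crates/frankensearch-core/src/config.rs):

use frankensearch::TwoTierConfig;

fn main() {
    // Picks up data/optimized_params.toml as a checked-in tuning preset.
    let config = TwoTierConfig::optimized();
    println!("quality_weight={}", config.quality_weight);
}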

Reference Appendix

Key Source Files

| Area | Source File | Purpose |
|---|---|---|
| Facade crate | frankensearch/src/lib.rs | Top-level public API surface and re-exports |
| Index build workflow | frankensearch/src/index_builder.rs | High-level corpus-to-index pipeline |
| Progressive orchestration | crates/frankensearch-fusion/src/searcher.rs | Phase 1/2 flow, fallback paths, telemetry |
| Rank fusion | crates/frankensearch-fusion/src/rrf.rs | Reciprocal Rank Fusion implementation |
| Two-tier blending | crates/frankensearch-fusion/src/blend.rs | Fast/quality score normalization and blending |
| Two-tier index wrapper | crates/frankensearch-index/src/two_tier.rs | Fast/quality index alignment and lookup |
| Top-k vector search | crates/frankensearch-index/src/search.rs | Heap-based top-k selection and scoring paths |
| On-disk vector format | crates/frankensearch-index/src/format.rs | FSVI format, quantization, mmap reads |
| Core config knobs | crates/frankensearch-core/src/config.rs | TwoTierConfig, defaults, env overrides |
| Core result types | crates/frankensearch-core/src/types.rs | SearchPhase, ScoredResult, hit structs |
| Query classification | crates/frankensearch-core/src/query_class.rs | Query-type detection and adaptive budgets |
| Eval/statistics | crates/frankensearch-core/src/metrics_eval.rs | nDCG/MRR/Recall/MAP + bootstrap helpers |
| Embedder auto-detect | crates/frankensearch-embed/src/auto_detect.rs | Fast/quality model discovery and stack setup |
| Storage ingest/queue | crates/frankensearch-storage/src/pipeline.rs | Storage-backed ingestion, queue processing, embedding sinks |
| Durability repair layer | crates/frankensearch-durability/src/fsvi_protector.rs | Protect/verify/repair flows for vector artifacts |
| FSFS CLI entry | crates/frankensearch-fsfs/src/lib.rs | Standalone CLI product wiring |
| FSFS runtime orchestration | crates/frankensearch-fsfs/src/runtime.rs | Command dispatch, search/index execution, stream emission |
| Shared TUI shell | crates/frankensearch-tui/src/shell.rs | Reusable shell loop/navigation/overlay plumbing |
| Ops telemetry storage | crates/frankensearch-ops/src/storage.rs | Control-plane telemetry persistence/materialization |

Glossary

| Term | Meaning |
|---|---|
| Two-tier search | Progressive retrieval: fast initial pass, quality refinement pass |
| Phase 1 / Initial | First emitted result set, optimized for low latency |
| Phase 2 / Refined | Optional improved ranking after quality embedding |
| RefinementFailed | Graceful degradation event when Phase 2 errors/times out |
| RRF | Reciprocal Rank Fusion combining lexical + semantic rank lists |
| BM25 | Lexical ranking function used by Tantivy backend |
| FSVI | On-disk vector index format used by frankensearch-index |
| f16 quantization | Half-precision storage mode reducing memory footprint |
| TwoTierIndex | Wrapper over fast and optional quality vector indexes |
| TwoTierSearcher | Main orchestrator that runs retrieval/fusion/refinement |
| TwoTierConfig | Primary tuning config for latency/quality behavior |
| TwoTierMetrics | Per-search diagnostics (phase timings, candidate counts, skip reason) |
| EmbedderStack | Paired fast + optional quality embedder selection object |
| Cx | asupersync capability context passed into async operations |

Knob Impact Matrix

| Knob | Where Set | Primary Impact | Increase Tends To | Decrease Tends To |
|---|---|---|---|---|
| quality_weight | TwoTierConfig, FRANKENSEARCH_QUALITY_WEIGHT | Blend balance | Favor quality-tier ranking signal | Favor fast-tier ranking signal |
| rrf_k | TwoTierConfig, FRANKENSEARCH_RRF_K | RRF rank sensitivity | Flatten rank differences across sources | Emphasize top ranks more strongly |
| candidate_multiplier | TwoTierConfig | Candidate pool size | Improve recall headroom, increase latency/work | Reduce latency/work, may reduce recall |
| quality_timeout_ms | TwoTierConfig, FRANKENSEARCH_QUALITY_TIMEOUT | Phase 2 budget | More chances to finish refinement | More RefinementFailed timeouts |
| fast_only | TwoTierConfig, FRANKENSEARCH_FAST_ONLY | Phase behavior | Skip Phase 2 entirely (true) | Enable Phase 2 when quality embedder exists (false) |
| hnsw_threshold | TwoTierConfig, FRANKENSEARCH_HNSW_THRESHOLD | ANN activation point | Use brute-force for more corpus sizes | Use ANN earlier for large corpora |
| hnsw_ef_search | TwoTierConfig | ANN query beam width | Better ANN recall, more latency | Lower latency, potentially lower recall |
| mrl_search_dims | TwoTierConfig | MRL scan dimensionality | Better first-pass quality, more compute | Faster first-pass, potentially less quality |
| mrl_rescore_top_k | TwoTierConfig | Full-dim rescore scope | Better refined ordering, more compute | Less compute, potentially weaker refinement |
| lexical feature | Cargo feature | Hybrid retrieval capability | Better exact-match precision and fallback paths | Semantic-only behavior |
| rerank feature | Cargo feature | Cross-encoder rerank | Better top-result precision, higher latency | Lower latency, less fine-grained top ordering |
| ann feature | Cargo feature | Approximate nearest-neighbor path | Better scale behavior at large corpus sizes | Simpler exact brute-force behavior |

Why Not Just grep/ripgrep/ctags?

grep/ripgrep/ctags are excellent for exact text and symbol lookup. frankensearch solves a different problem: semantic intent search over mixed corpora.

| Tool | Strong At | Limitation vs frankensearch |
|---|---|---|
| grep | Exact substrings | No semantic similarity |
| ripgrep | Very fast regex search | No embedding-based recall |
| ctags | Symbol navigation | Not document-level semantic ranking |
| frankensearch/fsfs | Hybrid semantic + lexical, progressive refinement | Higher complexity/runtime footprint |

Use both: keep rg for exact matches and use fsfs for intent-level retrieval.

FAQ

Does it run fully local?

Yes. Search/indexing runs on your machine. Network access is only needed for optional alternate-model downloads and update checks.

Can I use only the library and skip fsfs?

Yes. Add frankensearch as a dependency and wire your own app/runtime.

What if the quality model is unavailable?

Search still works using fast-tier and lexical paths; you get RefinementFailed or fast-only behavior.

Which output format should agents use?

Use jsonl for streaming automation and toon if your downstream stack expects TOON semantics.

Is this tied to Tokio?

No. Async/concurrency is built around asupersync and Cx.

Contributing

Project policy is to not merge external PRs directly, but issues and PRs are still valuable for bug reports and for clarifying proposals.

If you are working inside this repository as an internal/automation agent:

cargo fmt --check
cargo check --workspace --all-targets
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

Useful docs:

  • AGENTS.md
  • docs/e2e-artifact-contract.md
  • docs/dependency-semantics-policy.md

License

MIT License (with OpenAI/Anthropic Rider). See LICENSE.
