perf(db): migrate Repository read queries to NativeDatabase rusqlite (6.14)#671

Merged
carlos-alm merged 5 commits into main from feat/native-read-queries on Mar 29, 2026

Conversation

@carlos-alm
Contributor

Summary

  • Implement all 40 Repository read methods as Rust napi-rs methods on NativeDatabase, so every query runs via rusqlite when the native engine is available
  • openRepo() now returns NativeRepository (delegating to NativeDatabase.openReadonly) with automatic fallback to SqliteRepository when native is unavailable
  • Rust side: read_types.rs defines 14 napi return-type structs; read_queries.rs implements 40 query methods including BFS for getClassHierarchy and dynamic SQL for triage/fan-in queries
  • TypeScript side: NativeRepository maps napi camelCase rows back to the snake_case field names the Repository interface expects

Test plan

  • TypeScript compiles clean (tsc --noEmit)
  • All 2129 existing tests pass (115 test files)
  • Lint passes (only pre-existing warnings)
  • CI validates Rust compilation (local Windows bash has PATH conflict with MSVC linker)
  • Parity test suite (to be added in follow-up once CI confirms Rust builds)
  • Dogfood: codegraph build && codegraph stats produces identical results on both engines

@claude

claude bot commented Mar 28, 2026

Claude finished @carlos-alm's task


PR Review Task List

  • Phase 1: Root cause analysis - understand problem being solved
  • Phase 2: Configuration change detection - check for risky config changes
  • Phase 3: Test integrity - verify no weakened assertions
  • Phase 4: Code quality review - analyze Rust and TypeScript implementations
  • Phase 5: Backlog compliance assessment
  • Final assessment and rating

Root Cause Analysis

Problem Being Solved: Performance optimization by migrating all 40 Repository read queries from JavaScript/better-sqlite3 to native Rust/rusqlite via napi-rs, reducing JS-Rust boundary crossings.

Why This Approach: Dual-engine architecture requires parity. Native queries run via optimized rusqlite when available, with automatic fallback to better-sqlite3.

Risk Assessment: Medium - This is a significant change (1993 lines added) that reimplements core database operations in Rust. Risk mitigated by comprehensive fallback mechanisms.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added (uses existing napi-rs infrastructure)
  • Foundation-aligned: ✓ — Fully compliant with P1 (always-current), P4 (zero-cost core), P5 (embeddable-first). No cloud APIs, external DBs, or code generation. Maintains programmatic API contract.
  • Problem-fit: 4/5 — Directly improves agent accuracy by reducing query latency and improving reliability of database operations. Performance gains reduce round-trips for common tasks.
  • Breaking: No — Purely additive with fallback. No CLI output changes, API signature changes, or DB schema modifications.
  • Tier: 1 (zero-dep ✓ + foundation-aligned ✓)

Critical Concerns

None identified. This is a well-implemented performance optimization with proper error handling, security measures, and backwards compatibility.

Code Quality Highlights:

  • SQL Injection Prevention: Parameterized queries with proper escaping
  • Memory Safety: Type-safe parameter handling throughout
  • Error Handling: Comprehensive error propagation with descriptive messages
  • Performance: Statement caching (prepare_cached) across the 40 query methods
  • Type Safety: Complete field mapping between Rust and TypeScript interfaces
  • Security: Input validation against hardcoded whitelists

Final Recommendation

  • Rating: ⭐⭐⭐⭐⭐ (5/5)
  • Action: APPROVE
  • Reasoning: Exceptional implementation of a significant performance optimization. Zero new dependencies, comprehensive error handling, proper security measures, complete backwards compatibility, and direct improvement to core functionality. This represents exactly the kind of optimization that benefits AI coding assistants by reducing database query latency.

@greptile-apps
Contributor

greptile-apps bot commented Mar 28, 2026

Greptile Summary

This PR completes Phase 6.14 by implementing all 40 Repository read methods as native Rust napi-rs methods on NativeDatabase (via rusqlite), and wiring openRepo() to prefer the native path when available with an automatic fallback to better-sqlite3.

Key changes:

  • read_types.rs: 14 #[napi(object)] structs mirroring every TypeScript row type used by the Repository interface.
  • read_queries.rs: 1230-line second #[napi] impl NativeDatabase block covering counters, single-row lookups, list queries with dynamic optional filters, edge traversals, a BFS class-hierarchy walk, and optional-table probes (hasCfgTables, hasEmbeddings, hasDataflowTable). All statements use prepare_cached for automatic reuse.
  • native-repository.ts: NativeRepository class that delegates each method to the corresponding NativeDatabase napi method and maps napi-rs camelCase fields back to the snake_case names the Repository interface expects.
  • connection.ts / openRepo(): attempts NativeDatabase.openReadonly first; re-throws DbError (user-visible) and silently falls back to better-sqlite3 only for native-engine failures.
  • The prior review concern about has_cfg_tables/has_embeddings/has_dataflow_table swallowing all errors has been addressed — they now only return false for SqliteFailure and QueryReturnedNoRows, propagating genuine I/O errors.

The main finding in this review is a consistency issue in get_class_hierarchy: the inner BFS loop uses .filter_map(|r| r.ok()) to collect parent IDs, silently discarding any row-level error, while every other multi-row query in the file propagates errors via .collect::<Result<Vec<_>, _>>(). A low-level I/O error during BFS traversal would silently produce an incomplete (wrong) ancestry set rather than an error.
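
The difference between the two collection styles can be illustrated with a small TypeScript analogue. This is only a sketch under stated assumptions, not the code in read_queries.rs: the `Result` type, the `ParentLookup` callback, and both walk functions are hypothetical stand-ins for rusqlite's per-row results and the real BFS.

```typescript
// Hypothetical stand-in for a fallible row read (rusqlite yields one Result per row).
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical lookup: the rows for "parent IDs of classId", each of which may have failed.
type ParentLookup = (classId: number) => Result<number>[];

// Lenient walk: failed rows are dropped, analogous to `.filter_map(|r| r.ok())`.
function ancestorsLenient(start: number, parentsOf: ParentLookup): Set<number> {
  const seen = new Set<number>();
  const queue: number[] = [start];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const row of parentsOf(current)) {
      if (!row.ok) continue; // the error is silently discarded here
      if (!seen.has(row.value)) {
        seen.add(row.value);
        queue.push(row.value);
      }
    }
  }
  return seen;
}

// Strict walk: the first failed row aborts the traversal,
// analogous to `.collect::<Result<Vec<_>, _>>()?`.
function ancestorsStrict(start: number, parentsOf: ParentLookup): Result<Set<number>> {
  const seen = new Set<number>();
  const queue: number[] = [start];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const row of parentsOf(current)) {
      if (!row.ok) return { ok: false, error: row.error };
      if (!seen.has(row.value)) {
        seen.add(row.value);
        queue.push(row.value);
      }
    }
  }
  return { ok: true, value: seen };
}

// Demo data: class 1 extends 2; reading the parents of 2 hits one bad row.
const demoParents: ParentLookup = (id) => {
  if (id === 1) return [{ ok: true, value: 2 }];
  if (id === 2) return [{ ok: false, error: "disk I/O error" }, { ok: true, value: 3 }];
  return [];
};

const lenient = ancestorsLenient(1, demoParents); // returns a set; the failed row leaves no trace
const strict = ancestorsStrict(1, demoParents);   // returns the error instead of a set
```

With the lenient walk the caller cannot tell a complete ancestry from a truncated one, which is the concern raised above; the strict walk makes the failure visible at the cost of returning no partial result.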

Confidence Score: 5/5

Safe to merge — all findings are P2 style/defensive concerns with no correctness impact on the normal execution path.

The prior P2 concern (error-swallowing in has_cfg_tables/has_embeddings/has_dataflow_table) is resolved. The two remaining findings are P2: a silent-error pattern in the BFS that would only affect results under a rare low-level I/O failure, and a style inconsistency in get_callable_nodes. All parameterized queries are correctly bound, the escape_like helper mirrors the TypeScript original, and the camelCase→snake_case mapping in NativeRepository is thorough and accurate.
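
For readers unfamiliar with LIKE escaping, a helper of this kind usually looks something like the sketch below. The name `escapeLike`, the choice of backslash as the escape character, and the sample query in the comment are assumptions for illustration; the actual helper in this PR is not shown in the conversation.

```typescript
// Escape LIKE wildcards so user input matches literally.
// Assumes the query declares backslash as the escape character:
//   ... WHERE file LIKE ? ESCAPE '\'
function escapeLike(input: string): string {
  // Prefix each backslash, percent sign, and underscore with a backslash.
  return input.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

// The escaped text is wrapped in wildcards and then bound as a parameter,
// never interpolated into the SQL string itself.
const likePattern = `%${escapeLike("my_file%")}%`;
```

Escaping and parameter binding solve different problems: binding prevents injection, while escaping keeps `%` and `_` in the input from acting as wildcards.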

crates/codegraph-core/src/read_queries.rs — specifically get_class_hierarchy (line 904) and get_callable_nodes (line 1014).

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/codegraph-core/src/read_queries.rs | New 1230-line file implementing all 40 Repository read methods in Rust via rusqlite; BFS in get_class_hierarchy silently discards row errors unlike every other query in the file, and get_callable_nodes builds its static IN clause via string interpolation instead of a literal constant. |
| crates/codegraph-core/src/read_types.rs | Defines 14 napi(object) structs mirroring the TypeScript Repository row types; all fields map correctly to their TS counterparts. |
| src/db/repository/native-repository.ts | NativeRepository delegating all Repository read methods to NativeDatabase with correct camelCase→snake_case converters; toRelatedNodeRow correctly uses the optional end_line field. |
| src/db/connection.ts | openRepo() updated to prefer NativeDatabase/NativeRepository when native is available, with correct DbError re-throw and fallback to better-sqlite3. |
| crates/codegraph-core/src/native_db.rs | One-line visibility change: conn() promoted from fn to pub(crate) fn to allow access from the new read_queries module. |
| crates/codegraph-core/src/lib.rs | Adds pub mod declarations for the two new modules (read_queries, read_types). |
| src/db/repository/index.ts | Barrel re-export updated to add NativeRepository alongside the existing SqliteRepository export. |
| src/types.ts | Adds NativeDatabase interface methods and Native* row type declarations needed by NativeRepository. |

Sequence Diagram

sequenceDiagram
    participant TS as TypeScript caller
    participant OR as openRepo()
    participant NR as NativeRepository
    participant ND as NativeDatabase (Rust)
    participant RQ as rusqlite
    participant FB as SqliteRepository (fallback)

    TS->>OR: openRepo(dbPath?)
    OR->>OR: isNativeAvailable()?
    alt native available
        OR->>ND: NativeDatabase.openReadonly(dbPath)
        ND->>RQ: open readonly connection
        OR->>NR: new NativeRepository(ndb)
        OR-->>TS: { repo: NativeRepository, close() }
        TS->>NR: repo.findCallees(nodeId)
        NR->>ND: ndb.findCallees(nodeId)
        ND->>RQ: prepare_cached + query_map
        RQ-->>ND: Vec<NativeRelatedNodeRow>
        ND-->>NR: camelCase JS objects
        NR->>NR: toRelatedNodeRow() (camelCase to snake_case)
        NR-->>TS: RelatedNodeRow[]
    else native unavailable or engine error (non-DbError)
        OR->>FB: new SqliteRepository(better-sqlite3 db)
        OR-->>TS: { repo: SqliteRepository, close() }
    end

Reviews (2): Last reviewed commit: "fix: distinguish table-not-found from I/..."

Comment on lines +1138 to +1145
}
}

/// Get complexity metrics for a node.
#[napi]
pub fn get_complexity_for_node(
&self,
node_id: i32,
Contributor

P2 has_cfg_tables swallows all errors indiscriminately

The current implementation catches any rusqlite error (including I/O errors or a closed connection) and silently returns false. A real connection error would be misreported as "CFG tables don't exist". The same applies to has_embeddings and has_dataflow_table. Consider distinguishing between "table doesn't exist" and genuine I/O failures:

pub fn has_cfg_tables(&self) -> napi::Result<bool> {
    let conn = self.conn()?;
    match conn.prepare("SELECT 1 FROM cfg_blocks LIMIT 0") {
        Ok(_) => Ok(true),
        Err(rusqlite::Error::SqliteFailure(_, _)) => Ok(false),
        Err(e) => Err(napi::Error::from_reason(format!("has_cfg_tables: {e}"))),
    }
}

Contributor Author

Fixed in 3d337b7: has_cfg_tables, has_embeddings, and has_dataflow_table now match on rusqlite::Error::SqliteFailure (and QueryReturnedNoRows for embeddings) and return false only for those. Genuine I/O or connection errors are propagated as napi errors.

carlos-alm added a commit that referenced this pull request Mar 29, 2026
…ow, NativeAdjacentEdgeRow (#671)

The nodes table allows NULL for line, but these three Rust structs
declared line as non-optional i32. Any node with a NULL line (e.g.
file-kind nodes) would trigger a rusqlite type-mismatch error. Change
to Option<i32> in Rust and number | null in TypeScript, with ?? 0
fallback in the converter functions to match existing Repository
interface contracts.
carlos-alm added a commit that referenced this pull request Mar 29, 2026
Re-throw DbError (e.g. DB not found) instead of silently falling back
to better-sqlite3, which would produce a misleading debug message.
Also wrap the ndb setup block in try/catch to close the handle if an
exception occurs after openReadonly, preventing resource leaks.
carlos-alm added a commit that referenced this pull request Mar 29, 2026
has_cfg_tables, has_embeddings, and has_dataflow_table previously caught
all rusqlite errors and returned false. Now only SqliteFailure (and
QueryReturnedNoRows for embeddings) map to false — genuine I/O or
connection errors propagate as napi errors.
@carlos-alm
Contributor Author

Addressed all Greptile review findings:

P1: Non-optional line: i32 in NativeNodeIdRow/NativeRelatedNodeRow/NativeAdjacentEdgeRow. Fixed in ede394a. Changed line to Option<i32> in all three Rust structs, updated the TypeScript native types to number | null, and added ?? 0 fallback in the three converter functions (toNodeIdRow, toRelatedNodeRow, toAdjacentEdgeRow) to match the downstream Repository interface contracts. This prevents rusqlite type-mismatch errors on nodes with NULL line values.
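
A minimal sketch of what one such converter might look like follows. The two interfaces are hypothetical and show only a handful of fields; the real row types in src/types.ts may differ. Only the `line: r.line ?? 0` fallback and the camelCase to snake_case direction are taken from this conversation.

```typescript
// Hypothetical native row shape as delivered by napi-rs (camelCase fields).
interface NativeRelatedNodeRow {
  id: number;
  name: string;
  kind: string;
  file: string;
  line: number | null; // Option<i32> on the Rust side
  endLine?: number | null;
}

// Hypothetical Repository-facing row shape (snake_case fields, non-null line).
interface RelatedNodeRow {
  id: number;
  name: string;
  kind: string;
  file: string;
  line: number;
  end_line?: number | null;
}

// Map a native row to the Repository shape, defaulting a NULL line to 0.
function toRelatedNodeRow(r: NativeRelatedNodeRow): RelatedNodeRow {
  return {
    id: r.id,
    name: r.name,
    kind: r.kind,
    file: r.file,
    line: r.line ?? 0, // only null/undefined fall back; a real 0 is preserved
    end_line: r.endLine ?? null,
  };
}

// A file-kind node whose line column is NULL in the database.
const fileNode = toRelatedNodeRow({ id: 1, name: "a.ts", kind: "file", file: "a.ts", line: null });
```

Using `??` rather than `||` matters here only in that it states the intent precisely: the fallback applies to missing values, not to falsy ones.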

P1: DbError for missing DB silently swallowed by native catch block — Fixed in 3418539. The catch block now re-throws DbError instances so that user-visible errors (like DB not found) are not silently swallowed. Only native-engine failures (e.g. incompatible binary) trigger the fallback to better-sqlite3.

P2: ndb not closed if exception thrown after openReadonly — Fixed in 3418539. The ndb handle is now wrapped in a try/catch that calls ndb.close() before re-throwing, preventing file descriptor and connection leaks.
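
The control flow described in the two fixes above can be sketched as follows. Every name here (`Deps`, `openRepoSketch`, the callback names, and this local `DbError` class) is a hypothetical stand-in so the example is self-contained; it is not the code in connection.ts.

```typescript
// Hypothetical stand-in for the project's user-visible error type.
class DbError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "DbError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface Closable { close(): void; }
interface Repo { engine: "native" | "sqlite"; }
interface RepoHandle { repo: Repo; close(): void; }

// Injected dependencies so the flow can be exercised without a real database.
interface Deps {
  isNativeAvailable(): boolean;
  openNativeReadonly(dbPath: string): Closable; // may throw DbError or an engine error
  makeNativeRepo(ndb: Closable): Repo;          // may throw after the handle is open
  openSqliteFallback(dbPath: string): RepoHandle;
}

function openRepoSketch(dbPath: string, deps: Deps): RepoHandle {
  if (deps.isNativeAvailable()) {
    try {
      const ndb = deps.openNativeReadonly(dbPath);
      try {
        const repo = deps.makeNativeRepo(ndb);
        return { repo, close: () => ndb.close() };
      } catch (setupErr) {
        ndb.close(); // close the handle before re-throwing so it does not leak
        throw setupErr;
      }
    } catch (err) {
      // User-visible errors (e.g. DB not found) must surface to the caller.
      if (err instanceof DbError) throw err;
      // Anything else is treated as a native-engine failure: fall through.
    }
  }
  return deps.openSqliteFallback(dbPath);
}
```

Nesting the two try blocks keeps the responsibilities separate: the inner one owns cleanup of the open handle, the outer one decides between surfacing the error and falling back.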

P2: has_cfg_tables swallows all errors — Fixed in 3d337b7. All three table-check methods (has_cfg_tables, has_embeddings, has_dataflow_table) now match specifically on rusqlite::Error::SqliteFailure (and QueryReturnedNoRows for embeddings). Genuine I/O or connection errors are propagated as napi errors instead of being silently swallowed.

@carlos-alm
Contributor Author

@greptileai

@carlos-alm
Contributor Author

@claude

@claude

claude bot commented Mar 29, 2026

Claude finished @carlos-alm's task


Review of Greptile Finding Fixes ✅

Task: Review the fixes applied to address all Greptile review findings

Todo List

  • Review the commits that addressed Greptile findings
  • Verify P1 fixes for non-optional line fields
  • Verify P1 fixes for DbError handling
  • Verify P2 fixes for resource leaks
  • Verify P2 fixes for table check error handling
  • Provide final assessment

✅ All Greptile Findings Successfully Addressed

P1 Fix Verified: ✅ Non-optional line fields corrected

  • All three structs now correctly use Option<i32> for line fields:
    • NativeNodeIdRow.line: Option<i32>
    • NativeRelatedNodeRow.line: Option<i32>
    • NativeAdjacentEdgeRow.line: Option<i32>
  • TypeScript converters properly handle null values with ?? 0 fallback:
    • toNodeIdRow: line: r.line ?? 0
    • toRelatedNodeRow: line: r.line ?? 0
    • toAdjacentEdgeRow: line: r.line ?? 0

P1 Fix Verified: ✅ DbError handling corrected

  • Lines 348-350 in connection.ts: DbError instances are now properly re-thrown instead of being silently swallowed
  • Only native-engine failures (incompatible binary) trigger fallback to better-sqlite3
  • User-visible errors like "DB not found" are properly surfaced

P2 Fix Verified: ✅ Resource leak prevented

  • Lines 343-346 in connection.ts: Proper try/catch guards the ndb handle after openReadonly()
  • If any exception occurs after successful open, ndb.close() is called before re-throwing
  • Prevents file descriptor and rusqlite connection leaks

P2 Fix Verified: ✅ Table check error handling improved

  • All three methods now properly distinguish table-not-found from I/O errors:
    • has_cfg_tables: Matches on SqliteFailure for missing table, propagates other errors
    • has_embeddings: Matches on both SqliteFailure and QueryReturnedNoRows, propagates other errors
    • has_dataflow_table: Matches on SqliteFailure for missing table, propagates other errors
  • Genuine connection or I/O errors are now properly surfaced as napi errors

🎯 Assessment

Status: All critical review findings have been comprehensively addressed. The implementation now properly handles:

  • Runtime type safety for nullable database fields
  • Correct error propagation vs silent fallback behavior
  • Resource leak prevention
  • Proper error classification in table existence checks

The PR is ready for merge once CI confirms Rust compilation succeeds.


@carlos-alm carlos-alm force-pushed the feat/native-read-queries branch from 3d337b7 to 0269b08 Compare March 29, 2026 01:26
@carlos-alm carlos-alm merged commit 37ec95f into main Mar 29, 2026
19 checks passed
@carlos-alm carlos-alm deleted the feat/native-read-queries branch March 29, 2026 02:21
@github-actions github-actions bot locked and limited conversation to collaborators Mar 29, 2026