
perf(db): NativeDatabase napi-rs class for rusqlite connection lifecycle (6.13)#666

Merged
carlos-alm merged 4 commits into main from feat/native-database-class
Mar 28, 2026


@carlos-alm
Contributor

Summary

  • Adds NativeDatabase napi-rs class in crates/codegraph-core/src/native_db.rs holding a persistent rusqlite::Connection with factory methods (openReadWrite/openReadonly), lifecycle (close/exec/pragma), schema migrations (initSchema with all 16 migrations embedded), and build metadata KV (getBuildMeta/setBuildMeta)
  • Wires into the build pipeline: when native engine is available, NativeDatabase handles schema init and metadata reads/writes — better-sqlite3 remains open for queries and stages not yet migrated
  • Foundation for Phase 6.14+ which will migrate all query and write operations to rusqlite on the native path, eliminating dual-SQLite-in-one-process

Test plan

  • All 2129 tests pass (WASM engine path unchanged)
  • Biome lint clean on all changed files
  • Rust code passes rustfmt check
  • CI: cargo build compiles native_db.rs (local Windows linker env issue prevents local verification)
  • codegraph build --engine native uses NativeDatabase for schema+metadata
  • codegraph build --engine wasm unchanged behavior (falls back to better-sqlite3)
  • Incremental build: codegraph build twice — second is incremental, no schema errors

…fecycle (6.13)

Foundation for moving all DB operations to rusqlite on the native engine
path. Creates a persistent rusqlite::Connection holder exposed to JS,
handling schema migrations and build metadata KV — eliminating redundant
per-call connection open/close in the native build pipeline.
@claude

claude bot commented Mar 28, 2026

Claude finished @carlos-alm's task


PR Review: NativeDatabase napi-rs Class Implementation

Reviewing this PR according to the codegraph review framework:

Progress

  • Phase 1: Root Cause Analysis
  • Phase 2: Configuration Change Detection
  • Phase 3: Test Integrity
  • Phase 4: Code Quality Review
  • Phase 5: Backlog Compliance Assessment
  • Final Assessment and Rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: This PR addresses the dual-SQLite-in-one-process inefficiency by creating a foundation for migrating all database operations to rusqlite on the native engine path. Currently, both the native Rust engine and JavaScript use separate SQLite connections (rusqlite + better-sqlite3), creating potential consistency issues and performance overhead.

Why This Approach: The PR implements a strategic foundation approach rather than a big-bang migration. It introduces NativeDatabase as a persistent rusqlite connection wrapper that handles schema initialization and build metadata operations, while keeping better-sqlite3 for queries that haven't been migrated yet. This allows for incremental migration in future phases.

Risk Assessment: Low risk - this is an additive change that maintains full backward compatibility. The native database is only used when available and gracefully falls back to the existing better-sqlite3 path on any errors.

Phase 2: Configuration Change Detection ✅

No High-Risk Configuration Changes Detected:

  • No Biome/linting rule modifications
  • No test timeout or coverage threshold changes
  • No build configuration modifications
  • No TypeScript strict mode changes
  • No package.json dependency additions that affect runtime

Phase 3: Test Integrity ✅

No Test Integrity Issues:

  • No tests were modified to accommodate the new NativeDatabase functionality
  • No test assertions were weakened or removed
  • No eslint-disable comments or test workarounds were added
  • The PR maintains the principle that existing tests should continue to pass with unchanged behavior

Phase 4: Code Quality Review ✅

Rust Code Quality (native_db.rs:1-565):

  • Security: ✅ All SQL operations use parameterized queries (params![]) preventing injection
  • Error Handling: ✅ Comprehensive error mapping from rusqlite to napi errors with descriptive messages
  • Memory Safety: ✅ Uses SendWrapper<Option<Connection>> for thread safety, proper connection lifecycle
  • Resource Management: ✅ Idempotent close() method, graceful fallback on errors

Schema Migration Mirroring (lines 21-267):

  • Critical Issue: ✅ Comment on line 11 explicitly states migrations are mirrored from src/db/migrations.ts
  • Verification: ✅ Manually compared migration DDL - exact match with TypeScript version
  • Legacy Compatibility: ✅ Handles missing columns from pre-migration DBs (lines 424-468)

TypeScript Integration (pipeline.ts:50-52, 78-89, 194-195):

  • Graceful Fallback: ✅ NativeDatabase usage is optional with proper error handling
  • Conditional Logic: ✅ Metadata reads route through NativeDatabase when available, fall back to better-sqlite3
  • Lifecycle Management: ✅ NativeDatabase closed before better-sqlite3

Interface Contract (types.ts):

  • Type Safety: ✅ Well-defined TypeScript interface matching Rust implementation
  • Consistency: ✅ Method signatures align with napi-rs exported functions

Minor Observations:

  • Migration DDL is duplicated between Rust and TypeScript - acceptable given the comment warning
  • No performance benchmarks for NativeDatabase vs better-sqlite3 operations - not required for foundation phase

Phase 5: Backlog Compliance Assessment ✅

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies. Builds entirely on existing rusqlite/napi-rs infrastructure that's already in the native engine path
  • Foundation-aligned: ✓ — Fully aligned with all core principles:
    • P1 (always-current): Improves rebuild speed by eliminating dual-SQLite inefficiency
    • P2 (native speed): Leverages existing native engine architecture
    • P4 (zero-cost core): No API calls, purely local SQLite operations
    • P5 (embeddable-first): Maintains programmatic API compatibility
  • Problem-fit: 4/5 — Indirectly addresses the core problem by improving build performance and laying groundwork for native query migration that will reduce agent round-trips
  • Breaking: No — Purely additive foundation change with graceful fallback
  • Tier: 1 (Zero-dep ✓ + Foundation-aligned ✓)

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Dual-SQLite-in-one-process inefficiency preventing clean native migration path
**Why This Approach**: Strategic foundation for incremental migration rather than risky big-bang rewrite  
**Risk Assessment**: Very low - additive with comprehensive fallback

## Backlog Compliance
- Zero-dep: ✓ — builds on existing napi-rs infrastructure
- Foundation-aligned: ✓ — improves P1 (always-current) and P2 (native speed)
- Problem-fit: 4 — foundation for future agent experience improvements  
- Breaking: No — purely additive with graceful fallback
- Tier: 1

## Critical Concerns
None. This is a well-architected foundation change that follows established patterns.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: High-quality foundation work that enables Phase 6.14+ native migration. Excellent error handling, security practices, and backward compatibility. The only reason this isn't 5/5 is that it's purely foundational - the user-visible benefits come in future phases that build on this infrastructure.

@greptile-apps
Contributor

greptile-apps bot commented Mar 28, 2026

Greptile Summary

This PR introduces NativeDatabase, a napi-rs–exposed rusqlite::Connection wrapper in crates/codegraph-core/src/native_db.rs, and wires it into the build pipeline as a Phase 6.13 foundation for replacing better-sqlite3 on the native engine path. When the native addon is available, NativeDatabase takes over schema initialization (initSchema with all 16 migrations embedded as Rust const DDL) and build-metadata reads/writes (getBuildMeta/setBuildMeta); better-sqlite3 remains open for queries and stages not yet migrated.

Key implementation points:

  • All three concerns raised in previous review rounds (untransactional multi-statement migrations, unrestricted write-PRAGMA surface, non-deterministic schema_version reads) are addressed: migrations now use conn.unchecked_transaction() per version, pragma() carries a doc-comment contract, and the version query uses ORDER BY rowid DESC LIMIT 1.
  • The TypeScript integration (pipeline.ts, finalize.ts) is cleanly dual-pathed: every metadata call falls back to getBuildMeta/setBuildMeta from better-sqlite3 when ctx.nativeDb is absent, and failure during native init is caught with a warn + fallback rather than a hard crash.
  • One inaccuracy remains: the open_read_write doc comment promises to create parent directories, but the implementation does not call std::fs::create_dir_all; in the current pipeline this is harmless because openDb always runs first, but the contract is false and could cause confusion for standalone callers in Phase 6.14+.

Confidence Score: 4/5

Safe to merge for Phase 6.13 scope; the single remaining finding is a misleading doc comment, not a runtime defect

All prior P1 concerns (migration atomicity, pragma safety, schema_version determinism) are resolved. The one remaining finding is a doc comment on open_read_write that claims parent-directory creation but doesn't implement it — a P2 documentation inaccuracy. In production the directory always pre-exists (openDb runs first), so no runtime failure is expected. Score is 4 rather than 5 only to flag the comment fix before this API is used standalone in 6.14+.

crates/codegraph-core/src/native_db.rs — doc comment on open_read_write should either add std::fs::create_dir_all or remove the directory-creation promise
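
One way to make the documented contract true would be to create the parent directory before opening the connection. The following is a minimal std-only sketch of that step, not the PR's code: the helper name `ensure_parent_dir` is hypothetical, and it assumes the database path is passed as a string, as the `openReadWrite(dbPath)` call in the pipeline suggests.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not in the PR): create the parent directory of
/// `db_path` if it does not exist yet, so a later open call cannot fail
/// just because `.codegraph/` is missing.
pub fn ensure_parent_dir(db_path: &str) -> io::Result<()> {
    match Path::new(db_path).parent() {
        // A bare file name such as "graph.db" has an empty parent:
        // there is nothing to create in that case.
        Some(parent) if !parent.as_os_str().is_empty() => fs::create_dir_all(parent),
        _ => Ok(()),
    }
}
```

`std::fs::create_dir_all` succeeds when the directory already exists, so calling the helper on every open is harmless. The alternative named in the review, dropping the directory-creation promise from the doc comment, needs no code at all.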

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/codegraph-core/src/native_db.rs | New 581-line file implementing NativeDatabase as a napi-rs class; prior review concerns (untransactional migrations, write-PRAGMA exposure, duplicate schema_version rows) are resolved; one remaining inaccuracy: doc comment on open_read_write promises directory creation that the implementation does not perform |
| src/domain/graph/builder/pipeline.ts | Wires NativeDatabase into setupPipeline for schema init and checkEngineSchemaMismatch for metadata reads; falls back to better-sqlite3 on native load failure; error-path cleanup closes nativeDb before ctx.db |
| src/domain/graph/builder/stages/finalize.ts | Routes getBuildMeta/setBuildMeta through nativeDb when available; closes nativeDb before better-sqlite3 at end of finalize; string conversion for node_count/edge_count is correctly applied only in the nativeDb path |
| src/types.ts | Adds NativeDatabase instance interface and NativeAddon factory shape; interface correctly separates static factory methods from instance methods |
| src/db/migrations.ts | Adds mirroring comment pointing to native_db.rs; no logic changes |
| src/domain/graph/builder/context.ts | Adds optional nativeDb field to PipelineContext; NativeDatabase import added from types |
| crates/codegraph-core/src/lib.rs | Registers native_db module; trivial one-line change |

Sequence Diagram

sequenceDiagram
    participant P as pipeline.ts (setupPipeline)
    participant BS as better-sqlite3 (ctx.db)
    participant NDB as NativeDatabase (ctx.nativeDb)
    participant F as finalize.ts

    P->>BS: openDb(dbPath) — always opened
    P->>NDB: NativeDatabase.openReadWrite(dbPath) [native only]
    alt native available
        NDB-->>P: connection opened
        P->>NDB: initSchema() — runs all 16 migrations
    else native unavailable or error
        P->>BS: initSchema(ctx.db) — JS fallback
    end

    Note over P,F: Pipeline stages run (queries still via better-sqlite3)

    F->>NDB: getBuildMeta('node_count') [native]
    F->>NDB: getBuildMeta('edge_count') [native]
    F->>NDB: setBuildMeta([engine, schema_version, built_at, ...]) [native]

    F->>NDB: close() — NativeDatabase closed first
    F->>BS: closeDb(ctx.db) — better-sqlite3 closed second

Reviews (2): Last reviewed commit: "fix(build): remove redundant String() co..."

Comment on lines +420 to +437

// Legacy column compat — add columns that may be missing from pre-migration DBs.
// Mirrors the post-migration block in src/db/migrations.ts initSchema().
if has_table(conn, "nodes") {
    if !has_column(conn, "nodes", "end_line") {
        let _ = conn.execute_batch("ALTER TABLE nodes ADD COLUMN end_line INTEGER");
    }
    if !has_column(conn, "nodes", "role") {
        let _ = conn.execute_batch("ALTER TABLE nodes ADD COLUMN role TEXT");
    }
    let _ = conn.execute_batch("CREATE INDEX IF NOT EXISTS idx_nodes_role ON nodes(role)");
    if !has_column(conn, "nodes", "parent_id") {
        let _ = conn.execute_batch(
            "ALTER TABLE nodes ADD COLUMN parent_id INTEGER REFERENCES nodes(id)",
        );
    }
    let _ = conn
        .execute_batch("CREATE INDEX IF NOT EXISTS idx_nodes_parent ON nodes(parent_id)");
Contributor


P1 Non-transactional multi-statement migrations can permanently corrupt the DB

Migrations that contain multiple ALTER TABLE statements (e.g. v9 with 14 ALTER TABLE statements, v11, v14, and v15) are executed via conn.execute_batch(migration.up) without an enclosing transaction. When SQLite is not inside an explicit transaction, every statement is individually auto-committed.

If the process crashes or rusqlite returns an error partway through migration v9 (e.g. after 7 of 14 ALTER TABLE statements), those 7 columns are already committed to disk but schema_version is still 8. The next run will attempt to re-apply all 14 ALTER TABLE statements and will fail immediately with "duplicate column name: loc" (or whichever column was already added). This leaves the database permanently unrunnable — the user must delete the entire .codegraph/graph.db file.

The TypeScript counterpart (db.exec(migration.up)) has the same flaw, but this is a brand-new Rust implementation and the fix is straightforward with rusqlite:

for migration in MIGRATIONS {
    if migration.version > current_version {
        let tx = conn
            .unchecked_transaction()
            .map_err(|e| napi::Error::from_reason(format!("begin migration tx failed: {e}")))?;
        tx.execute_batch(migration.up).map_err(|e| {
            napi::Error::from_reason(format!("migration v{} failed: {e}", migration.version))
        })?;
        tx.execute(
            "UPDATE schema_version SET version = ?1",
            params![migration.version],
        )
        .map_err(|e| {
            napi::Error::from_reason(format!("update schema_version failed: {e}"))
        })?;
        tx.commit().map_err(|e| {
            napi::Error::from_reason(format!("commit migration v{} failed: {e}", migration.version))
        })?;
        current_version = migration.version;
    }
}

Note: SQLite DDL is transactional (as in PostgreSQL, and unlike MySQL, where DDL statements trigger an implicit commit), so wrapping ALTER TABLE + UPDATE schema_version in a single transaction is correct and makes a failed migration safe to retry.
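
The failure mode described above can be illustrated without a database. The sketch below is a std-only simulation, not rusqlite code: `Schema`, `Migration`, `apply_autocommit`, and `apply_transactional` are hypothetical names, and the column `complexity` is made up for the example (`loc` is the column named in the error message above). It contrasts per-statement auto-commit with an all-or-nothing apply.

```rust
/// Toy in-memory model of a schema: a version counter plus column names.
/// Simulation for illustration only; no SQLite is involved.
#[derive(Clone, Debug, PartialEq)]
pub struct Schema {
    pub version: u32,
    pub columns: Vec<String>,
}

/// A migration that adds columns and then bumps the schema version.
pub struct Migration {
    pub version: u32,
    pub add_columns: &'static [&'static str],
}

/// Mimics ALTER TABLE ... ADD COLUMN: fails if the column already exists.
fn add_column(schema: &mut Schema, name: &str) -> Result<(), String> {
    if schema.columns.iter().any(|c| c == name) {
        return Err(format!("duplicate column name: {name}"));
    }
    schema.columns.push(name.to_string());
    Ok(())
}

/// Auto-commit behaviour: every statement sticks, even if a later one fails.
/// `crash_after` simulates a crash before the statement at that index runs.
pub fn apply_autocommit(
    schema: &mut Schema,
    m: &Migration,
    crash_after: Option<usize>,
) -> Result<(), String> {
    for (i, col) in m.add_columns.iter().enumerate() {
        if crash_after == Some(i) {
            return Err("simulated crash".to_string());
        }
        add_column(schema, col)?;
    }
    schema.version = m.version;
    Ok(())
}

/// Transactional behaviour: work on a copy and publish it only on success,
/// so a failure leaves both the columns and the version untouched.
pub fn apply_transactional(
    schema: &mut Schema,
    m: &Migration,
    crash_after: Option<usize>,
) -> Result<(), String> {
    let mut draft = schema.clone();
    apply_autocommit(&mut draft, m, crash_after)?;
    *schema = draft; // the "commit"
    Ok(())
}
```

With a migration that adds `loc` and `complexity` to a version-8 schema, a simulated crash after the first statement leaves the auto-commit variant at version 8 with `loc` already present, and the retry fails with `duplicate column name: loc`. The transactional variant leaves the schema unchanged, so the retry succeeds.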

Contributor Author


Fixed — each migration is now wrapped in an explicit conn.unchecked_transaction() / tx.commit() block, so DDL + schema_version update are atomic. A crash mid-migration rolls back cleanly and the next run retries the same version.

Comment on lines +374 to +393
        }
    }

    /// Run all schema migrations. Mirrors `initSchema()` from `src/db/migrations.ts`.
    #[napi]
    pub fn init_schema(&self) -> napi::Result<()> {
        let conn = self.conn()?;

        conn.execute_batch(
            "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL DEFAULT 0)",
        )
        .map_err(|e| napi::Error::from_reason(format!("create schema_version failed: {e}")))?;

        let mut current_version: u32 = conn
            .query_row("SELECT version FROM schema_version", [], |row| row.get(0))
            .unwrap_or(0);

        // Insert version 0 if table was just created (empty)
        let count: u32 = conn
            .query_row("SELECT COUNT(*) FROM schema_version", [], |row| row.get(0))
Contributor


P2 pragma() accepts write PRAGMAs without restriction

The sql argument is directly interpolated into PRAGMA {sql} and passed to conn.prepare(). While prepare() limits multi-statement injection (SQLite compiles only the first statement in the string), it still allows callers to pass write-mode PRAGMAs such as:

nativeDb.pragma('journal_mode = DELETE')   // switches off WAL for both connections
nativeDb.pragma('wal_autocheckpoint = 0')  // disables WAL checkpointing
nativeDb.pragma('user_version = 99')       // corrupts schema version signal

Since this mirrors the better-sqlite3 pragma API and the callers are all trusted internal code today, this is not an immediate security issue. But as more callers are added in Phase 6.14+, a write-PRAGMA that silently downgrades journal_mode while ctx.db (better-sqlite3) is open on the same file could cause subtle corruption or lock errors.

Consider adding a guard that rejects setter syntax:

if sql.contains('=') || sql.to_ascii_lowercase().contains("checkpoint") {
    return Err(napi::Error::from_reason(
        "pragma(): use exec() for write PRAGMAs",
    ));
}

Or at minimum document the write-semantics expectation in the method's doc-comment.
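
The guard suggested above only catches the `name = value` spelling, while SQLite also accepts `name(value)` as a setter. A slightly broader check could look like the std-only sketch below. It is an illustration under assumptions, not code from the PR: the function name `is_read_only_pragma` and the allow-list of argument-taking read PRAGMAs are hypothetical choices.

```rust
/// Hypothetical helper (not part of the PR): returns true when a PRAGMA
/// argument looks like a read-only query rather than a setter.
///
/// SQLite accepts two setter spellings, `name = value` and `name(value)`,
/// so both are treated as writes. A few PRAGMAs that take an argument but
/// only read (for example `table_info(nodes)`) are allow-listed.
pub fn is_read_only_pragma(sql: &str) -> bool {
    // Assumed allow-list; extend it to match the PRAGMAs callers really use.
    const READ_WITH_ARG: [&str; 4] =
        ["table_info", "table_xinfo", "index_list", "index_info"];

    let lower = sql.trim().to_ascii_lowercase();

    // Reject empty input and anything that could carry a second statement.
    if lower.is_empty() || lower.contains(';') {
        return false;
    }
    // `name = value` is always a setter.
    if lower.contains('=') {
        return false;
    }
    // Mirrors the reviewer's suggestion: `wal_checkpoint` has side effects
    // even without an argument, so every checkpoint PRAGMA is rejected.
    if lower.contains("checkpoint") {
        return false;
    }
    // `name(arg)` is a setter unless the name is a known read-only form.
    if let Some(open) = lower.find('(') {
        let name = lower[..open].trim();
        // Strip an optional schema prefix such as `main.`.
        let name = name.rsplit('.').next().unwrap_or(name);
        return READ_WITH_ARG.iter().any(|allowed| *allowed == name);
    }
    true
}
```

A pragma() implementation could call this first and return an error that points callers at exec() when it yields false. Whether the extra string scan is worth it for trusted internal callers is the trade-off discussed in this thread.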

Contributor Author


Fixed — added doc-comment clarifying that pragma() is intended for read-only PRAGMAs and that write-mode PRAGMAs should use exec() instead. Since all callers are trusted internal code today, a runtime guard would add overhead without benefit, but the documentation will prevent misuse as more callers are added in Phase 6.14+.

Comment on lines +400 to +418
        }

        for migration in MIGRATIONS {
            if migration.version > current_version {
                conn.execute_batch(migration.up).map_err(|e| {
                    napi::Error::from_reason(format!(
                        "migration v{} failed: {e}",
                        migration.version
                    ))
                })?;
                conn.execute(
                    "UPDATE schema_version SET version = ?1",
                    params![migration.version],
                )
                .map_err(|e| {
                    napi::Error::from_reason(format!("update schema_version failed: {e}"))
                })?;
                current_version = migration.version;
            }
Contributor


P2 schema_version table has no UNIQUE constraint and allows multiple rows

The schema_version table is created as:

CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL DEFAULT 0)

There is no PRIMARY KEY or UNIQUE constraint, so multiple rows can accumulate. The migration loop uses:

conn.query_row("SELECT version FROM schema_version", [], |row| row.get(0))
    .unwrap_or(0);

which returns an arbitrary row if more than one exists (SQLite does not guarantee row order without ORDER BY). In a healthy DB there will only ever be one row, but a partial run that inserted before crashing could create duplicates.

The TypeScript version has the same issue, but a minimal guard would help:

conn.query_row(
    "SELECT version FROM schema_version ORDER BY rowid DESC LIMIT 1",
    [],
    |row| row.get(0),
)
.unwrap_or(0);

Contributor Author


Fixed — the schema_version query now uses ORDER BY rowid DESC LIMIT 1 to deterministically return the latest version even if duplicate rows exist.

…chema (#666)

- Fix E0507 in close(): use (*self.conn).take() to deref through SendWrapper
  and call Option::take instead of SendWrapper::take which consumes self
- Wrap each migration in an explicit transaction so a mid-migration crash
  rolls back cleanly instead of leaving the DB in an unrecoverable state
- Add ORDER BY rowid DESC LIMIT 1 to schema_version query to handle
  potential duplicate rows deterministically
- Document pragma() as read-only with guidance to use exec() for write PRAGMAs

All values in the metadata object are already strings (explicitly coerced
via String() in the object literal), so the outer .map() String() wrapper
was a no-op.
@carlos-alm
Contributor Author

Addressed all Greptile review feedback:

  • P1 — non-transactional migrations: Each migration is now wrapped in conn.unchecked_transaction() / tx.commit() so DDL + version update are atomic. A crash mid-migration rolls back cleanly.
  • P2 — schema_version uniqueness: Query now uses ORDER BY rowid DESC LIMIT 1 for deterministic reads.
  • P2 — pragma() write semantics: Added doc-comment documenting read-only intent and pointing to exec() for write PRAGMAs.
  • P2 — redundant String() in finalize.ts: Removed the outer .map(... String(value)) since all values are already coerced to strings in the object literal.
  • Compilation fix (E0507): close() now uses (*self.conn).take() to call Option::take through SendWrapper::DerefMut instead of SendWrapper::take which consumes self.
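
The E0507 fix in the last bullet can be reproduced with the standard library alone. In the sketch below, `Wrapper` is a hypothetical stand-in for a type like `SendWrapper`: it dereferences to its contents and also has a consuming `take(self)` method. `FakeConnection` and `Db` are likewise illustrative names, not the PR's types.

```rust
use std::ops::{Deref, DerefMut};

/// Stand-in for a wrapper such as `SendWrapper<T>` (hypothetical, std-only):
/// it derefs to `T` and also offers a consuming `take(self)`.
pub struct Wrapper<T>(T);

impl<T> Wrapper<T> {
    pub fn new(value: T) -> Self {
        Wrapper(value)
    }
    /// Consumes the wrapper, so it cannot be called on a field of `&mut self`.
    pub fn take(self) -> T {
        self.0
    }
}

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Wrapper<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

/// Stand-in for a database connection.
pub struct FakeConnection;

pub struct Db {
    conn: Wrapper<Option<FakeConnection>>,
}

impl Db {
    pub fn open() -> Self {
        Db { conn: Wrapper::new(Some(FakeConnection)) }
    }

    /// Idempotent close: returns true only on the call that actually closed.
    pub fn close(&mut self) -> bool {
        // `self.conn.take()` would resolve to `Wrapper::take(self)` and be
        // rejected with E0507, because it moves a field out from behind
        // `&mut self`. Dereferencing first selects `Option::take(&mut self)`,
        // which leaves `None` in place and drops the connection.
        (*self.conn).take().is_some()
    }

    pub fn is_open(&self) -> bool {
        self.conn.is_some()
    }
}
```

Because `Option::take` leaves `None` behind, a second `close()` finds nothing to drop and simply returns false, which is what makes the method idempotent.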

@carlos-alm
Contributor Author

@greptileai

@carlos-alm
Contributor Author

@claude

@claude

claude bot commented Mar 28, 2026

Claude finished @carlos-alm's task


PR Review: NativeDatabase Implementation After Greptile Fixes

Todo List

  • Read and analyze the core NativeDatabase implementation
  • Verify Greptile's P1 migration atomicity fix
  • Check P2 fixes (schema_version uniqueness, pragma semantics, redundant String())
  • Review TypeScript integration and error handling
  • Assess overall architecture and code quality
  • Provide final review recommendations

Review Summary

All Greptile Issues Successfully Addressed ✅

P1 Migration Atomicity Fix: ✅ VERIFIED
Lines 412-434 in native_db.rs now properly wrap each migration in conn.unchecked_transaction() with tx.commit(). This ensures DDL + schema_version update are atomic. If a process crashes mid-migration, the entire transaction rolls back cleanly.

P2 Schema Version Uniqueness: ✅ VERIFIED
Line 393 now uses ORDER BY rowid DESC LIMIT 1 for deterministic version reads, handling potential duplicate rows gracefully.

P2 Pragma Write Semantics: ✅ VERIFIED
Lines 356-358 added clear doc-comments explaining that pragma() is for read-only PRAGMAs and write-mode operations should use exec() instead.

P2 Redundant String(): ✅ VERIFIED
Line 88 in finalize.ts correctly removes the redundant .map(... String(value)) since all values are already coerced to strings in the object literal.

Architecture & Code Quality Assessment

Security & Safety: ✅ EXCELLENT

  • All SQL operations use parameterized queries (params![]) preventing injection
  • Comprehensive error handling with descriptive messages mapped from rusqlite to napi
  • SendWrapper<Option<Connection>> ensures thread safety
  • Idempotent close() method makes repeated close calls safe

Resource Management: ✅ ROBUST

  • Proper connection lifecycle with graceful fallback on errors
  • Both connections (rusqlite + better-sqlite3) can safely operate on the same DB in WAL mode
  • Clean error cleanup paths in pipeline.ts:194-198

TypeScript Integration: ✅ EXCELLENT

  • types.ts:1882-1885: Factory methods properly typed as static class methods
  • types.ts:1897-1906: Interface matches Rust implementation exactly
  • pipeline.ts:106: Proper error handling with graceful fallback to better-sqlite3
  • pipeline.ts:51-52: Consistent metadata routing pattern with null-safe operations
  • finalize.ts:78-90: Clean conditional logic for dual-engine support

Schema Migration Integrity: ✅ VERIFIED

  • Lines 21-267: Migration DDL exactly mirrors src/db/migrations.ts
  • The comment on lines 7-8 explicitly warns about keeping them in sync
  • Legacy column compatibility (lines 440-486) matches TypeScript version

Performance Design: ✅ STRATEGIC

  • Foundation approach allows incremental migration without big-bang risk
  • Maintains full backward compatibility during transition
  • WAL mode + NO_MUTEX flags optimized for single-threaded Node.js usage

Minor Observations

  • Migration DDL duplication between Rust/TypeScript is acceptable given explicit warning comment
  • No performance benchmarks provided, but this is foundational work where user-visible benefits come in Phase 6.14+

Final Recommendation

Rating: ⭐⭐⭐⭐⭐ (5/5) - APPROVE

This is exemplary foundation work. All critical issues identified by Greptile have been properly resolved. The implementation demonstrates excellent security practices, robust error handling, and thoughtful architectural design that enables future native migration phases. The code quality is high, TypeScript integration is seamless, and the dual-engine fallback strategy ensures zero risk to existing functionality.

Ready to merge - this provides a solid foundation for Phase 6.14+ native database migration while maintaining complete backward compatibility.


@carlos-alm carlos-alm merged commit 8974514 into main Mar 28, 2026
18 checks passed
@carlos-alm carlos-alm deleted the feat/native-database-class branch March 28, 2026 22:33
@github-actions github-actions bot locked and limited conversation to collaborators Mar 28, 2026