perf(db): NativeDatabase napi-rs class for rusqlite connection lifecycle (6.13) by carlos-alm · Pull Request #666 · optave/ops-codegraph-tool · GitHub
Skip to content

perf(db): NativeDatabase napi-rs class for rusqlite connection lifecycle (6.13)#666

Merged
carlos-alm merged 4 commits into
mainfrom
feat/native-database-class
Mar 28, 2026
Merged

perf(db): NativeDatabase napi-rs class for rusqlite connection lifecycle (6.13)#666
carlos-alm merged 4 commits into
mainfrom
feat/native-database-class

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

Summary

  • Adds NativeDatabase napi-rs class in crates/codegraph-core/src/native_db.rs holding a persistent rusqlite::Connection with factory methods (openReadWrite/openReadonly), lifecycle (close/exec/pragma), schema migrations (initSchema with all 16 migrations embedded), and build metadata KV (getBuildMeta/setBuildMeta)
  • Wires into the build pipeline: when native engine is available, NativeDatabase handles schema init and metadata reads/writes — better-sqlite3 remains open for queries and stages not yet migrated
  • Foundation for Phase 6.14+ which will migrate all query and write operations to rusqlite on the native path, eliminating dual-SQLite-in-one-process

Test plan

  • All 2129 tests pass (WASM engine path unchanged)
  • Biome lint clean on all changed files
  • Rust code passes rustfmt check
  • CI: cargo build compiles native_db.rs (local Windows linker env issue prevents local verification)
  • codegraph build --engine native uses NativeDatabase for schema+metadata
  • codegraph build --engine wasm unchanged behavior (falls back to better-sqlite3)
  • Incremental build: codegraph build twice — second is incremental, no schema errors

…fecycle (6.13)

Foundation for moving all DB operations to rusqlite on the native engine
path. Creates a persistent rusqlite::Connection holder exposed to JS,
handling schema migrations and build metadata KV — eliminating redundant
per-call connection open/close in the native build pipeline.
@claude

claude Bot commented Mar 28, 2026

Copy link
Copy Markdown

@greptile-apps

greptile-apps Bot commented Mar 28, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces NativeDatabase, a napi-rs–exposed rusqlite::Connection wrapper in crates/codegraph-core/src/native_db.rs, and wires it into the build pipeline as a Phase 6.13 foundation for replacing better-sqlite3 on the native engine path. When the native addon is available, NativeDatabase takes over schema initialization (initSchema with all 16 migrations embedded as Rust const DDL) and build-metadata reads/writes (getBuildMeta/setBuildMeta); better-sqlite3 remains open for queries and stages not yet migrated.

Key implementation points:

  • All three concerns raised in previous review rounds (untransactional multi-statement migrations, unrestricted write-PRAGMA surface, non-deterministic schema_version reads) are addressed: migrations now use conn.unchecked_transaction() per version, pragma() carries a doc-comment contract, and the version query uses ORDER BY rowid DESC LIMIT 1.
  • The TypeScript integration (pipeline.ts, finalize.ts) is cleanly dual-pathed: every metadata call falls back to getBuildMeta/setBuildMeta from better-sqlite3 when ctx.nativeDb is absent, and failure during native init is caught with a warn + fallback rather than a hard crash.
  • One inaccuracy remains: the open_read_write doc comment promises to create parent directories, but the implementation does not call std::fs::create_dir_all; in the current pipeline this is harmless because openDb always runs first, but the contract is false and could cause confusion for standalone callers in Phase 6.14+.

Confidence Score: 4/5

Safe to merge for Phase 6.13 scope; the single remaining finding is a misleading doc comment, not a runtime defect

All prior P1 concerns (migration atomicity, pragma safety, schema_version determinism) are resolved. The one remaining finding is a doc comment on open_read_write that claims parent-directory creation but doesn't implement it — a P2 documentation inaccuracy. In production the directory always pre-exists (openDb runs first), so no runtime failure is expected. Score is 4 rather than 5 only to flag the comment fix before this API is used standalone in 6.14+.

crates/codegraph-core/src/native_db.rs — doc comment on open_read_write should either add std::fs::create_dir_all or remove the directory-creation promise

Important Files Changed

Filename Overview
crates/codegraph-core/src/native_db.rs New 581-line file implementing NativeDatabase as a napi-rs class; prior review concerns (untransactional migrations, write-PRAGMA exposure, duplicate schema_version rows) are resolved; one remaining inaccuracy: doc comment on open_read_write promises directory creation that the implementation does not perform
src/domain/graph/builder/pipeline.ts Wires NativeDatabase into setupPipeline for schema init and checkEngineSchemaMismatch for metadata reads; falls back to better-sqlite3 on native load failure; error-path cleanup closes nativeDb before ctx.db
src/domain/graph/builder/stages/finalize.ts Routes getBuildMeta/setBuildMeta through nativeDb when available; closes nativeDb before better-sqlite3 at end of finalize; string conversion for node_count/edge_count is correctly applied only in the nativeDb path
src/types.ts Adds NativeDatabase instance interface and NativeAddon factory shape; interface correctly separates static factory methods from instance methods
src/db/migrations.ts Adds mirroring comment pointing to native_db.rs; no logic changes
src/domain/graph/builder/context.ts Adds optional nativeDb field to PipelineContext; NativeDatabase import added from types
crates/codegraph-core/src/lib.rs Registers native_db module; trivial one-line change

Sequence Diagram

sequenceDiagram
    participant P as pipeline.ts (setupPipeline)
    participant BS as better-sqlite3 (ctx.db)
    participant NDB as NativeDatabase (ctx.nativeDb)
    participant F as finalize.ts

    P->>BS: openDb(dbPath) — always opened
    P->>NDB: NativeDatabase.openReadWrite(dbPath) [native only]
    alt native available
        NDB-->>P: connection opened
        P->>NDB: initSchema() — runs all 16 migrations
    else native unavailable or error
        P->>BS: initSchema(ctx.db) — JS fallback
    end

    Note over P,F: Pipeline stages run (queries still via better-sqlite3)

    F->>NDB: getBuildMeta('node_count') [native]
    F->>NDB: getBuildMeta('edge_count') [native]
    F->>NDB: setBuildMeta([engine, schema_version, built_at, ...]) [native]

    F->>NDB: close() — NativeDatabase closed first
    F->>BS: closeDb(ctx.db) — better-sqlite3 closed second
Loading

Reviews (2): Last reviewed commit: "fix(build): remove redundant String() co..." | Re-trigger Greptile

Comment on lines +420 to +437

// Legacy column compat — add columns that may be missing from pre-migration DBs.
// Mirrors the post-migration block in src/db/migrations.ts initSchema().
if has_table(conn, "nodes") {
if !has_column(conn, "nodes", "end_line") {
let _ = conn.execute_batch("ALTER TABLE nodes ADD COLUMN end_line INTEGER");
}
if !has_column(conn, "nodes", "role") {
let _ = conn.execute_batch("ALTER TABLE nodes ADD COLUMN role TEXT");
}
let _ = conn.execute_batch("CREATE INDEX IF NOT EXISTS idx_nodes_role ON nodes(role)");
if !has_column(conn, "nodes", "parent_id") {
let _ = conn.execute_batch(
"ALTER TABLE nodes ADD COLUMN parent_id INTEGER REFERENCES nodes(id)",
);
}
let _ = conn
.execute_batch("CREATE INDEX IF NOT EXISTS idx_nodes_parent ON nodes(parent_id)");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Non-transactional multi-statement migrations can permanently corrupt the DB

Migrations that contain multiple ALTER TABLE statements (e.g. v9 with 14 ALTER TABLE statements, v11, v14, and v15) are executed via conn.execute_batch(migration.up) without an enclosing transaction. When SQLite is not inside an explicit transaction, every statement is individually auto-committed.

If the process crashes or rusqlite returns an error partway through migration v9 (e.g. after 7 of 14 ALTER TABLE statements), those 7 columns are already committed to disk but schema_version is still 8. The next run will attempt to re-apply all 14 ALTER TABLE statements and will fail immediately with "duplicate column name: loc" (or whichever column was already added). This leaves the database permanently unrunnable — the user must delete the entire .codegraph/graph.db file.

The TypeScript counterpart (db.exec(migration.up)) has the same flaw, but this is a brand-new Rust implementation and the fix is straightforward with rusqlite:

for migration in MIGRATIONS {
    if migration.version > current_version {
        let tx = conn
            .unchecked_transaction()
            .map_err(|e| napi::Error::from_reason(format!("begin migration tx failed: {e}")))?;
        tx.execute_batch(migration.up).map_err(|e| {
            napi::Error::from_reason(format!("migration v{} failed: {e}", migration.version))
        })?;
        tx.execute(
            "UPDATE schema_version SET version = ?1",
            params![migration.version],
        )
        .map_err(|e| {
            napi::Error::from_reason(format!("update schema_version failed: {e}"))
        })?;
        tx.commit().map_err(|e| {
            napi::Error::from_reason(format!("commit migration v{} failed: {e}", migration.version))
        })?;
        current_version = migration.version;
    }
}

Note: SQLite DDL is fully transactional (unlike PostgreSQL), so wrapping ALTER TABLE + UPDATE schema_version in a single transaction is both correct and idempotent on retry.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — each migration is now wrapped in an explicit conn.unchecked_transaction() / tx.commit() block, so DDL + schema_version update are atomic. A crash mid-migration rolls back cleanly and the next run retries the same version.

Comment on lines +374 to +393
}
}

/// Run all schema migrations. Mirrors `initSchema()` from `src/db/migrations.ts`.
#[napi]
pub fn init_schema(&self) -> napi::Result<()> {
let conn = self.conn()?;

conn.execute_batch(
"CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL DEFAULT 0)",
)
.map_err(|e| napi::Error::from_reason(format!("create schema_version failed: {e}")))?;

let mut current_version: u32 = conn
.query_row("SELECT version FROM schema_version", [], |row| row.get(0))
.unwrap_or(0);

// Insert version 0 if table was just created (empty)
let count: u32 = conn
.query_row("SELECT COUNT(*) FROM schema_version", [], |row| row.get(0))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 pragma() accepts write PRAGMAs without restriction

The sql argument is directly interpolated into PRAGMA {sql} and passed to conn.prepare(). While prepare() prevents multi-statement injection (SQLite rejects statements with a semicolon), it still allows callers to pass write-mode PRAGMAs such as:

nativeDb.pragma('journal_mode = DELETE')   // switches off WAL for both connections
nativeDb.pragma('wal_autocheckpoint = 0')  // disables WAL checkpointing
nativeDb.pragma('user_version = 99')       // corrupts schema version signal

Since this mirrors the better-sqlite3 pragma API and the callers are all trusted internal code today, this is not an immediate security issue. But as more callers are added in Phase 6.14+, a write-PRAGMA that silently downgrades journal_mode while ctx.db (better-sqlite3) is open on the same file could cause subtle corruption or lock errors.

Consider adding a guard that rejects setter syntax:

if sql.contains('=') || sql.to_ascii_lowercase().contains("checkpoint") {
    return Err(napi::Error::from_reason(
        "pragma(): use exec() for write PRAGMAs",
    ));
}

Or at minimum document the write-semantics expectation in the method's doc-comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added doc-comment clarifying that pragma() is intended for read-only PRAGMAs and that write-mode PRAGMAs should use exec() instead. Since all callers are trusted internal code today, a runtime guard would add overhead without benefit, but the documentation will prevent misuse as more callers are added in Phase 6.14+.

Comment on lines +400 to +418
}

for migration in MIGRATIONS {
if migration.version > current_version {
conn.execute_batch(migration.up).map_err(|e| {
napi::Error::from_reason(format!(
"migration v{} failed: {e}",
migration.version
))
})?;
conn.execute(
"UPDATE schema_version SET version = ?1",
params![migration.version],
)
.map_err(|e| {
napi::Error::from_reason(format!("update schema_version failed: {e}"))
})?;
current_version = migration.version;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 schema_version table has no UNIQUE constraint and allows multiple rows

The schema_version table is created as:

CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL DEFAULT 0)

There is no PRIMARY KEY or UNIQUE constraint, so multiple rows can accumulate. The migration loop uses:

conn.query_row("SELECT version FROM schema_version", [], |row| row.get(0))
    .unwrap_or(0);

which returns an arbitrary row if more than one exists (SQLite does not guarantee row order without ORDER BY). In a healthy DB there will only ever be one row, but a partial run that inserted before crashing could create duplicates.

The TypeScript version has the same issue, but a minimal guard would help:

conn.query_row(
    "SELECT version FROM schema_version ORDER BY rowid DESC LIMIT 1",
    [],
    |row| row.get(0),
)
.unwrap_or(0);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — the schema_version query now uses ORDER BY rowid DESC LIMIT 1 to deterministically return the latest version even if duplicate rows exist.

…chema (#666)

- Fix E0507 in close(): use (*self.conn).take() to deref through SendWrapper
  and call Option::take instead of SendWrapper::take which consumes self
- Wrap each migration in an explicit transaction so a mid-migration crash
  rolls back cleanly instead of leaving the DB in an unrecoverable state
- Add ORDER BY rowid DESC LIMIT 1 to schema_version query to handle
  potential duplicate rows deterministically
- Document pragma() as read-only with guidance to use exec() for write PRAGMAs
All values in the metadata object are already strings (explicitly coerced
via String() in the object literal), so the outer .map() String() wrapper
was a no-op.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed all Greptile review feedback:

  • P1 — non-transactional migrations: Each migration is now wrapped in conn.unchecked_transaction() / tx.commit() so DDL + version update are atomic. A crash mid-migration rolls back cleanly.
  • P2 — schema_version uniqueness: Query now uses ORDER BY rowid DESC LIMIT 1 for deterministic reads.
  • P2 — pragma() write semantics: Added doc-comment documenting read-only intent and pointing to exec() for write PRAGMAs.
  • P2 — redundant String() in finalize.ts: Removed the outer .map(... String(value)) since all values are already coerced to strings in the object literal.
  • Compilation fix (E0507): close() now uses (*self.conn).take() to call Option::take through SendWrapper::DerefMut instead of SendWrapper::take which consumes self.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@claude

@claude

claude Bot commented Mar 28, 2026

Copy link
Copy Markdown

@carlos-alm carlos-alm merged commit 8974514 into main Mar 28, 2026
18 checks passed
@carlos-alm carlos-alm deleted the feat/native-database-class branch March 28, 2026 22:33
@github-actions github-actions Bot locked and limited conversation to collaborators Mar 28, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant