Read-only database package for extensions · Issue #116 · git-pkgs/git-pkgs · GitHub

Read-only database package for extensions #116

Description

@andrew

Issue #31 extracted the analysis libraries (enrichment, vulns, manifests) as public Go modules but explicitly kept database and indexer internal, since they were considered CLI-specific. That decision predates extensions. Now that git-pkgs-foo binaries exist, extensions written in Go have no good way to read the dependency database.

Current options for an extension that needs the dependency list:

  1. Shell out to git pkgs list --format json -- works, but adds subprocess overhead, requires parsing the JSON output, and limits the extension to what the CLI already exposes (it cannot, for example, query change history with custom filters or join dependency data with the enrichment cache).
  2. Open SQLite directly -- possible since the path is predictable (.git/pkgs.sqlite3), but couples to an undocumented schema that can change between versions. Schema v8 today, v9 tomorrow, and the extension breaks silently.
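
For comparison, option 1 looks roughly like the sketch below. The command and flag are the ones named above; the shape of the JSON is not documented in this issue, so the sketch assumes a top-level array and decodes into generic maps. parseList and listViaCLI are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// parseList decodes the CLI's JSON output. A top-level array of objects is
// an assumption; the real output shape is not specified in this issue.
func parseList(out []byte) ([]map[string]any, error) {
	var deps []map[string]any
	if err := json.Unmarshal(out, &deps); err != nil {
		return nil, fmt.Errorf("parse git pkgs output: %w", err)
	}
	return deps, nil
}

// listViaCLI spawns one subprocess per call, which is the overhead
// described in option 1.
func listViaCLI(repoDir string) ([]map[string]any, error) {
	cmd := exec.Command("git", "pkgs", "list", "--format", "json")
	cmd.Dir = repoDir
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("run git pkgs list: %w", err)
	}
	return parseList(out)
}

func main() {
	deps, err := listViaCLI(".")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(deps), "dependencies")
}
```

Every call pays for process startup plus a full decode, and the extension only ever sees the fields the CLI chooses to print.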

A public read-only query package would sit between these, living in the same module as git-pkgs itself rather than as a separate Go module. This way the read-only interface stays in sync with the schema automatically -- when the schema changes, the query package updates in the same commit. No coordinated releases across modules.

The split could look like:

  • github.com/git-pkgs/git-pkgs/database -- public, read-only query interface for extensions to import
  • internal/database -- migrations, writes, indexer support (stays internal)

The public package exposes a stable interface for the queries extensions actually need:

  • List current dependencies (what list returns)
  • Get dependencies at a commit (what show uses)
  • Get change history for a package (what history uses)
  • Read cached enrichment data (packages, versions, vulnerability tables)
  • Schema version check so extensions can fail gracefully on mismatch

The package would open the database in read-only mode (?mode=ro), own no migrations, and make no writes. The core CLI keeps full control of the schema and write path. Extensions get typed access without reimplementing SQL queries or parsing CLI output.
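
A minimal sketch of the read-only open, assuming the database lives at .git/pkgs.sqlite3 under the repository root as described above. ReadOnlyDSN is a hypothetical helper; it only builds the SQLite URI filename, and which Go SQLite driver consumes it is left open. Paths containing characters such as ? or # would need escaping, which this sketch skips.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// ReadOnlyDSN builds a SQLite URI filename that asks SQLite to open the
// database read-only via the mode=ro query parameter.
func ReadOnlyDSN(repoRoot string) string {
	dbPath := filepath.ToSlash(filepath.Join(repoRoot, ".git", "pkgs.sqlite3"))
	return "file:" + dbPath + "?mode=ro"
}

func main() {
	dsn := ReadOnlyDSN("/home/me/project")
	fmt.Println(dsn) // prints file:/home/me/project/.git/pkgs.sqlite3?mode=ro

	// With a SQLite driver registered (none is imported here), the DSN would
	// be handed to database/sql, e.g. sql.Open(driverName, dsn). The driver
	// name depends on the driver chosen and is an assumption of this sketch.
}
```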

This benefits #114 (MCP server) and #115 (LSP server) directly. Both could import the query package instead of shelling out to git pkgs for every request, which matters when the LSP is handling hover events or the MCP server is fielding rapid-fire agent queries.

The enrichment cache is relevant too. The database already stores registry metadata and vulnerability data with 24-hour TTLs. An extension that reads this cache avoids duplicate API calls to ecosyste.ms or OSV. Without the query package, extensions either re-fetch everything themselves or parse --format json output that doesn't include cached enrichment data.

Extensions already have a write path via git pkgs notes. An extension can store its results with a namespace (--namespace scorecard), origin tag (--origin git-pkgs-scorecard), and arbitrary key-value metadata (--set score=7.2 --set maintained=9). Other commands and extensions can read those notes back with git pkgs notes list --namespace scorecard --format json. This means the read-only query package only needs to handle reads -- extensions write through the CLI's notes interface, keeping write coordination simple.

Metadata

Labels: enhancement (New feature or request)
