Allow adding/editing files not referenced by config#7275
Allow adding/editing files not referenced by config#7275
Conversation
d6c5834 to
4a901d3
Compare
132b322 to
8a20cf5
Compare
4a901d3 to
e68483f
Compare
217dc60 to
65af36b
Compare
2bb0d77 to
234904b
Compare
5ab0cc8 to
de96668
Compare
de96668 to
4d3a06b
Compare
234904b to
d383465
Compare
63ec7ab to
cf0aad6
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf0aad6fb8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| qb.push_values(free_files, |mut b, (path, body)| { | ||
| let hash = blake3::hash(body.as_bytes()).as_bytes().to_vec(); | ||
| b.push_bind(Uuid::now_v7()) | ||
| .push_bind(path) | ||
| .push_bind(body) |
There was a problem hiding this comment.
Deduplicate free-file inserts before writing stored_files
write_free_files_in_tx unconditionally inserts every free file on every apply, but this path never reuses an existing (file_path, content_hash) row like write_collected_files does. Re-saving an unchanged editor state repeatedly will accumulate duplicate active rows for the same file path, and those rows are only tombstoned when the path is removed entirely, so table size and GET /internal/config_toml scan cost grow with save count.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The CTE in write_free_files_in_tx already dedups on the editor's "content unchanged" path: its WHERE NOT EXISTS (SELECT 1 ... WHERE file_path = input.file_path AND content_hash = input.content_hash AND deleted_at IS NULL) clause skips the insert when an active row with the same (file_path, content_hash) already exists, so re-saving an unchanged editor state is a no-op on that path.
Content-change dedup is separately covered — the paired UPDATE ... FROM new_rows ... WHERE t.id != n.id tombstones any previously-active row for the same file_path, and write_collected_files (reached via extra_templates for free files too) now does the same tombstone step for referenced files (d177b54). Postgres regression test added in 0641cb8.
If there's a scenario you think is still doubled up, happy to look — but I don't see one in the current code.
There was a problem hiding this comment.
Add a deterministic tie-breaker to editor file selection
This DISTINCT ON (file_path) query orders only by created_at DESC, so when multiple active rows for the same path share a timestamp, Postgres may return either row nondeterministically. In this change, one apply can create multiple same-path rows in a single transaction, and NOW() is transaction-stable, so ties are realistic; if tied rows differ, editor path_contents (and therefore CAS signatures) can flip unpredictably across reads.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
we should reject multiple rows with the same path
There was a problem hiding this comment.
Done — load_editor_path_contents now returns an error (in b736742) instead of silently picking a row when stored_files has more than one active row for a file_path. Also dropped the DISTINCT ON (and the no-longer-needed created_at DESC column from idx_stored_files_editor_latest) — the write path is the single source of truth for maintaining the at-most-one-active-row invariant.
There was a problem hiding this comment.
Addressed — the DISTINCT ON is gone entirely. load_editor_path_contents now does a plain SELECT file_path, source_body WHERE deleted_at IS NULL and errors if any file_path appears more than once (b736742). The write path maintains the "at most one active row per file_path" invariant: free files via the CTE's paired INSERT … RETURNING + UPDATE tombstone, referenced files via write_collected_files's new tombstone step (d177b54). Pre-migration duplicates are backfilled (tombstoned) in 1d20517 so the invariant holds from day one. Postgres regression test in 0641cb8.
There was a problem hiding this comment.
✅ Bugbot reviewed your changes and found no new issues!
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit cf0aad6. Configure here.
f7be467 to
c3591e3
Compare
c3591e3 to
a1dcf7c
Compare
Per review: the write path maintains the invariant that at most one active row exists per file_path, so we should surface a duplicate as a data integrity error rather than silently picking a row via `DISTINCT ON` + tie-breaker. Drops the `created_at DESC` column from the partial index since the read no longer sorts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `load_editor_path_contents` integrity check added in b736742 assumes the write path maintains "at most one active row per `file_path`". The free-file CTE already maintains this, but `write_collected_files` (used for referenced files) inserted new rows on content change without tombstoning old ones, which is what caused CI failures once the read path started rejecting duplicates. Make `write_collected_files` maintain the invariant: - Filter existing-row lookup on `deleted_at IS NULL` so we never reuse a tombstoned UUID (which would leave the path with zero active rows). - After inserting new rows, tombstone any OTHER active rows for the processed `file_path`s. Config consumers read files by UUID without a `deleted_at` filter, so older function/tool/eval versions still resolve correctly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups from the review: 1. `write_free_files_in_tx` tombstones any active row whose `file_path` is not in `all_new_paths`. Previously `all_new_paths` only contained what the client sent in `request.path_contents`, so an apply that omitted a canonical file from `path_contents` would tombstone its still-referenced active row. Union the canonical paths into `all_new_paths` so canonical files are always protected regardless of what the client includes. 2. Add a Postgres sqlx test asserting that editing a tool's referenced file body through `write_stored_config` twice leaves exactly one active row for the path (old version tombstoned, new version carries the updated content). This locks in the `write_collected_files` tombstone fix from d177b54. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this migration, `write_collected_files` could leave multiple rows with the same `file_path` (when a referenced file's content was edited, a new row was inserted without tombstoning the old one). Once the `deleted_at` column is added all of those rows default to `deleted_at IS NULL`, which would break the new editor read path: it enforces "at most one active row per `file_path`" and now returns a data-integrity error rather than silently picking a row. Tombstone every non-latest row per `file_path` in the same migration, using `(created_at DESC, id DESC)` so the row we keep active matches what the previous `DISTINCT ON` read-time tie-breaker would have returned. The invariant holds from migration time forward and the editor GET doesn't 500 on any deployment that already had duplicates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 1d20517. Configure here.
Cursor review flagged the error message for using single-quoted `{:?}`
output around the file path and leaving `stored_files` / `file_path`
unquoted. Project convention prefers backticks for technical terms in
error messages and user-visible strings. Switch to backtick-wrapped
identifiers and `{}` formatting.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


We add "deleted_at" to stored files so we can show all latest version of the same file that are not deleted, regardless of whether it's referenced in the config toml. This allows persisting files that are created but not yet referenced in the config toml (users may want to write a file, save it, then change the config to reference it).
Note
Medium Risk
Touches Postgres schema and write semantics for
stored_files, including new tombstoning behavior and editor reads; mistakes could hide files or break editor/config apply flows, but changes are scoped and covered by new e2e tests.Overview
Config editor can now persist and reload “free” files not yet referenced by TOML. The editor GET/apply flow now merges all active
stored_filesintopath_contents, computes CAS signatures over this merged view, and on apply persists unreferenced files while tombstoning removed ones.stored_filesgains soft-delete + “single active row per path” enforcement. Addsdeleted_atwith a migration that backfills by tombstoning superseded rows and creates a partial index for active lookups; the write path now only reuses active rows and tombstones other active versions perfile_path. Newload_editor_path_contentsreads non-tombstoned rows and errors on duplicates, with expanded Rust + Playwright e2e coverage for free-file add/edit/remove and referenced-file rewrites.Reviewed by Cursor Bugbot for commit a9a479e. Bugbot is set up for automated code reviews on this repo. Configure here.