fix(opencode): honor --since/--until in SQLite loader and JSON dump scan #1188
justi wants to merge 1 commit
This PR was auto-closed. Only contributors approved with Maintainers review auto-closed issues and reopen worthwhile ones. Issues that do not meet the quality bar in CONTRIBUTING.md may not be reopened or receive a reply. If a maintainer replies See CONTRIBUTING.md.
No actionable comments were generated in the recent review.

Walkthrough

Adds date-range filtering to the OpenCode loader in

Estimated code review effort: 4 (Complex) | ~45 minutes

Pre-merge checks: 4 passed, 1 failed (1 warning)
Thanks for the careful writeup: the motivation is solid and the SQL path is well reasoned. One correctness concern and a couple of robustness/nit items before this is safe to merge.

Context

The authoritative date filter is
✅ SQL
@justi heads up: #1220 (now closed) tackled the same issue with a different JSON-side strategy that I think is worth pulling into this PR. Instead of skipping JSON files by file mtime, it extracts the real
Your SQLite
That keeps the speedup while removing the silent under-count risk. Closing #1220 so this stays the single canonical PR. Would you be up for folding that JSON approach in here? Happy to point you at the exact
Following up on your suggestion to fold #1220's JSON strategy into this PR. I've combined both halves as you outlined:
I also addressed the prepare-failure fallback: a schema without
Verified on a real 33 GB / 395k-row
Would you be willing to reopen this (or drop an
@justi plz! go ahead |
Push date bounds into the OpenCode loader instead of reading every DB row and JSON file on each invocation, which hangs on large installs.

- SQLite: indexed `WHERE time_created` push-down with -1d/+2d slack, falling back to an unfiltered scan when the schema lacks the column (the in-loop date check still applies, so no row is silently dropped).
- JSON files: extract `time.created` from the raw payload and apply the same `format_date_tz` check as the authoritative `filter_loaded_entries_by_date`, failing open to a full parse so no in-range row is ever dropped.

Combines the indexed SQL push-down from ccusage#1188 with the content-based JSON extraction from ccusage#1220, per maintainer guidance.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
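The SQLite half of the commit message can be illustrated with a small, std-only sketch. It is not the code from `loader.rs`: `slack_bounds` and `where_clause` are hypothetical helper names, and the sketch assumes `time_created` holds Unix epoch milliseconds. It only shows how the -1d/+2d slack and the open-ended `WHERE` variants might fit together.

```rust
/// Milliseconds in one day. Assumption: `time_created` is epoch milliseconds.
const DAY_MS: i64 = 86_400_000;

/// Hypothetical helper: widen the requested range by -1 day / +2 days so the
/// coarse SQL filter never excludes a row the exact date filter would keep.
fn slack_bounds(since_ms: Option<i64>, until_ms: Option<i64>) -> (Option<i64>, Option<i64>) {
    (
        since_ms.map(|s| s.saturating_sub(DAY_MS)),
        until_ms.map(|u| u.saturating_add(2 * DAY_MS)),
    )
}

/// Hypothetical helper: build the WHERE fragment plus its positional
/// parameters, covering the closed range and both open-ended variants.
fn where_clause(lower: Option<i64>, upper: Option<i64>) -> (String, Vec<i64>) {
    match (lower, upper) {
        (Some(lo), Some(hi)) => (
            "WHERE time_created >= ?1 AND time_created < ?2".to_string(),
            vec![lo, hi],
        ),
        (Some(lo), None) => ("WHERE time_created >= ?1".to_string(), vec![lo]),
        (None, Some(hi)) => ("WHERE time_created < ?1".to_string(), vec![hi]),
        // No bounds requested: no WHERE clause, i.e. the unfiltered scan.
        (None, None) => (String::new(), Vec::new()),
    }
}

fn main() {
    // Arbitrary illustrative bounds expressed in whole days since the epoch.
    let (lower, upper) = slack_bounds(Some(10 * DAY_MS), Some(16 * DAY_MS));
    let (sql, params) = where_clause(lower, upper);
    println!("SELECT * FROM message {sql} -- params: {params:?}");
}
```

Because the bounds are only ever widened, the per-row date check stays authoritative; the SQL clause merely avoids reading rows that check would have discarded.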
e36d426 to 24b7abe

Summary
`ccusage opencode` (and the aggregate `ccusage`) reads every row from the OpenCode SQLite `message` table on every invocation, regardless of `--since`/`--until`, and then re-reads every JSON file under `storage/message/*.json`. On long-lived OpenCode installs (especially those migrated from the pre-SQLite layout) the loader can hang for minutes or appear to lock up.

`SharedArgs.since`/`SharedArgs.until` were already plumbed into the adapter (`ccusage-cli/src/types.rs:34-35`), but the loader did not use them.

Reproduction (local data, before this PR)
On `main @ 9d90e1b`, opencode storage with:

- `opencode.db`: 33.6 GB, 392,340 rows in `message`
- `storage/message/`: 118,619 JSON files (legacy pre-SQLite dump, ~2.0 GB)

    $ ccusage opencode --since 2026-05-04 --until 2026-05-10
    # never returns; the loader scans 392k DB rows + parses 118k JSON files

Raw SQL with `WHERE time_created BETWEEN ? AND ?` against the same DB returns in <50 ms.

Fix
`rust/crates/ccusage/src/adapter/opencode/loader.rs`:

- SQLite query uses the existing index. The `message` table already has `CREATE INDEX message_session_time_created_id_idx ON message (session_id, time_created, id)`. When `shared.since`/`shared.until` are set, the prepared statement now adds `WHERE time_created >= ?1 AND time_created < ?2` (or open-ended variants).
- `storage/message/*.json` loop skips by mtime. Files outside the inferred range are skipped via `fs::metadata(...).modified()` before any read/parse.

Slack. `since`/`until` are inflated by -1/+2 days before being applied, so the existing string-based summary filter (`summary.rs:265-271`) remains authoritative; the SQL/mtime checks only short-circuit work that the summary would have discarded anyway. This absorbs timezone offsets and any FAT32-style 2-second mtime rounding.

Performance on local data
Variants compared (`--since 2026-05-04 --until 2026-05-10`):

- `main @ 9d90e1b`
- `WHERE` only
- `WHERE` + mtime-skip

Loaded tokens are identical across variants and match a hand-written `sqlite3` query against the same DB: 414,644 input / 4,646 output.

Cross-OS compatibility
Runtime: `std::fs::metadata().modified()` works on macOS APFS/HFS+, Linux ext4/btrfs/xfs, and Windows NTFS. FAT32's 2-second mtime granularity is covered by the slack.

Tests: new tests use the `filetime` crate (`FileTime::from_unix_time`), which maps to `utimensat` (Linux), `setattrlist` (macOS), and `SetFileTime` (Windows). The same code path runs on all three CI targets.

Tests added
4 new tests in `adapter::opencode::loader::tests`:

- `since_filter_drops_db_rows_older_than_lower_bound`
- `until_filter_drops_db_rows_at_or_after_upper_bound`
- `since_filter_skips_json_files_with_older_mtime`
- `no_since_until_keeps_all_json_files_regardless_of_mtime`

All 285 workspace tests pass; `cargo clippy --workspace --all-targets -- -D warnings` is clean; `cargo fmt --all -- --check` is clean.

Why this is a fresh narrow PR
The same user-visible gap was reported in #801, requested in #867, and previously fixed against the now-removed TypeScript `@ccusage/opencode` package in #960 (closed with: "If a gap remains, it needs a fresh narrow PR against main"). This PR is that narrow PR against the current Rust adapter. No CLI surface, docs, or configuration schema changes.

Summary by cubic
Make `ccusage opencode` honor `--since`/`--until` for both SQLite and JSON, avoiding full scans and cutting runtime on large installs from minutes to seconds.

- SQLite: `WHERE time_created` bounds (open-ended variants) with bound params; fall back to an unfiltered scan if the column is missing, then apply a per-row date check.
- JSON: `time.created` from raw content and skip out-of-range files before full parse; works with pretty-printed files and fails open to a full parse if extraction fails.

Written for commit 24b7abe. Summary will update on new commits.