fix(codex): deduplicate copied branch history by OWConnoi · Pull Request #989 · ccusage/ccusage · GitHub

fix(codex): deduplicate copied branch history#989

Merged
ryoppippi merged 1 commit into
ccusage:mainfrom
OWConnoi:codex/dedupe-codex-branch-history
May 17, 2026
@OWConnoi OWConnoi commented May 12, 2026

Contributor

Fixes duplicated Codex Desktop usage when a branch/forked conversation copies historical JSONL token events into another session file.

What changed:

  • Adds a path-independent token event fingerprint in @ccusage/codex so copied historical token_count events are counted once across session files.
  • Keeps each file’s cumulative total tracking intact before dedupe, so new events in the branched session still produce the correct delta.
  • Adds a regression test that loads parent + branch session files with copied history and verifies only the new branch delta is counted.
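The approach in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the types `RawTokenCount` and `UsageEvent`, the functions `fingerprint` and `loadEvents`, the field names, the `|` separator, and the sample model name are all hypothetical. The point it shows is the ordering: each file's cumulative totals advance before the dedupe check, so copied history still moves the baseline for the branch's new events.

```typescript
// Hypothetical shapes; the real parser's types may differ.
type RawTokenCount = {
  timestamp: string;
  model: string;
  totalInput: number; // cumulative input tokens reported by this session file
  totalOutput: number; // cumulative output tokens reported by this session file
};

type UsageEvent = {
  timestamp: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
};

// Path-independent key: no file path or session id is included,
// so the same event copied into another file maps to the same key.
function fingerprint(e: UsageEvent): string {
  return [e.timestamp, e.model, e.inputTokens, e.outputTokens].join("|");
}

function loadEvents(files: RawTokenCount[][]): UsageEvent[] {
  const seen = new Set<string>();
  const result: UsageEvent[] = [];
  for (const file of files) {
    // Cumulative tracking is per file and restarts for each one.
    let prevInput = 0;
    let prevOutput = 0;
    for (const raw of file) {
      const event: UsageEvent = {
        timestamp: raw.timestamp,
        model: raw.model,
        inputTokens: raw.totalInput - prevInput,
        outputTokens: raw.totalOutput - prevOutput,
      };
      // Advance the totals BEFORE the dedupe check, even for copied events.
      prevInput = raw.totalInput;
      prevOutput = raw.totalOutput;

      const key = fingerprint(event);
      if (seen.has(key)) continue; // copied history: already counted
      seen.add(key);
      result.push(event);
    }
  }
  return result;
}

// Parent session plus a branch that copies the parent's history and adds one event.
const parent: RawTokenCount[] = [
  { timestamp: "2026-05-01T10:00:00Z", model: "sample-model", totalInput: 100, totalOutput: 20 },
  { timestamp: "2026-05-01T10:05:00Z", model: "sample-model", totalInput: 250, totalOutput: 50 },
];
const branch: RawTokenCount[] = [
  ...parent,
  { timestamp: "2026-05-01T11:00:00Z", model: "sample-model", totalInput: 300, totalOutput: 65 },
];

const events = loadEvents([parent, branch]);
```

With this sample data, the two copied events in the branch are skipped and only the branch's new delta (50 input, 15 output) is added. If the totals were advanced after the dedupe check instead, the new branch event would be computed against a baseline of zero and report the full cumulative total.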

Verification:

  • npx -y pnpm@10.30.1 --filter @ccusage/codex test src/data-loader.ts
  • npx -y pnpm@10.30.1 --filter @ccusage/codex typecheck
  • npx -y pnpm@10.30.1 run lint --fix (run from apps/codex)
  • git diff --check

Closes #988.

Summary by CodeRabbit

  • Bug Fixes

    • Token usage events are now deterministically deduplicated so duplicate entries from copied or branched session histories are removed, yielding more accurate usage reports.
  • Tests

    • Test suite extended with scenarios simulating multi-session/branched histories to validate deduplication and ensure only unique events are returned.

coderabbitai Bot commented May 12, 2026

han-cheng6 commented May 13, 2026


Thanks a lot for the incredibly quick turnaround on this.

From reading the PR, the fix looks aligned with the issue I reported in #988: Codex Desktop branched conversations should not cause historical usage to be counted again.

From the reporter side, this looks like the right direction. Happy to see this merged once CI is green and the maintainers are comfortable with it.

@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from c5a2e1f to 54abc64 Compare May 17, 2026 01:52
@ryoppippi

Member

Maintainer check after rebasing this onto current main:

  • After #1004 (perf(ccusage): unify agent adapter foundations), moved the fix to the current Codex adapter path: apps/ccusage/src/adapter/codex/parser.ts.
  • Confirmed the copied-branch-history regression test is red on main and green after the change.
  • Checked local Codex Desktop data. Before dedupe: 77,426 parsed token events, 2,722 duplicate events by timestamp/model/token fingerprint, with +287,576,105 totalTokens overcount. After the fix: loaded events have 0 duplicate fingerprints. The exact event count moved slightly while testing because local Codex logs are active, but the duplicate class is gone.
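A check like the one described above (duplicates by timestamp/model/token fingerprint, plus the resulting totalTokens overcount) could be sketched as below. This is an assumed reconstruction for illustration only, not the script the maintainer ran: `ParsedEvent`, `measureDuplicates`, the key layout, and the sample values are hypothetical.

```typescript
// Hypothetical event shape for the audit; only the fields named in the comment above.
type ParsedEvent = {
  timestamp: string;
  model: string;
  totalTokens: number;
};

type DuplicateReport = {
  parsedEvents: number;
  duplicateEvents: number;
  overcountedTokens: number;
};

// Counts every repeat of a fingerprint after its first occurrence and
// sums the tokens those repeats would add to a usage report.
function measureDuplicates(events: ParsedEvent[]): DuplicateReport {
  const seen = new Set<string>();
  let duplicateEvents = 0;
  let overcountedTokens = 0;
  for (const e of events) {
    const key = `${e.timestamp}|${e.model}|${e.totalTokens}`;
    if (seen.has(key)) {
      duplicateEvents += 1;
      overcountedTokens += e.totalTokens;
    } else {
      seen.add(key);
    }
  }
  return { parsedEvents: events.length, duplicateEvents, overcountedTokens };
}

// Made-up sample: the second event appears twice, as if copied into a branch file.
const sample: ParsedEvent[] = [
  { timestamp: "2026-05-01T10:00:00Z", model: "sample-model", totalTokens: 80 },
  { timestamp: "2026-05-01T10:05:00Z", model: "sample-model", totalTokens: 120 },
  { timestamp: "2026-05-01T10:05:00Z", model: "sample-model", totalTokens: 120 },
  { timestamp: "2026-05-01T11:00:00Z", model: "sample-model", totalTokens: 40 },
];

const report = measureDuplicates(sample);
```

For this sample, the report shows 4 parsed events, 1 duplicate, and an overcount of 120 tokens. Running the same measurement on the deduplicated output should report 0 duplicates.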

Validation run locally:

  • pnpm vitest run apps/ccusage/src/adapter/codex/parser.ts -t "deduplicates copied branch history"
  • pnpm run format
  • pnpm typecheck
  • pnpm run test

Decision: merge once bot checks finish. This one is directly reproduced by local real data and matches #988.

pkg-pr-new Bot commented May 17, 2026


@ccusage/amp

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/amp@989

ccusage

npx https://pkg.pr.new/ryoppippi/ccusage@989

@ccusage/codex

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/codex@989

@ccusage/opencode

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/opencode@989

@ccusage/pi

npx https://pkg.pr.new/ryoppippi/ccusage/@ccusage/pi@989

commit: 6a8c825

@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from 54abc64 to a263355 Compare May 17, 2026 01:58
@ryoppippi

Member

Follow-up: replaced the initial JSON.stringify-based fingerprint with a fixed-separator string key.

Local Codex data timing for loadTokenUsageEvents(), 5 runs each:

  • main checkout: about 2.00s average
  • this PR after the key change: about 2.00s average

So the previous CI large-fixture slowdown was likely from JSON.stringify allocation/serialization overhead. The branch has been force-pushed and CI is running again.
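To make the key change concrete, here is a rough sketch of the two styles of fingerprint. It assumes a hypothetical `TokenEvent` shape, hypothetical function names `jsonKey` and `separatorKey`, and a NUL character as the separator; the actual fields and separator used in the PR are not shown in this thread. Both keys identify the same events; the second builds the string directly instead of allocating an array and running it through the JSON serializer for every event.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
type TokenEvent = {
  timestamp: string;
  model: string;
  inputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
};

// Initial style: serialize the identifying fields with JSON.stringify.
function jsonKey(e: TokenEvent): string {
  return JSON.stringify([
    e.timestamp,
    e.model,
    e.inputTokens,
    e.cachedInputTokens,
    e.outputTokens,
  ]);
}

// Follow-up style: join the same fields with a fixed separator.
// The separator should be a character that cannot occur in the fields;
// NUL ("\u0000") is an assumed choice here.
const SEP = "\u0000";
function separatorKey(e: TokenEvent): string {
  return (
    e.timestamp + SEP + e.model + SEP +
    e.inputTokens + SEP + e.cachedInputTokens + SEP + e.outputTokens
  );
}

const original: TokenEvent = {
  timestamp: "2026-05-01T10:00:00Z",
  model: "sample-model",
  inputTokens: 100,
  cachedInputTokens: 40,
  outputTokens: 20,
};
const copied: TokenEvent = { ...original }; // same event copied into a branch file
const later: TokenEvent = { ...original, timestamp: "2026-05-01T11:00:00Z" };

// Both key styles collapse the copy and keep the later event distinct.
const uniqueByJson = new Set([original, copied, later].map(jsonKey)).size;
const uniqueBySeparator = new Set([original, copied, later].map(separatorKey)).size;
```

In this sketch both sets end up with 2 entries, so the dedupe result is unchanged; only the cost of building each key differs.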

Deduplicate Codex token usage events with a session-independent fingerprint so branched or repeated session files do not count copied history more than once.

Add regression coverage for copied branch history and validate against local Codex logs, where the current parser produced thousands of duplicate token events.
@ryoppippi ryoppippi force-pushed the codex/dedupe-codex-branch-history branch from a263355 to 6a8c825 Compare May 17, 2026 02:07
@ryoppippi

Member

Updated after merging #1013 into main and rebasing this PR.

Head SHA: 6a8c825
Base includes: b438b4d (#1013)

Local validation after rebase:

  • pnpm vitest run apps/ccusage/src/adapter/codex/parser.ts -t "deduplicates copied branch history"

The perf workflow should now always write results to the Actions job summary and attempt the PR comment best-effort.

@ryoppippi ryoppippi merged commit f53bbb7 into ccusage:main May 17, 2026
15 checks passed


Development

Successfully merging this pull request may close these issues.

@ccusage/codex double-counts tokens for branched Codex Desktop conversations

3 participants