fix(global-discover): bucket codex sessions by originator by 0xDevNinja · Pull Request #1488 · garrytan/gstack · GitHub

fix(global-discover): bucket codex sessions by originator #1488

Open
0xDevNinja wants to merge 1 commit into garrytan:main from 0xDevNinja:fix/1315-codex-originator-and-cc-truncation

Conversation

@0xDevNinja

Contributor

Summary

Two patches from #1315 in one diff.

  1. Codex session bucketing. scanCodex now normalizes payload.originator into { desktop, exec, claude_code, other } and surfaces the breakdown at tools.codex.originators and per-repo codex_originators. Existing codex totals stay (additive — no consumer break).
  2. CC undercount. extractCwdFromJsonl reads 128KB instead of 8KB. Recent Claude Code / CCR JSONL files often open with a queue-operation event 30-50KB long that has no cwd — the old 8KB read truncated the line, JSON.parse failed, and the whole project dir was silently dropped. Same buffer size scanCodex already uses.
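
The truncation failure in item 2 can be reproduced in a few lines. This is a minimal sketch, not the PR's actual `extractCwdFromJsonl`: the helper names `findCwdInJsonlHead` and `readCwdFromFile` are hypothetical, and it assumes each JSONL event that knows its working directory carries a top-level string `cwd` field.

```typescript
import { openSync, readSync, closeSync } from "node:fs";

// 128KB head read, the size the PR description says scanCodex already uses.
const HEAD_BYTES = 128 * 1024;

// Hypothetical pure helper: scan the head of a JSONL file line by line and
// return the first top-level string `cwd` found, or null if there is none.
function findCwdInJsonlHead(head: string): string | null {
  for (const line of head.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const event = JSON.parse(line);
      if (event !== null && typeof event === "object" && typeof event.cwd === "string") {
        return event.cwd;
      }
    } catch {
      // A line cut off by the byte cap (or otherwise malformed) does not parse.
      // Skip it and keep looking at any later complete lines.
    }
  }
  return null;
}

// Hypothetical wrapper: read at most `maxBytes` from the start of the file.
function readCwdFromFile(filePath: string, maxBytes: number = HEAD_BYTES): string | null {
  const fd = openSync(filePath, "r");
  try {
    const buf = Buffer.alloc(maxBytes);
    const bytesRead = readSync(fd, buf, 0, maxBytes, 0);
    return findCwdInJsonlHead(buf.toString("utf8", 0, bytesRead));
  } finally {
    closeSync(fd);
  }
}
```

With an 8KB cap, a 30-50KB first line fills the whole buffer, so no complete line is ever parsed and the lookup returns null. With a 128KB cap the same first line fits, and the scan reaches the later event that carries `cwd`.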

Fixes #1315.

Why

/retro global narrated "codex was the primary execution tool, 414 sessions across 7 repos" when codex actually drove dev for one repo's middle phase. The other ~309 codex_exec entries were CC firing codex as a cross-model review subagent. A single bucket can't tell those apart.

For the CC count: @Akagilnc traced ~450 missing files in one repo's 31d window to the 8KB cap (issue thread). First-line queue-operation events are 30-50KB on recent CC versions; the parser never reached the later events that carry cwd.

Shape

// tools.codex now:
{
  "total_sessions": 414,
  "repos": 7,
  "originators": { "desktop": 92, "exec": 309, "claude_code": 13, "other": 0 }
}

// per repo:
{
  "name": "ak-ai-vela",
  "sessions": { "claude_code": 12, "codex": 98, "gemini": 0 },
  "codex_originators": { "desktop": 1, "exec": 97, "claude_code": 0, "other": 0 }
}
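
For consumers that want types for the shape above, here is a rough sketch. The interface and function names (`CodexToolSummary`, `RepoSummary`, `sumOriginators`) are illustrative assumptions and are not taken from the gstack source; only the field names and sample values come from the JSON shown.

```typescript
type OriginatorBucket = "desktop" | "exec" | "claude_code" | "other";
type OriginatorCounts = Record<OriginatorBucket, number>;

// Hypothetical type for the tools.codex object.
interface CodexToolSummary {
  total_sessions: number;
  repos: number;
  originators: OriginatorCounts;
}

// Hypothetical type for one per-repo entry.
interface RepoSummary {
  name: string;
  sessions: { claude_code: number; codex: number; gemini: number };
  codex_originators: OriginatorCounts;
}

// Sum of all four buckets; this should equal the matching codex total.
function sumOriginators(counts: OriginatorCounts): number {
  return counts.desktop + counts.exec + counts.claude_code + counts.other;
}

// Sample values copied from the shape above.
const codexTool: CodexToolSummary = {
  total_sessions: 414,
  repos: 7,
  originators: { desktop: 92, exec: 309, claude_code: 13, other: 0 },
};

const repo: RepoSummary = {
  name: "ak-ai-vela",
  sessions: { claude_code: 12, codex: 98, gemini: 0 },
  codex_originators: { desktop: 1, exec: 97, claude_code: 0, other: 0 },
};

// Parity checks: bucket counts add up to the existing totals.
const toolParity = sumOriginators(codexTool.originators) === codexTool.total_sessions;
const repoParity = sumOriginators(repo.codex_originators) === repo.sessions.codex;
```

Because the new fields are additive, a consumer that only reads `total_sessions` or `sessions.codex` is unaffected.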

The summary format shows Codex:98 (desktop=1, exec=97, cc=0) inline for each repo, plus a top-line Codex originators: ... rollup.

Originator normalization (case-insensitive, matches values observed in ~/.codex/sessions/):

  • "Codex Desktop" / "codex_desktop" → desktop
  • "codex_exec" / "codex exec" → exec
  • "Claude Code" / "claude_code" → claude_code
  • anything else → other (not dropped — future originators land here visibly until we map them)
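
The mapping above could be sketched roughly as follows. The function name `normalizeOriginator` is a hypothetical stand-in, since the PR text does not show the real implementation. The trimming and the handling of non-string values follow the commit message, which says whitespace is trimmed and that missing, null, and non-string originators land in other.

```typescript
type OriginatorBucket = "desktop" | "exec" | "claude_code" | "other";

// Hypothetical classifier: trims, lowercases, then matches the known spellings.
function normalizeOriginator(raw: unknown): OriginatorBucket {
  // Missing, null, and non-string values are counted as "other", never dropped.
  if (typeof raw !== "string") return "other";

  switch (raw.trim().toLowerCase()) {
    case "codex desktop":
    case "codex_desktop":
      return "desktop";
    case "codex_exec":
    case "codex exec":
      return "exec";
    case "claude code":
    case "claude_code":
      return "claude_code";
    default:
      // Unmapped originators stay visible in the "other" bucket.
      return "other";
  }
}
```

Returning a bucket for every input is what keeps the per-bucket counts summing to the codex total.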

Also adds a CLAUDE_PROJECTS_DIR env override on scanClaudeCode so the CC regression test can stage a fake project dir; mirrors the existing CODEX_SESSIONS_DIR knob.
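
An env override of this kind usually reduces to a small resolver. The sketch below is an assumption about how it might look, not the PR's code: `resolveClaudeProjectsDir` is a hypothetical name, and the `~/.claude/projects` fallback is assumed rather than confirmed by the PR text.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical resolver: prefer CLAUDE_PROJECTS_DIR when set and non-empty,
// otherwise fall back to an assumed default under the user's home directory.
function resolveClaudeProjectsDir(
  env: Record<string, string | undefined> = process.env,
): string {
  const override = env.CLAUDE_PROJECTS_DIR;
  if (override !== undefined && override.trim() !== "") {
    return override;
  }
  return join(homedir(), ".claude", "projects"); // assumed default location
}
```

Passing the environment as a parameter lets a test point the scanner at a staged fake project dir without mutating `process.env`.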

Out of scope

Problem 3 from the issue (annotate /retro global that "sessions = tool invocations / file count", not interactive dev) is a narrative-side change in the retro skill template. Reasonable to file separately.

Tests

7 new in test/global-discover.test.ts:

  • 'Codex Desktop' originator → desktop bucket.
  • 'codex_exec' → exec bucket.
  • 'Claude Code' → claude_code bucket.
  • Unknown originator string → other (verifies nothing is dropped).
  • Per-repo codex_originators sums to per-repo sessions.codex.
  • tools.codex.originators shape + total parity with tools.codex.total_sessions.
  • CC JSONL whose first line is a >30KB queue-operation event still resolves cwd from a later event.
bun test test/global-discover.test.ts
# 26 pass, 0 fail (19 existing + 7 new)

@Akagilnc


Codex rollouts carry a free-form payload.originator string but global
discovery collapsed every codex session into one number, so /retro
global could not tell real interactive dev (Codex Desktop) from
subagent / scripted runs (codex_exec) or CC-driven calls (Claude Code).

Add a four-bucket classifier (desktop / exec / claude_code / other).
Unknown, missing, null, and non-string originators land in other rather
than being dropped, so the per-bucket counts always sum to the codex
total. Whitespace is trimmed before matching. Buckets surface in both
the JSON output (tools.codex.originators + per-repo codex_originators)
and the summary view.

Refs garrytan#1315.
0xDevNinja force-pushed the fix/1315-codex-originator-and-cc-truncation branch from 69a2b46 to b729817 on June 16, 2026 06:35
0xDevNinja changed the title from "fix(global-discover): bucket codex by originator + read 128KB for CC cwd" to "fix(global-discover): bucket codex sessions by originator" on Jun 16, 2026


Development

Successfully merging this pull request may close these issues.

gstack-global-discover: session counts conflate originator types and undercount CC by ~5x

2 participants