fix(mothership): stop inlining full execution traces for the logs context #5353
…text

Tagging a run via "Troubleshoot in Chat" (or any @-mention of a logs context) resolved through processExecutionLogFromDb, which materialized the ENTIRE execution trace (every block's input/output, nested tool-call spans) and inlined it directly into the prompt. For any non-trivial run this repeatedly blew the context window, forcing multiple compactions and eventually auto-stopping the agent before it could investigate anything.

Every other context resolver in this file already avoids this by sending a lightweight pointer instead of a full inline dump (workflow/blocks/workflow_block contexts point into the VFS). Logs contexts have no VFS materialization to point at, but the equivalent lightweight mechanism already exists as a tool: query_logs supports incremental disclosure (overview for timing/cost, full for a scoped block's input/output, or pattern to grep the trace) and is already registered for the mothership agent.

Now processExecutionLogFromDb sends a compact summary (id, workflow, level, trigger, timing, cost) plus a note pointing the model at query_logs with the executionId, instead of materializing and embedding the trace.

Also drops the now-unused executionData column from the select projection, so resolving a logs context no longer fetches a potentially large JSONB blob it never reads.
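The shape of that compact summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ExecutionLogRow` and `LogsContextSummary` types, their field names, and the wording of the note are all assumptions; only the list of summary fields (id, workflow, level, trigger, timing, cost) and the `query_logs` + executionId pointer come from the commit message.

```typescript
// Hypothetical row shape: field names are illustrative, not the repo's schema.
interface ExecutionLogRow {
  executionId: string
  workflowName: string
  level: string
  trigger: string
  startedAt: string
  endedAt: string | null
  totalDurationMs: number | null
  costTotal: number | null
}

// Hypothetical summary shape mirroring the fields named in the commit message.
interface LogsContextSummary {
  id: string
  workflow: string
  level: string
  trigger: string
  timing: { startedAt: string; endedAt: string | null; durationMs: number | null }
  cost: number | null
  note: string
}

// Builds a bounded summary from scalar columns only. The trace itself is never
// read here; the note tells the model how to fetch details on demand.
function buildLogsContextSummary(row: ExecutionLogRow): LogsContextSummary {
  return {
    id: row.executionId,
    workflow: row.workflowName,
    level: row.level,
    trigger: row.trigger,
    timing: {
      startedAt: row.startedAt,
      endedAt: row.endedAt,
      durationMs: row.totalDurationMs,
    },
    cost: row.costTotal,
    note:
      `Full trace is not inlined. Call query_logs with executionId "${row.executionId}" ` +
      'to inspect block input/output or to grep the trace.',
  }
}
```

The point of the design is that the summary's size no longer depends on how large the run was, because nothing in it is derived from the trace.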
PR Summary
Medium Risk
Overview
A 64KB cap drops the overview if the serialized summary is still too large. Workspace and read-permission checks are unchanged; new tests cover summary shape, redaction of raw I/O, size-cap behavior, and dropped contexts.
Reviewed by Cursor Bugbot for commit dcf884e.
Greptile Summary
This PR fixes a context-window exhaustion bug in the mothership copilot: the "Troubleshoot in Chat" /
Confidence Score: 5/5
Safe to merge: the change is narrowly scoped to the logs context resolver, replaces an unbounded trace dump with a compact projection, and is covered by four targeted tests.

The core fix is straightforward: toOverview strips all input/output data from the trace before it reaches the prompt, the 64 KB size cap guards against pathological span counts, and the existing query_logs tool provides a clean on-demand escape hatch. Authorization checks are unchanged. The only gap is a mismatch between what the PR description claims (that the executionData DB column was dropped) and what the code actually does (the column is still selected and used to build the overview), but this has no runtime impact.

No files require special attention: both changed files are internally consistent and the tests exercise the critical paths.

Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant U as User (Chat)
participant PC as processExecutionLogFromDb
participant DB as Database
participant OS as Object Storage (optional)
participant LM as LLM Agent
U->>PC: "@logs context (executionId)"
PC->>DB: "SELECT id, workflowId, ..., executionData WHERE executionId=?"
DB-->>PC: log row (incl. executionData JSONB)
PC->>PC: authorizeWorkflowByWorkspacePermission
PC->>OS: materializeExecutionData (resolve pointer if offloaded)
OS-->>PC: "{ traceSpans[] }"
PC->>PC: toOverview(traceSpans) → compact tree (no input/output)
PC->>PC: "size cap: if JSON > 64KB, drop overview"
PC-->>LM: "compact summary { id, level, trigger, timing, cost, overview, note→query_logs }"
LM->>LM: call query_logs(executionId, view:'full', blockId) on demand
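
The last step of the diagram, where the agent pulls detail on demand instead of receiving it up front, can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `Span` shape, the `Query` union, and `queryTraceSketch` are invented for illustration and are not the real `query_logs` tool or its argument schema. Only the three view names (overview, full, pattern) and the blockId scoping are taken from the PR text.

```typescript
// Hypothetical span shape for illustration only.
interface Span {
  blockId: string
  name: string
  status: string
  durationMs: number
  input?: unknown
  output?: unknown
  children?: Span[]
}

// Hypothetical request shape: one variant per disclosure level.
type Query =
  | { view: 'overview' }
  | { view: 'full'; blockId: string }
  | { view: 'pattern'; pattern: string }

// Depth-first flattening of the span tree.
function flatten(spans: Span[]): Span[] {
  const out: Span[] = []
  for (const span of spans) {
    out.push(span)
    if (span.children) out.push(...flatten(span.children))
  }
  return out
}

function queryTraceSketch(spans: Span[], query: Query): unknown {
  const all = flatten(spans)
  switch (query.view) {
    case 'overview':
      // Cheap view: identity, status, and timing; no payloads.
      return all.map(({ blockId, name, status, durationMs }) => ({
        blockId,
        name,
        status,
        durationMs,
      }))
    case 'full': {
      // Scoped view: payloads for exactly one block.
      const hit = all.filter((s) => s.blockId === query.blockId)[0]
      return hit ? { blockId: hit.blockId, input: hit.input, output: hit.output } : null
    }
    case 'pattern': {
      // Grep view: ids of blocks whose serialized payload matches the pattern.
      const re = new RegExp(query.pattern, 'i')
      return all
        .filter((s) => re.test(JSON.stringify({ input: s.input, output: s.output })))
        .map((s) => s.blockId)
    }
  }
}
```

With this split, the cost of each call is proportional to what the agent asked for, which is the property the PR relies on when it replaces the inline trace with a pointer.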
Reviews (2): Last reviewed commit: "improvement(mothership): send a bounded ..."
…are tool pointer

Follow-up to the previous commit's fix (stop inlining full execution traces). A pure text pointer telling the model to call query_logs made the agent's very first useful action against a tagged run contingent on it noticing and correctly acting on prose in a JSON blob it may only skim. Every sibling resolver in this file instead returns a deterministic mechanism (a VFS path) the model reads on demand.

There's no VFS materialization for individual execution logs, but the same deterministic signal is available cheaply: toOverview() (the exact projection query_logs's own "overview" view already returns) walks the raw trace spans and produces a compact tree (block name/type/status/timing/cost, no input or output) without touching large-value refs at all. The summary now includes that tree, so the model can see which block failed on the first turn, and the note narrows to what still requires a tool call: a block's actual input/output/error, or a grep.

materializeExecutionData is still called, but it's a no-op for the common inline case (it only unwraps a top-level object-storage pointer for runs whose whole trace was offloaded as one blob) and was needed to reach traceSpans at all for those heavier runs, which are exactly the runs most worth an overview. A serialized-size cap (mirroring query-logs.ts's own truncation fallback, scaled down since this lands in the prompt unconditionally) drops the overview if a pathological span count pushes it over budget, falling back to the note alone.

Extends the tests: the happy path now asserts the overview tree is present and that no raw input/output payload leaks into the serialized summary, plus a new test for the size-cap fallback.
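
The projection and the size-cap fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's `toOverview`: the `TraceSpan`, `OverviewNode`, and `PromptSummary` shapes, the names `toOverviewSketch` and `attachOverview`, and the choice to measure the cap in characters of the JSON string are all hypothetical. Only the 64KB figure and the "drop the overview, keep the note" behavior come from the PR.

```typescript
// Hypothetical raw span: carries payloads that must not reach the prompt.
interface TraceSpan {
  name: string
  type: string
  status: string
  durationMs: number
  cost?: number
  input?: unknown
  output?: unknown
  children?: TraceSpan[]
}

// Hypothetical overview node: structure and metrics only.
interface OverviewNode {
  name: string
  type: string
  status: string
  durationMs: number
  cost?: number
  children?: OverviewNode[]
}

interface PromptSummary {
  id: string
  note: string
  overview?: OverviewNode[]
}

// Assumption: the cap is approximated as JSON string length, not encoded bytes.
const MAX_SUMMARY_CHARS = 64 * 1024

// Copies an explicit allow-list of fields, so input/output are never carried over.
function toOverviewSketch(spans: TraceSpan[]): OverviewNode[] {
  return spans.map((span) => {
    const node: OverviewNode = {
      name: span.name,
      type: span.type,
      status: span.status,
      durationMs: span.durationMs,
    }
    if (span.cost !== undefined) node.cost = span.cost
    if (span.children && span.children.length > 0) {
      node.children = toOverviewSketch(span.children)
    }
    return node
  })
}

// Attaches the overview unless the serialized result exceeds the budget,
// in which case only the base summary (with its note) is returned.
function attachOverview(
  base: PromptSummary,
  spans: TraceSpan[],
  maxChars: number = MAX_SUMMARY_CHARS
): PromptSummary {
  const withOverview: PromptSummary = { ...base, overview: toOverviewSketch(spans) }
  if (JSON.stringify(withOverview).length > maxChars) {
    return { id: base.id, note: base.note }
  }
  return withOverview
}
```

Using an allow-list rather than deleting known payload keys means a span field added later is excluded by default, which keeps the summary bounded without revisiting this code.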
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit dcf884e.

Summary
- "Troubleshoot in Chat" (or any `logs` @-mention) resolved the tagged run through `processExecutionLogFromDb`, which materialized the entire execution trace (every block's input/output, nested tool-call spans) and inlined it directly into the prompt. For any non-trivial run this repeatedly blew the context window, forcing multiple "Context Compaction" cycles and eventually auto-stopping the agent before it investigated anything.
- Every other context resolver in `process-contents.ts` already avoids this by sending a lightweight pointer instead of a full inline dump (`workflow`/`blocks`/`workflow_block` point into the VFS). Logs contexts had no VFS materialization to point at, but the equivalent lightweight mechanism already exists as a tool: `query_logs` supports incremental disclosure (`overview` for the timing/cost tree, `full` for a scoped block's input/output, `pattern` to grep the trace) and is already registered for the mothership agent, so no new tool is needed.
- `processExecutionLogFromDb` now sends a compact summary (id, workflow, level, trigger, timing, cost) plus a note pointing the model at `query_logs` with the `executionId`, instead of materializing and embedding the trace.
- Also drops the now-unused `executionData` column from the select projection: resolving a logs context no longer fetches a potentially large JSONB blob it never reads.

Type of Change

Testing

- `process-contents.test.ts`: a tagged run resolves to a compact summary that explicitly does not contain `traceSpans`/`errorDetails`/`executionData`, and does contain a `query_logs` + executionId pointer; cross-workspace and unauthorized-access cases still drop the context as before.
- `type-check`, `biome`, `check:client-boundary`, `check:api-validation` all pass.

Checklist