{{ message }}
feat(shell): stream_output out-of-turn (PR 2/2) — stacked on #25825#25834
Draft
pmenic wants to merge 21 commits intogoogle-gemini:mainfrom
Draft
feat(shell): stream_output out-of-turn (PR 2/2) — stacked on #25825#25834pmenic wants to merge 21 commits intogoogle-gemini:mainfrom
pmenic wants to merge 21 commits intogoogle-gemini:mainfrom
Conversation
Splits incoming text chunks into complete lines for use by the shell tool's upcoming stream_output path. Handles CRLF / LF, preserves lone CR (so progress-bar redraws don't fragment), caps emitted lines at 64 KiB with a truncation marker, and discards the rest of an over-sized line until the next newline to avoid emitting many fragments. Infrastructure-only: no caller yet.
Adds an optional boolean field to the shell tool's TypeScript params interface. No runtime behavior change yet; subsequent commits will wire the schema declaration, the ACP updateOutput bridge, and the background stdout listener. Field is documented as active only in combination with is_background.
Adds SHELL_PARAM_STREAM_OUTPUT to base-declarations and wires it into getShellDeclaration() so the model sees the new parameter in the tool's JSON schema. Description makes clear it is effective only together with is_background, sends each stdout line as an incremental update, and that the stream is scoped to the current turn (full output still reachable via read_background_output). Schema snapshots refreshed. No execution-path change yet; subsequent commits wire the ACP bridge and the background stdout listener.
Previously acpClient.runTool() called invocation.execute({ abortSignal })
with no updateOutput callback, so any incremental output from tools was
silently dropped before reaching the ACP client.
Now runTool() builds a closure over the current callId and displayTitle
that emits a tool_call_update(status: 'in_progress') for each string
update. Non-string live outputs (AnsiOutput, SubagentProgress, empty
strings) are filtered out to avoid garbage on the wire.
This is the plumbing the upcoming stream_output path depends on, and as
a side effect enables live progress streaming for the shell tool's
existing foreground updateOutput callers (e.g. binary-output notices,
periodic throttled progress).
…ll tool Adds an optional numeric field on ToolResult that a tool can use to tell the ACP layer "I'm handing off to a long-running background stream; hold the tool call in in_progress rather than emitting completed immediately." The shell tool populates it with the child pid when its early-return background branch fires AND stream_output is true. When stream_output is absent or false, the field stays undefined and the existing "Background process started with PID X" flow is unchanged. The ACP wrapper does not react to this field yet; the next commit wires it to defer the terminal tool_call_update until the process actually exits or the turn is aborted.
When a tool returns `backgroundedStreamId`, the ACP wrapper now emits an
in_progress update instead of the terminal completed one and registers
two cleanup paths:
1. ExecutionLifecycleService.onBackgroundComplete — fires when the
real process exits (error flag routes to failed vs completed).
2. abortSignal 'abort' — fires when the turn is cancelled.
First path wins; the other is deregistered to avoid leaks. This keeps
the ACP tool call open so subsequent streamed stdout lines can keep
emitting in_progress updates against the same toolCallId during the
turn, without ever violating the "no updates after terminal status"
contract.
The chat history still records the tool as completed immediately with
the "Command is running in background. PID: X" response, so the model
sees the tool call as done on its next turn. Only the ACP protocol view
is held open.
When stream_output is set together with is_background, the shell tool now subscribes to ExecutionLifecycleService for the spawned pid before the 200ms backgrounding delay fires — guaranteeing no early chunks are lost. Every stdout chunk is fed through a LineBuffer that preserves partial lines across chunk boundaries; completed lines are forwarded to the caller's updateOutput callback (which the ACP wrapper turns into tool_call_update events, per the previous commit). The subscriber auto-tears-down on the exit event and on abortSignal, flushing any trailing partial line via LineBuffer.flush() so nothing is dropped. When stream_output is false or the caller provided no updateOutput, no subscription happens and behavior is byte-identical to before. Batching / debouncing is intentionally NOT added here — the next commit introduces the 200ms batch window.
Background stdout bursts (e.g. a watcher firing ten NEW: events at once) would otherwise turn into ten separate tool_call_update ACP notifications in the same millisecond. Insert a single-shot 200ms timer between the LineBuffer and updateOutput; lines arriving inside the window are joined with \n and emitted as one update when the timer fires. Abort and process-exit teardown paths force an immediate flush so the final partial line and any pending batch are never lost. STREAM_OUTPUT_BATCH_MS matches Claude Code's Monitor-tool reference.
Adds two regression tests covering the two expected teardown paths for the stream_output listener: 1. abort: an AbortController.abort() mid-stream must flush any pending batched line IMMEDIATELY (no waiting out the 200ms window) and unsubscribe so later emits are ignored. 2. exit: when the process emits its exit event, any trailing partial line buffered in the LineBuffer (no terminating \n) must still be delivered via updateOutput — otherwise the final fragment is lost. Both paths were already implemented; these tests lock the behavior against regression.
…d_output
Extends the Background Tools integration test with a case that spawns a
real node subprocess emitting three lines 50ms apart, subscribes to
ExecutionLifecycleService the way shell.ts's stream_output path does,
and asserts:
1. The three lines arrive in order via the live subscriber,
line-buffered across arbitrary chunk boundaries.
2. read_background_output against the same pid still returns all
three lines — confirming our live subscriber is a PARALLEL
observer of ExecutionLifecycleService and does not consume or
interfere with the disk-log WriteStream that backs
read_background_output.
Uses the child_process path (shouldUseNodePty: false) because
stream_output only forwards string chunks today; PTY-mode AnsiOutput
fragments are intentionally skipped by the LineBuffer path.
Adds the stream_output argument to the Technical reference table and a
new "Streaming background output" section covering:
- When it takes effect (requires is_background: true)
- Real-world example matching ConsultaSkill / CI-watching use cases
- 200ms batching window with flush-on-abort / flush-on-exit semantics
- Coexistence with read_background_output (parallel observer, not a
consumer; disk log is unchanged)
- Line buffering and 64 KiB per-line cap
- Turn-scoped limitation (live stream ends at turn end; process keeps
running) plus a forward reference that async-on-idle is a follow-up
- PTY-mode AnsiOutput is skipped; pipe through grep --line-buffered to
force plain-text lines if needed
Standalone tsx-runnable script that spawns a real background node process emitting 5 lines one second apart, subscribes to ExecutionLifecycleService exactly the way shell.ts's stream_output path does, and logs each line as it arrives with a timestamp — so reviewers can *see* the live line forwarding without needing to spin up a full ACP client. Run from repo root: npx tsx scripts/demo-stream-output.ts Expected output is five 'live: line<N>' messages ~1s apart, followed by the exit event. This validates the subscribe + LineBuffer + exit cleanup path end-to-end on a real child process, complementing the mocked unit tests and the disk-log-coexistence integration test.
The original implementation only set ToolResult.backgroundedStreamId on the early-return path (shell.ts line ~625 "Command is running in background" block). That path only fires when the resolved result promise has NOT yet marked `completed = true` by the time the BACKGROUND_DELAY_MS check runs. In the real ShellExecutionService.background() call path (and the default test mock), background() synchronously resolves the result promise with `backgrounded: true`, so `completed` is already true at the check and the tool falls through to `await resultPromise` → the "Command moved to background (PID: N). Output hidden. Press Ctrl+B to view." branch. That branch was NOT setting backgroundedStreamId, which meant in production the ACP wrapper emitted the terminal `completed` status before any streamed lines could land — the whole stream_output feature silently did nothing. Detected via an end-to-end test through Zed: the ACP wiretap showed every backgrounded run emit a tool_call_update(completed) immediately, with text "Command moved to background..." and no subsequent tool_call_update(in_progress) carrying streamed stdout. Fix: add a local `backgroundedStreamId` variable, set it to `result.pid` inside the existing `(this.params.is_background || result.backgrounded)` branch when stream_output is true, and include it in both final return statements (normal + summarize). Adds a regression test for the fall-through path with the exact "Command moved to background..." message.
node-pty emits AnsiOutput terminal-redraw snapshots rather than decoded strings, so the stream_output LineBuffer — which only processes string chunks — received nothing on Windows where enableInteractiveShell is commonly set to true. When both is_background and stream_output are true, explicitly pass shouldUseNodePty: false to ShellExecutionService.execute() regardless of the user's interactive-shell setting. Background processes cannot accept interactive input anyway, and the child_process path delivers the same stdout content as plain decoded strings — exactly what the LineBuffer-driven forwarder expects. The user's enableInteractiveShell setting is otherwise untouched. Discovered via Zed end-to-end test: the wiretap showed tool_call_update(in_progress) with backgroundedStreamId now firing (a previous commit's win) but no per-line streamed updates arriving, because every data event carried an AnsiOutput chunk that our listener silently skipped.
ExecutionLifecycleService notifies our stream_output subscribe listener (shell.ts teardownStream, synchronously) and the onBackgroundComplete listener (this file, synchronously) from the same process-exit notification. The former flushes a final tool_call_update(in_progress) through updateOutput->sendUpdate (async); the latter was emitting tool_call_update(status: completed) immediately — meaning the two async sendUpdate calls race, and the terminal status could land before the final streamed line. Strict ACP clients may drop updates arriving after a terminal status, so we now setImmediate() the terminal emit inside the completion listener. This yields one macrotask, letting any pending stream flush propagate first. Verified via Zed end-to-end test with a bash command emitting 5 lines one second apart: previously the 5th line landed ~1 ms after the completed event; with this fix the 5 in_progress updates always precede the terminal completed.
Addresses the four high-priority findings from the Gemini Code Assist bot review on google-gemini#25825: 1. `LineBuffer` truncation split multi-unit Unicode characters (emoji, astral plane) because it used UTF-16 `.length` / `.slice`. Rename `maxLineBytes` → `maxLineCodePoints` and slice via `Array.from(...)`. Truncation now guarantees the emitted prefix is a valid Unicode string — no stray high-surrogates. Two new regression tests exercise the case for both the newline-present and buffer- overflow code paths. 2. The `updateOutput` closure in `acpClient.ts` used `void this.sendUpdate(...)` which silently swallows rejections. Replace with `.catch(err => debugLogger.error(...))` so transport / serialization errors surface and don't become unhandled-rejection process crashes. 3. Same fix for the terminal `sendUpdate` inside `emitTerminal` (the backgroundedStreamId path). Rejections here would previously have dropped a `completed` event unnoticed, leaving the tool call stuck `in_progress` client-side. Ordering on the wire remains preserved by Node's stdout stream serializing writes in insertion order; no additional promise-chain is required. Sequencing of incremental-vs-terminal updates is already guarded by the existing `setImmediate` in the completion listener. All 21 LineBuffer tests + 67 shell tests + 65 acpClient tests pass.
…ndaries
Both shell.ts (the LineBuffer-driven forwarder) and acpClient.ts (the
terminal status emitter) used to tear down a stream_output stream as
soon as the spawning turn's abortSignal fired. That meant a new user
prompt — which aborts the previous turn — would prematurely close a
background stream whose process is still producing output the model
may want to react to in a future turn.
PR 2 removes those turn-abort teardown paths:
- shell.ts no longer registers `signal.addEventListener('abort', …)`
for stream_output mode. The subscribe listener tears down only on
the process's own `exit` event.
- acpClient.ts runTool no longer registers the abort handler that
fired the terminal tool_call_update(completed). Terminal status
now fires exclusively on onBackgroundComplete (real process exit).
Session-close cleanup will land in a follow-up commit.
This is the minimum change to support the ConsultaSkill use case
(agent reacts to background file-watcher events in a later turn): the
sessionUpdate() notification API is already documented to work outside
an active prompt and Zed's AcpThread tolerates out-of-turn updates,
so no ACP protocol extensions are required.
Test "flushes pending stream_output lines and unsubscribes on abort"
— which locked in the PR 1 turn-end teardown — is replaced by
"stream_output listener outlives the spawning turn abort" which
verifies: lines emitted AFTER abort still propagate; only the 'exit'
event finally tears the listener down.
Extends ExecuteOptions.updateOutput with an optional second `_meta`
argument that the ACP layer forwards verbatim to the tool_call_update
`_meta` field. ACP spec explicitly reserves `_meta` for
client/agent-specific metadata (types.gen.d.ts: "MUST NOT make
assumptions about values at these keys" + vendor-prefix convention).
The shell tool's stream_output path now tags every streamed batch with
`{ 'gemini-cli/stream_output': true, pid: <number> }`. Sidecars
(e.g. the ConsultaSkill whisper-queue) can filter on this marker to
distinguish background-stream events from normal in-turn progress
updates (binary-detection notices, periodic throttled foreground
output), without parsing the text or the tool name.
Non-shell / non-stream-output updates continue to omit `_meta`
entirely — the field is strictly additive.
Replaces the "stream is tied to the current turn" bullet with the new PR 2 behavior: streams outlive the spawning turn and keep emitting in_progress tool_call_update events against the original toolCallId until the process actually exits. Documents the `_meta['gemini-cli/stream_output']` marker so sidecar authors know how to filter. Adds a new "Reacting automatically while the agent is idle" section that explains the ACP protocol limit (agents cannot initiate turns) and the recommended sidecar pattern (FIFO buffer + consolidated re-prompt on turn end) for closing the ConsultaSkill-style end-to-end loop without upstream protocol changes. Points at the existing acp-wiretap.js as a starting point.
Adds the missed regression test for PR 2's core semantic: when a tool returns backgroundedStreamId, the ACP wrapper must NOT emit a terminal tool_call_update on the turn's abortSignal. The test makes a tool return backgroundedStreamId, snapshots the updates emitted so far, yields a macrotask, and asserts zero new updates arrived — in particular, no status:'completed'. This locks in the "stream outlives the spawning turn" contract against future regressions that might restore the abortHandler wiring. Missed in the feat/stream-output-async commit sequence because I moved to the docs commit without git-adding this file.
2170b9b to
a7b347e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Summary
Follow-up to #25825 (PR 1). Makes
stream_output: trueevents keep flowingafter the spawning turn ends, so a client-side sidecar can react to
background file-watcher events even when no user prompt is in flight — the
ConsultaSkill use case on top of issue #25803.
Details
Four new commits on top of #25825 (branch
pmenic:feat/stream-output-async):feat(stream_output): detach stream from turn abort — survive turn boundaries— removes the turn-scopedabortSignalteardown listenersfrom both
shell.ts(LineBuffer-driven forwarder) andacpClient.ts(terminal-status emitter). Teardown now happens exclusively on the
process's own
exitevent. A new user prompt (which aborts theprevious
pendingSend) no longer prematurely closes a stream whoseprocess is still producing output the model may want to react to
later.
feat(stream_output): tag tool_call_update with _meta marker—extends
ExecuteOptions.updateOutputwith an optional second_metaarg that the ACP layer forwards verbatim to
tool_call_update._meta. The shell tool's stream_output path tagsevery streamed batch with
{ 'gemini-cli/stream_output': true, pid: <number> }so sidecars canfilter the ACP wire precisely for stream events (vs binary-detection
notices, periodic throttled progress, etc.) without parsing text or
relying on tool-name heuristics. ACP spec explicitly reserves
_metafor exactly this
(
types.gen.d.ts:"MUST NOT make assumptions about values at these keys").
docs(shell): update stream_output docs for PR 2 out-of-turn semantics— documents the new lifecycle (_metamarker, streamsoutlive spawning turn, terminal emit only on real process exit) and
adds a "Reacting automatically while the agent is idle" section
explaining the ACP protocol limit (agents cannot initiate turns) and
the recommended client-side FIFO-queue sidecar pattern. No upstream
ACP extension required.
test(cli/acp): PR 2 regression — no terminal emit on turn abort— locks in the core PR 2 contract: a tool that returns
backgroundedStreamIdmust not get a terminaltool_call_updateonturn-abort. Test snapshots emitted updates, yields a macrotask, and
asserts zero new updates arrived.
Why no new ACP protocol work
Research against the ACP SDK and Zed's AcpThread implementation:
AgentSideConnection.sessionUpdate()is documented as a notificationendpoint for "real-time updates about session progress"
(
acp.d.ts:33-45) and the SDK imposes no runtime constraint requiringan active
prompt().AcpThreaddispatchesSessionUpdateevents asynchronouslythrough its channel-based event stream even outside active turns.
promptturn. Theprotocol is strictly client-driven. Closing the "agent reacts while
idle" loop therefore has to live on the client side (e.g. in a
sidecar process that sits between Zed and gemini-cli, observes the
wire, filters on
_meta['gemini-cli/stream_output'], buffers eventsin a FIFO queue while the agent is busy, and flushes them as a
consolidated
session/promptwhen idle).This PR is the agent-side half — the client-side sidecar lives outside
gemini-cli (e.g. in
pmenic/ConsultaSkill).Related Issues
Closes part of #25803 (together with #25825 this is the full plumbing
contribution from gemini-cli).
How to Validate
End-to-end verified via Zed 0.233.5 on Windows (2026-04-23):
TICK:1..30every 2 seconds(~60 seconds total) with
stream_output: truein turn 1.launchedand ended turn 1 (per explicit promptinstruction "do not read output or call any other tools").
still emitting.
scripts/acp-wiretap.jstapping the stdio between Zedand the local bundle) showed:
session/promptid=2 at line 23 (watcher launch)tool_call_update(status: in_progress)withcontent[0].content.text=TICK:Nand_meta: { "gemini-cli/stream_output": true, "pid": 23748 }session/promptid=3 at line 62 (haiku) — TICK:3..TICK:9 wereemitted AFTER this prompt arrived, referencing the ORIGINAL
toolCallId
session/promptid=4 at line 92 (haiku re-sent) — TICK:10..TICK:30continued emitting during this third turn
_metamarker.Plus all unit tests green:
Pre-Merge Checklist
docs/tools/shell.md1 new regression test (acpClient) + 1 new test (
updateOutput _metaforwarding)
stream_outputis opt-in, existingbehavior unchanged when absent or when
is_background: falsenpm run+ end-to-end Zed 30-TICK cross-turn test