feat(shell): stream_output out-of-turn (PR 2/2) — stacked on #25825 by pmenic · Pull Request #25834 · google-gemini/gemini-cli · GitHub
Skip to content

feat(shell): stream_output out-of-turn (PR 2/2) — stacked on #25825#25834

Draft
pmenic wants to merge 21 commits intogoogle-gemini:mainfrom
pmenic:feat/stream-output-async
Draft

feat(shell): stream_output out-of-turn (PR 2/2) — stacked on #25825#25834
pmenic wants to merge 21 commits intogoogle-gemini:mainfrom
pmenic:feat/stream-output-async

Conversation

@pmenic
Copy link
Copy Markdown

@pmenic pmenic commented Apr 22, 2026

Summary

Follow-up to #25825 (PR 1). Makes stream_output: true events keep flowing
after the spawning turn ends
, so a client-side sidecar can react to
background file-watcher events even when no user prompt is in flight — the
ConsultaSkill use case on top of issue #25803.

⚠️ Stacked on top of #25825. The diff visible here includes all of
PR 1's commits plus 4 new ones. Review only the commits listed in
"Details" below
, or wait until #25825 merges and I'll rebase this onto
main for a clean diff.

Details

Four new commits on top of #25825 (branch pmenic:feat/stream-output-async):

  1. feat(stream_output): detach stream from turn abort — survive turn boundaries — removes the turn-scoped abortSignal teardown listeners
    from both shell.ts (LineBuffer-driven forwarder) and acpClient.ts
    (terminal-status emitter). Teardown now happens exclusively on the
    process's own exit event. A new user prompt (which aborts the
    previous pendingSend) no longer prematurely closes a stream whose
    process is still producing output the model may want to react to
    later.

  2. feat(stream_output): tag tool_call_update with _meta marker
    extends ExecuteOptions.updateOutput with an optional second _meta
    arg that the ACP layer forwards verbatim to
    tool_call_update._meta. The shell tool's stream_output path tags
    every streamed batch with
    { 'gemini-cli/stream_output': true, pid: <number> } so sidecars can
    filter the ACP wire precisely for stream events (vs binary-detection
    notices, periodic throttled progress, etc.) without parsing text or
    relying on tool-name heuristics. ACP spec explicitly reserves _meta
    for exactly this
    (types.gen.d.ts:
    "MUST NOT make assumptions about values at these keys").

  3. docs(shell): update stream_output docs for PR 2 out-of-turn semantics — documents the new lifecycle (_meta marker, streams
    outlive spawning turn, terminal emit only on real process exit) and
    adds a "Reacting automatically while the agent is idle" section
    explaining the ACP protocol limit (agents cannot initiate turns) and
    the recommended client-side FIFO-queue sidecar pattern. No upstream
    ACP extension required.

  4. test(cli/acp): PR 2 regression — no terminal emit on turn abort
    — locks in the core PR 2 contract: a tool that returns
    backgroundedStreamId must not get a terminal tool_call_update on
    turn-abort. Test snapshots emitted updates, yields a macrotask, and
    asserts zero new updates arrived.

Why no new ACP protocol work

Research against the ACP SDK and Zed's AcpThread implementation:

  • AgentSideConnection.sessionUpdate() is documented as a notification
    endpoint
    for "real-time updates about session progress"
    (acp.d.ts:33-45) and the SDK imposes no runtime constraint requiring
    an active prompt().
  • Zed's AcpThread dispatches SessionUpdate events asynchronously
    through its channel-based event stream even outside active turns.
  • What ACP cannot do: agents cannot initiate a prompt turn. The
    protocol is strictly client-driven. Closing the "agent reacts while
    idle" loop therefore has to live on the client side (e.g. in a
    sidecar process that sits between Zed and gemini-cli, observes the
    wire, filters on _meta['gemini-cli/stream_output'], buffers events
    in a FIFO queue while the agent is busy, and flushes them as a
    consolidated session/prompt when idle).

This PR is the agent-side half — the client-side sidecar lives outside
gemini-cli (e.g. in pmenic/ConsultaSkill).

Related Issues

Closes part of #25803 (together with #25825 this is the full plumbing
contribution from gemini-cli).

How to Validate

End-to-end verified via Zed 0.233.5 on Windows (2026-04-23):

  1. Launched a PowerShell watcher emitting TICK:1..30 every 2 seconds
    (~60 seconds total) with stream_output: true in turn 1.
  2. Model replied launched and ended turn 1 (per explicit prompt
    instruction "do not read output or call any other tools").
  3. User sent a second prompt ("Tell me a haiku") while the process was
    still emitting.
  4. Wiretap log (scripts/acp-wiretap.js tapping the stdio between Zed
    and the local bundle) showed:
    • session/prompt id=2 at line 23 (watcher launch)
    • 30× tool_call_update(status: in_progress) with
      content[0].content.text = TICK:N and
      _meta: { "gemini-cli/stream_output": true, "pid": 23748 }
    • session/prompt id=3 at line 62 (haiku) — TICK:3..TICK:9 were
      emitted AFTER this prompt arrived, referencing the ORIGINAL
      toolCallId
    • session/prompt id=4 at line 92 (haiku re-sent) — TICK:10..TICK:30
      continued emitting during this third turn
    • Every event carried the _meta marker.

Plus all unit tests green:

npx vitest run packages/core/src/utils/lineBuffer.test.ts
npx vitest run packages/core/src/tools/shell.test.ts
npx vitest run packages/cli/src/acp/acpClient.test.ts
npx vitest run packages/core/src/tools/shellBackgroundTools.integration.test.ts

Pre-Merge Checklist

  • Updated relevant documentation — docs/tools/shell.md
  • Added/updated tests — 1 unit test adjusted for new PR 2 semantics +
    1 new regression test (acpClient) + 1 new test (updateOutput _meta
    forwarding)
  • Noted breaking changes — none; stream_output is opt-in, existing
    behavior unchanged when absent or when is_background: false
  • Validated on required platforms:
    • MacOS (not run locally)
    • Windows — npm run + end-to-end Zed 30-TICK cross-turn test
    • Linux (trusted to CI)

pmenic and others added 21 commits April 22, 2026 20:34
Splits incoming text chunks into complete lines for use by the shell tool's
upcoming stream_output path. Handles CRLF / LF, preserves lone CR (so
progress-bar redraws don't fragment), caps emitted lines at 64 KiB with a
truncation marker, and discards the rest of an over-sized line until the
next newline to avoid emitting many fragments.

Infrastructure-only: no caller yet.
Adds an optional boolean field to the shell tool's TypeScript params
interface. No runtime behavior change yet; subsequent commits will wire
the schema declaration, the ACP updateOutput bridge, and the background
stdout listener. Field is documented as active only in combination with
is_background.
Adds SHELL_PARAM_STREAM_OUTPUT to base-declarations and wires it into
getShellDeclaration() so the model sees the new parameter in the tool's
JSON schema. Description makes clear it is effective only together with
is_background, sends each stdout line as an incremental update, and that
the stream is scoped to the current turn (full output still reachable
via read_background_output). Schema snapshots refreshed.

No execution-path change yet; subsequent commits wire the ACP bridge and
the background stdout listener.
Previously acpClient.runTool() called invocation.execute({ abortSignal })
with no updateOutput callback, so any incremental output from tools was
silently dropped before reaching the ACP client.

Now runTool() builds a closure over the current callId and displayTitle
that emits a tool_call_update(status: 'in_progress') for each string
update. Non-string live outputs (AnsiOutput, SubagentProgress, empty
strings) are filtered out to avoid garbage on the wire.

This is the plumbing the upcoming stream_output path depends on, and as
a side effect enables live progress streaming for the shell tool's
existing foreground updateOutput callers (e.g. binary-output notices,
periodic throttled progress).
…ll tool

Adds an optional numeric field on ToolResult that a tool can use to tell
the ACP layer "I'm handing off to a long-running background stream; hold
the tool call in in_progress rather than emitting completed immediately."

The shell tool populates it with the child pid when its early-return
background branch fires AND stream_output is true. When stream_output is
absent or false, the field stays undefined and the existing
"Background process started with PID X" flow is unchanged.

The ACP wrapper does not react to this field yet; the next commit wires
it to defer the terminal tool_call_update until the process actually
exits or the turn is aborted.
When a tool returns `backgroundedStreamId`, the ACP wrapper now emits an
in_progress update instead of the terminal completed one and registers
two cleanup paths:

  1. ExecutionLifecycleService.onBackgroundComplete — fires when the
     real process exits (error flag routes to failed vs completed).
  2. abortSignal 'abort' — fires when the turn is cancelled.

First path wins; the other is deregistered to avoid leaks. This keeps
the ACP tool call open so subsequent streamed stdout lines can keep
emitting in_progress updates against the same toolCallId during the
turn, without ever violating the "no updates after terminal status"
contract.

The chat history still records the tool as completed immediately with
the "Command is running in background. PID: X" response, so the model
sees the tool call as done on its next turn. Only the ACP protocol view
is held open.
When stream_output is set together with is_background, the shell tool
now subscribes to ExecutionLifecycleService for the spawned pid before
the 200ms backgrounding delay fires — guaranteeing no early chunks are
lost. Every stdout chunk is fed through a LineBuffer that preserves
partial lines across chunk boundaries; completed lines are forwarded to
the caller's updateOutput callback (which the ACP wrapper turns into
tool_call_update events, per the previous commit).

The subscriber auto-tears-down on the exit event and on abortSignal,
flushing any trailing partial line via LineBuffer.flush() so nothing is
dropped. When stream_output is false or the caller provided no
updateOutput, no subscription happens and behavior is byte-identical to
before.

Batching / debouncing is intentionally NOT added here — the next commit
introduces the 200ms batch window.
Background stdout bursts (e.g. a watcher firing ten NEW: events at once)
would otherwise turn into ten separate tool_call_update ACP notifications
in the same millisecond. Insert a single-shot 200ms timer between the
LineBuffer and updateOutput; lines arriving inside the window are joined
with \n and emitted as one update when the timer fires.

Abort and process-exit teardown paths force an immediate flush so the
final partial line and any pending batch are never lost.

STREAM_OUTPUT_BATCH_MS matches Claude Code's Monitor-tool reference.
Adds two regression tests covering the two expected teardown paths for
the stream_output listener:

1. abort: an AbortController.abort() mid-stream must flush any pending
   batched line IMMEDIATELY (no waiting out the 200ms window) and
   unsubscribe so later emits are ignored.

2. exit: when the process emits its exit event, any trailing partial
   line buffered in the LineBuffer (no terminating \n) must still be
   delivered via updateOutput — otherwise the final fragment is lost.

Both paths were already implemented; these tests lock the behavior
against regression.
…d_output

Extends the Background Tools integration test with a case that spawns a
real node subprocess emitting three lines 50ms apart, subscribes to
ExecutionLifecycleService the way shell.ts's stream_output path does,
and asserts:

  1. The three lines arrive in order via the live subscriber,
     line-buffered across arbitrary chunk boundaries.
  2. read_background_output against the same pid still returns all
     three lines — confirming our live subscriber is a PARALLEL
     observer of ExecutionLifecycleService and does not consume or
     interfere with the disk-log WriteStream that backs
     read_background_output.

Uses the child_process path (shouldUseNodePty: false) because
stream_output only forwards string chunks today; PTY-mode AnsiOutput
fragments are intentionally skipped by the LineBuffer path.
Adds the stream_output argument to the Technical reference table and a
new "Streaming background output" section covering:

  - When it takes effect (requires is_background: true)
  - Real-world example matching ConsultaSkill / CI-watching use cases
  - 200ms batching window with flush-on-abort / flush-on-exit semantics
  - Coexistence with read_background_output (parallel observer, not a
    consumer; disk log is unchanged)
  - Line buffering and 64 KiB per-line cap
  - Turn-scoped limitation (live stream ends at turn end; process keeps
    running) plus a forward reference that async-on-idle is a follow-up
  - PTY-mode AnsiOutput is skipped; pipe through grep --line-buffered to
    force plain-text lines if needed
Standalone tsx-runnable script that spawns a real background node
process emitting 5 lines one second apart, subscribes to
ExecutionLifecycleService exactly the way shell.ts's stream_output
path does, and logs each line as it arrives with a timestamp — so
reviewers can *see* the live line forwarding without needing to spin
up a full ACP client.

Run from repo root:
  npx tsx scripts/demo-stream-output.ts

Expected output is five 'live: line<N>' messages ~1s apart, followed
by the exit event. This validates the subscribe + LineBuffer + exit
cleanup path end-to-end on a real child process, complementing the
mocked unit tests and the disk-log-coexistence integration test.
The original implementation only set ToolResult.backgroundedStreamId on
the early-return path (shell.ts line ~625 "Command is running in
background" block). That path only fires when the resolved result
promise has NOT yet marked `completed = true` by the time the
BACKGROUND_DELAY_MS check runs.

In the real ShellExecutionService.background() call path (and the
default test mock), background() synchronously resolves the result
promise with `backgrounded: true`, so `completed` is already true at the
check and the tool falls through to `await resultPromise` → the
"Command moved to background (PID: N). Output hidden. Press Ctrl+B to
view." branch. That branch was NOT setting backgroundedStreamId, which
meant in production the ACP wrapper emitted the terminal `completed`
status before any streamed lines could land — the whole stream_output
feature silently did nothing.

Detected via an end-to-end test through Zed: the ACP wiretap showed
every backgrounded run emit a tool_call_update(completed) immediately,
with text "Command moved to background..." and no subsequent
tool_call_update(in_progress) carrying streamed stdout.

Fix: add a local `backgroundedStreamId` variable, set it to
`result.pid` inside the existing `(this.params.is_background ||
result.backgrounded)` branch when stream_output is true, and include it
in both final return statements (normal + summarize). Adds a regression
test for the fall-through path with the exact "Command moved to
background..." message.
node-pty emits AnsiOutput terminal-redraw snapshots rather than decoded
strings, so the stream_output LineBuffer — which only processes string
chunks — received nothing on Windows where enableInteractiveShell is
commonly set to true.

When both is_background and stream_output are true, explicitly pass
shouldUseNodePty: false to ShellExecutionService.execute() regardless of
the user's interactive-shell setting. Background processes cannot accept
interactive input anyway, and the child_process path delivers the same
stdout content as plain decoded strings — exactly what the
LineBuffer-driven forwarder expects. The user's
enableInteractiveShell setting is otherwise untouched.

Discovered via Zed end-to-end test: the wiretap showed
tool_call_update(in_progress) with backgroundedStreamId now firing (a
previous commit's win) but no per-line streamed updates arriving,
because every data event carried an AnsiOutput chunk that our listener
silently skipped.
ExecutionLifecycleService notifies our stream_output subscribe listener
(shell.ts teardownStream, synchronously) and the onBackgroundComplete
listener (this file, synchronously) from the same process-exit
notification. The former flushes a final tool_call_update(in_progress)
through updateOutput->sendUpdate (async); the latter was emitting
tool_call_update(status: completed) immediately — meaning the two async
sendUpdate calls race, and the terminal status could land before the
final streamed line.

Strict ACP clients may drop updates arriving after a terminal status, so
we now setImmediate() the terminal emit inside the completion listener.
This yields one macrotask, letting any pending stream flush propagate
first.

Verified via Zed end-to-end test with a bash command emitting 5 lines
one second apart: previously the 5th line landed ~1 ms after the
completed event; with this fix the 5 in_progress updates always precede
the terminal completed.
Addresses the four high-priority findings from the Gemini Code Assist
bot review on google-gemini#25825:

1. `LineBuffer` truncation split multi-unit Unicode characters
   (emoji, astral plane) because it used UTF-16 `.length` / `.slice`.
   Rename `maxLineBytes` → `maxLineCodePoints` and slice via
   `Array.from(...)`. Truncation now guarantees the emitted prefix is a
   valid Unicode string — no stray high-surrogates. Two new regression
   tests exercise the case for both the newline-present and buffer-
   overflow code paths.

2. The `updateOutput` closure in `acpClient.ts` used
   `void this.sendUpdate(...)` which silently swallows rejections.
   Replace with `.catch(err => debugLogger.error(...))` so transport
   / serialization errors surface and don't become unhandled-rejection
   process crashes.

3. Same fix for the terminal `sendUpdate` inside `emitTerminal` (the
   backgroundedStreamId path). Rejections here would previously have
   dropped a `completed` event unnoticed, leaving the tool call stuck
   `in_progress` client-side.

Ordering on the wire remains preserved by Node's stdout stream
serializing writes in insertion order; no additional promise-chain is
required. Sequencing of incremental-vs-terminal updates is already
guarded by the existing `setImmediate` in the completion listener.

All 21 LineBuffer tests + 67 shell tests + 65 acpClient tests pass.
…ndaries

Both shell.ts (the LineBuffer-driven forwarder) and acpClient.ts (the
terminal status emitter) used to tear down a stream_output stream as
soon as the spawning turn's abortSignal fired. That meant a new user
prompt — which aborts the previous turn — would prematurely close a
background stream whose process is still producing output the model
may want to react to in a future turn.

PR 2 removes those turn-abort teardown paths:
  - shell.ts no longer registers `signal.addEventListener('abort', …)`
    for stream_output mode. The subscribe listener tears down only on
    the process's own `exit` event.
  - acpClient.ts runTool no longer registers the abort handler that
    fired the terminal tool_call_update(completed). Terminal status
    now fires exclusively on onBackgroundComplete (real process exit).
    Session-close cleanup will land in a follow-up commit.

This is the minimum change to support the ConsultaSkill use case
(agent reacts to background file-watcher events in a later turn): the
sessionUpdate() notification API is already documented to work outside
an active prompt and Zed's AcpThread tolerates out-of-turn updates,
so no ACP protocol extensions are required.

Test "flushes pending stream_output lines and unsubscribes on abort"
— which locked in the PR 1 turn-end teardown — is replaced by
"stream_output listener outlives the spawning turn abort" which
verifies: lines emitted AFTER abort still propagate; only the 'exit'
event finally tears the listener down.
Extends ExecuteOptions.updateOutput with an optional second `_meta`
argument that the ACP layer forwards verbatim to the tool_call_update
`_meta` field. ACP spec explicitly reserves `_meta` for
client/agent-specific metadata (types.gen.d.ts: "MUST NOT make
assumptions about values at these keys" + vendor-prefix convention).

The shell tool's stream_output path now tags every streamed batch with
`{ 'gemini-cli/stream_output': true, pid: <number> }`. Sidecars
(e.g. the ConsultaSkill whisper-queue) can filter on this marker to
distinguish background-stream events from normal in-turn progress
updates (binary-detection notices, periodic throttled foreground
output), without parsing the text or the tool name.

Non-shell / non-stream-output updates continue to omit `_meta`
entirely — the field is strictly additive.
Replaces the "stream is tied to the current turn" bullet with the new
PR 2 behavior: streams outlive the spawning turn and keep emitting
in_progress tool_call_update events against the original toolCallId
until the process actually exits. Documents the
`_meta['gemini-cli/stream_output']` marker so sidecar authors know how
to filter.

Adds a new "Reacting automatically while the agent is idle" section
that explains the ACP protocol limit (agents cannot initiate turns)
and the recommended sidecar pattern (FIFO buffer + consolidated
re-prompt on turn end) for closing the ConsultaSkill-style end-to-end
loop without upstream protocol changes. Points at the existing
acp-wiretap.js as a starting point.
Adds the missed regression test for PR 2's core semantic: when a tool
returns backgroundedStreamId, the ACP wrapper must NOT emit a terminal
tool_call_update on the turn's abortSignal. The test makes a tool
return backgroundedStreamId, snapshots the updates emitted so far,
yields a macrotask, and asserts zero new updates arrived — in
particular, no status:'completed'. This locks in the "stream outlives
the spawning turn" contract against future regressions that might
restore the abortHandler wiring.

Missed in the feat/stream-output-async commit sequence because I moved
to the docs commit without git-adding this file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant