{{ message }}
fix(runtime): make codex Responses shim work against non-OpenAI backends#592
Merged
Conversation
The codex runtime bridges Codex (Responses API only) onto a chat backend (e.g. Volcengine Ark) via the in-process shim. Two incompatibilities made every turn fail with a generic "high demand" error: - Codex injects built-in tools (e.g. `web_search`) whose schema carries OpenAI-only fields like `external_web_access`. Ark's stricter Responses endpoint rejects the unknown field (BadRequest), which Codex retries and then surfaces as "high demand". Sanitize the inbound request to keep only standard `function` tools. - litellm's chat->Responses bridge can only emit a single degenerate `response.completed` event when streaming a chat backend; Codex's strict SSE parser rejects that. Call the backend non-streaming and synthesize the canonical Responses event sequence (response.created -> per-item output_item.added / text+reasoning deltas / output_item.done -> response.completed) ourselves. Verified end-to-end: Agent(runtime="codex") on Ark now returns a real answer.
…exit The shim's uvicorn server runs as a background task that is never stopped, so the event loop cancels it at process exit, and uvicorn logs a CancelledError traceback from its lifespan handler. The shim app has no startup/shutdown hooks, so disable the lifespan protocol (lifespan="off"). Cosmetic only.
…inal message A reasoning model (e.g. DeepSeek) sometimes ends a turn with only a reasoning item and no final agentMessage, leaving TurnResult.final_response empty — result_to_events then returned [] and the turn printed nothing. Surface the reasoning summary (with a short note) in that case so a turn is never silently empty. Verified across repeated runs: no more empty outputs.
zakahan
approved these changes
Jun 9, 2026
yaozheng-fang
added a commit
that referenced
this pull request
Jun 9, 2026
…idelity (#593) Follow-up to #592. Makes the codex runtime work for multi-step tool turns and forward the whole turn faithfully, by aligning the shim and the ADK mapping with the Codex Responses protocol and the genai/ADK event shapes. proxy.py (the Responses shim): - Stream `function_call` output items (output_item.added -> function_call_arguments.delta/.done -> output_item.done). Previously only message/reasoning were streamed, so a tool call was dropped and the turn ended at the model's preamble. - Backfill `status="completed"` on replayed assistant messages in `input`; Ark's Responses API requires it (MissingParameter: input.status) and Codex replays them without it on multi-step turns. translate.py (result -> ADK events): - Forward every Codex thread item in order instead of collapsing to final_response: reasoning -> thought text; commandExecution / mcpToolCall / dynamicToolCall / fileChange / webSearch -> function_call + function_response; agentMessage / plan / any text-bearing item -> text; userMessage skipped. - Coerce tool-call arguments to a dict and normalize status enums; fall back to final_response so a turn is never silently empty.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Problem
Agent(runtime="codex")bridged onto a non-OpenAI chat backend (e.g. Volcengine Ark) failed every turn with a genericRuntimeError: We're currently experiencing high demand. Two distinct incompatibilities in the Responses shim were the cause:functiontools, Codex sends aweb_searchtool whose schema carries OpenAI-only fields likeexternal_web_access. Ark's stricter Responses endpoint returnsBadRequest: unknown field "external_web_access"; Codex retries it and then surfaces the generic "high demand" message.response.completedevent when streaming a chat backend (noresponse.created, nooutput_text.delta). Codex's strict SSE parser rejects that and reports the same generic error.(Follow-up to #591, which fixed the
openai_codeximport.)Fix (
veadk/runtime/codex/proxy.py)type == "function"tools in the inbound request; dropweb_searchand other built-in OpenAI tool types the bridged backend doesn't understand.response.created→ per output item (output_item.added→output_text/reasoning_summarydeltas →output_item.done) →response.completed.Verification
Agent(runtime="codex")on Ark (deepseek-v4-flash, viapip install openai-codex) now returns a real answer end-to-end. ruff check + format pass.