
yaozheng-fang · 2026-06-09T05:54:33Z

Problem

Agent(runtime="codex") bridged onto a non-OpenAI chat backend (e.g. Volcengine Ark) failed every turn with a generic RuntimeError: We're currently experiencing high demand. Two distinct incompatibilities in the Responses shim were the cause:

1. Codex injects built-in tools Ark can't parse. Alongside the standard function tools, Codex sends a `web_search` tool whose schema carries OpenAI-only fields like `external_web_access`. Ark's stricter Responses endpoint returns `BadRequest: unknown field "external_web_access"`; Codex retries the request and then surfaces the generic "high demand" message.
2. Degenerate streaming. litellm's chat→Responses bridge can only emit a single `response.completed` event when streaming a chat backend (no `response.created`, no `output_text.delta`). Codex's strict SSE parser rejects that and reports the same generic error.

(Follow-up to #591, which fixed the openai_codex import.)

Fix (`veadk/runtime/codex/proxy.py`)

1. Sanitize tools: keep only `type == "function"` tools in the inbound request; drop `web_search` and other built-in OpenAI tool types the bridged backend doesn't understand.
2. Synthesize the stream: call the backend non-streaming, then expand the completed result into the canonical Responses event sequence Codex expects: `response.created` → per output item (`output_item.added` → output_text/reasoning_summary deltas → `output_item.done`) → `response.completed`.
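
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `veadk/runtime/codex/proxy.py`: the function names `sanitize_tools`, `synthesize_events`, and `to_sse` are hypothetical, the request and response are assumed to be plain dicts, and the event payloads are simplified (real Responses events carry more fields, such as item ids and sequence numbers).

```python
import json
from typing import Any, Iterator


def sanitize_tools(request: dict[str, Any]) -> dict[str, Any]:
    """Return a shallow copy of the request that keeps only function tools."""
    cleaned = dict(request)
    tools = request.get("tools")
    if tools is not None:
        # Built-in tool types (e.g. web_search) are dropped entirely.
        cleaned["tools"] = [t for t in tools if t.get("type") == "function"]
    return cleaned


def synthesize_events(
    response: dict[str, Any],
) -> Iterator[tuple[str, dict[str, Any]]]:
    """Expand a completed, non-streamed response into ordered stream events."""
    # Announce the response first, with no output attached yet.
    started = {**response, "status": "in_progress", "output": []}
    yield "response.created", {"type": "response.created", "response": started}

    for index, item in enumerate(response.get("output", [])):
        yield "response.output_item.added", {
            "type": "response.output_item.added",
            "output_index": index,
            "item": item,
        }
        if item.get("type") == "message":
            # One delta per text part; the whole text is sent in one piece.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    yield "response.output_text.delta", {
                        "type": "response.output_text.delta",
                        "output_index": index,
                        "delta": part.get("text", ""),
                    }
        elif item.get("type") == "reasoning":
            for part in item.get("summary", []):
                yield "response.reasoning_summary_text.delta", {
                    "type": "response.reasoning_summary_text.delta",
                    "output_index": index,
                    "delta": part.get("text", ""),
                }
        yield "response.output_item.done", {
            "type": "response.output_item.done",
            "output_index": index,
            "item": item,
        }

    yield "response.completed", {"type": "response.completed", "response": response}


def to_sse(event: str, data: dict[str, Any]) -> str:
    """Serialize one event as a server-sent-events frame."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"
```

Because the backend is called without streaming, the whole answer arrives as a single delta per text part. That gives up incremental output in exchange for an event order the strict parser accepts.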

Verification

agent = Agent(name="Xiaoming", ..., runtime="codex")
runner = Runner(agent=agent, short_term_memory=ShortTermMemory())
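await runner.run("你叫什么")  # prompt, in English: "What is your name?"
# -> 我叫 **Xiaoming**（小明）！有什么需要帮忙的吗？😊
# (in English: "My name is **Xiaoming** (小明)! Is there anything I can help you with? 😊")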

Agent(runtime="codex") on Ark (deepseek-v4-flash, via pip install openai-codex) now returns a real answer end-to-end. ruff check + format pass.

The codex runtime bridges Codex (Responses API only) onto a chat backend (e.g. Volcengine Ark) via the in-process shim. Two incompatibilities made every turn fail with a generic "high demand" error:

- Codex injects built-in tools (e.g. `web_search`) whose schema carries OpenAI-only fields like `external_web_access`. Ark's stricter Responses endpoint rejects the unknown field (BadRequest), which Codex retries and then surfaces as "high demand". Sanitize the inbound request to keep only standard `function` tools.
- litellm's chat->Responses bridge can only emit a single degenerate `response.completed` event when streaming a chat backend; Codex's strict SSE parser rejects that. Call the backend non-streaming and synthesize the canonical Responses event sequence (response.created -> per-item output_item.added / text+reasoning deltas / output_item.done -> response.completed) ourselves.

Verified end-to-end: Agent(runtime="codex") on Ark now returns a real answer.

…exit

The shim's uvicorn server runs as a background task that is never stopped, so the event loop cancels it at process exit, and uvicorn logs a CancelledError traceback from its lifespan handler. The shim app has no startup/shutdown hooks, so disable the lifespan protocol (lifespan="off"). Cosmetic only.

…inal message

A reasoning model (e.g. DeepSeek) sometimes ends a turn with only a reasoning item and no final agentMessage, leaving TurnResult.final_response empty. In that case result_to_events returned [] and the turn printed nothing. Surface the reasoning summary (with a short note) in that case so a turn is never silently empty. Verified across repeated runs: no more empty outputs.
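
The fallback described in this commit can be sketched as below. This is only an illustration under stated assumptions: `ThreadItem`, the `items` field, the `NOTE` wording, and the helper `visible_text` are hypothetical stand-ins; only the `final_response` attribute name comes from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadItem:
    """Hypothetical stand-in for one item of a Codex turn."""
    type: str
    text: str = ""


@dataclass
class TurnResult:
    """Hypothetical stand-in; only `final_response` is named in the source."""
    final_response: str = ""
    items: list[ThreadItem] = field(default_factory=list)


# Assumed wording for the short note; the real text may differ.
NOTE = "(The model returned reasoning only, without a final message.)"


def visible_text(result: TurnResult) -> str:
    """Return the final answer, or the reasoning summary when there is none."""
    if (result.final_response or "").strip():
        return result.final_response
    # No final message: collect non-empty reasoning text in original order.
    reasoning = [
        item.text
        for item in result.items
        if item.type == "reasoning" and item.text.strip()
    ]
    if reasoning:
        return NOTE + "\n" + "\n".join(reasoning)
    return ""
```

The note is prepended so a reader can tell a surfaced reasoning summary apart from a regular answer.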

…idelity (#593)

Follow-up to #592. Makes the codex runtime work for multi-step tool turns and forward the whole turn faithfully, by aligning the shim and the ADK mapping with the Codex Responses protocol and the genai/ADK event shapes.

proxy.py (the Responses shim):

- Stream `function_call` output items (output_item.added -> function_call_arguments.delta/.done -> output_item.done). Previously only message/reasoning were streamed, so a tool call was dropped and the turn ended at the model's preamble.
- Backfill `status="completed"` on replayed assistant messages in `input`; Ark's Responses API requires it (MissingParameter: input.status) and Codex replays them without it on multi-step turns.

translate.py (result -> ADK events):

- Forward every Codex thread item in order instead of collapsing to final_response: reasoning -> thought text; commandExecution / mcpToolCall / dynamicToolCall / fileChange / webSearch -> function_call + function_response; agentMessage / plan / any text-bearing item -> text; userMessage skipped.
- Coerce tool-call arguments to a dict and normalize status enums; fall back to final_response so a turn is never silently empty.
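
Two of the smaller fixes in this commit, the status backfill and the argument coercion, might look roughly like this. It is a sketch under assumptions: `backfill_status` and `coerce_arguments` are hypothetical names, input items are assumed to be plain dicts, and wrapping non-dict values under a `"value"` key is an illustrative choice rather than what translate.py necessarily does.

```python
import copy
import json
from typing import Any


def backfill_status(input_items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Add status="completed" to replayed assistant messages that lack it."""
    patched = copy.deepcopy(input_items)  # leave the caller's list untouched
    for item in patched:
        # Assumption: an item without an explicit type is a plain message.
        is_message = item.get("type", "message") == "message"
        if is_message and item.get("role") == "assistant":
            item.setdefault("status", "completed")
    return patched


def coerce_arguments(raw: Any) -> dict[str, Any]:
    """Turn tool-call arguments of any shape into a dict."""
    if isinstance(raw, dict):
        return raw
    if raw is None or raw == "":
        return {}
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            # Not JSON: keep the raw string rather than dropping it.
            return {"value": raw}
        return parsed if isinstance(parsed, dict) else {"value": parsed}
    return {"value": raw}
```

`setdefault` leaves any status the item already carries in place, so only the missing field is filled in.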

yaozheng-fang added 3 commits June 9, 2026 13:54

zakahan approved these changes Jun 9, 2026


yaozheng-fang merged commit cfac404 into main Jun 9, 2026
16 checks passed

yaozheng-fang mentioned this pull request Jun 9, 2026

fix(runtime): codex tool loop, full-trajectory events, and protocol fidelity #593

Merged


fix(runtime): make codex Responses shim work against non-OpenAI backends #592

yaozheng-fang merged 3 commits into main from fix/codex-shim-ark-compat
