fix(llm): include cache-read tokens in Anthropic total_tokens by he-yufeng · Pull Request #5053 · browser-use/browser-use · GitHub

fix(llm): include cache-read tokens in Anthropic total_tokens #5053

Open
he-yufeng wants to merge 2 commits into browser-use:main from he-yufeng:fix/bedrock-anthropic-total-tokens-cache

he-yufeng commented Jun 16, 2026

What

_get_usage computes prompt_tokens as input_tokens + cache_read_input_tokens (Anthropic reports cached prompt tokens separately, so they have to be added back), but total_tokens was left as just input_tokens + output_tokens. As soon as prompt caching kicks in, total_tokens is smaller than prompt_tokens + completion_tokens, which breaks anything downstream that assumes the totals add up (cost tracking, budget/limit checks).

Concrete example with a cached prompt: input_tokens=500, cache_read_input_tokens=10000, output_tokens=200 gives prompt_tokens=10500, completion_tokens=200, but total_tokens=700 instead of 10700.

This affects both ChatAnthropic (browser_use/llm/anthropic/chat.py) and ChatAnthropicBedrock (browser_use/llm/aws/chat_anthropic.py), which share the same usage logic.

Fix

Add the cache-read tokens to total_tokens so it mirrors the prompt_tokens formula and the invariant total_tokens == prompt_tokens + completion_tokens holds again. One line per file.
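To make the arithmetic concrete, here is a minimal sketch of the accounting before and after the change. It is not the actual `_get_usage` code: `UsageTotals`, `usage_totals`, and the `include_cache_in_total` flag are hypothetical names used only for illustration, and treating a missing `cache_read_input_tokens` as 0 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageTotals:
    """Hypothetical container for the three counters discussed in this PR."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def usage_totals(
    input_tokens: int,
    output_tokens: int,
    cache_read_input_tokens: Optional[int] = None,
    *,
    include_cache_in_total: bool = True,
) -> UsageTotals:
    # Assumption: a missing cache-read count is treated as zero.
    cached = cache_read_input_tokens or 0
    # Cached prompt tokens are reported separately, so add them back.
    prompt_tokens = input_tokens + cached
    # Old behaviour: total ignores cached tokens. New behaviour: total includes them.
    total_tokens = input_tokens + output_tokens
    if include_cache_in_total:
        total_tokens += cached
    return UsageTotals(prompt_tokens, output_tokens, total_tokens)


old = usage_totals(500, 200, 10000, include_cache_in_total=False)
new = usage_totals(500, 200, 10000)
print(old.total_tokens)  # 700, smaller than prompt_tokens + completion_tokens
print(new.total_tokens)  # 10700, equal to prompt_tokens + completion_tokens
```

With the flag off, the sketch reproduces the mismatch from the example above; with it on, `total_tokens == prompt_tokens + completion_tokens` holds for both the cached and the no-cache case.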

Verifying

Added tests/ci/models/test_anthropic_usage.py covering both clients: the cached case (asserts the totals add up) and the no-cache case (asserts the number is unchanged).

uv run pytest tests/ci/models/test_anthropic_usage.py -q

The cached-case assertions fail on current main (700 != 10700) and pass with this change; the no-cache case stays at 700 both ways. ruff check and ruff format are clean on the touched files.
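For readers without the repo checked out, the shape of the two cases can be sketched roughly as follows. This is an illustration, not the contents of tests/ci/models/test_anthropic_usage.py: `get_usage` is a hypothetical stand-in for the client's usage logic with the fix applied, and `make_response` is an assumed helper that fakes only the `usage` attributes the calculation reads.

```python
from types import SimpleNamespace
from typing import Optional


def get_usage(response) -> dict:
    """Hypothetical stand-in for the fixed usage logic (not the real method)."""
    usage = response.usage
    # Assumption: a missing or None cache-read count is treated as zero.
    cached = getattr(usage, "cache_read_input_tokens", None) or 0
    prompt_tokens = usage.input_tokens + cached
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": usage.output_tokens,
        "total_tokens": prompt_tokens + usage.output_tokens,
    }


def make_response(input_tokens: int, output_tokens: int, cache_read: Optional[int] = None):
    # Fake response object exposing only the usage fields used above.
    return SimpleNamespace(
        usage=SimpleNamespace(
            input_tokens=input_tokens,
            output_tokens=output_tokens,
            cache_read_input_tokens=cache_read,
        )
    )


def test_cached_totals_add_up():
    usage = get_usage(make_response(500, 200, cache_read=10000))
    assert usage["prompt_tokens"] == 10500
    assert usage["total_tokens"] == 10700
    assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]


def test_no_cache_total_unchanged():
    usage = get_usage(make_response(500, 200))
    assert usage["total_tokens"] == 700
```

The second test guards against the fix changing totals for requests that never hit the cache.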

Note: #4294 proposed the same one-liner for chat.py but went stale before it landed, and it never touched the Bedrock client. This covers both and adds a regression test.


Summary by cubic

Fixes Anthropic usage accounting by adding cache-read prompt tokens to total_tokens, restoring total_tokens == prompt_tokens + completion_tokens. Applies to both Anthropic and Bedrock clients.

  • Bug Fixes
    • Include cache_read_input_tokens in total_tokens in browser_use/llm/anthropic/chat.py and browser_use/llm/aws/chat_anthropic.py.
    • Add tests/ci/models/test_anthropic_usage.py for cached/no-cache cases and type the mock response helper for Pyright.

Written for commit 42351e4. Summary will update on new commits.

cubic-dev-ai (bot) left a comment

No issues found across 3 files

he-yufeng force-pushed the fix/bedrock-anthropic-total-tokens-cache branch from c044127 to 42351e4 on June 20, 2026 22:12