iframe-proxy

jrob5756 · 2026-07-02T15:52:10Z

Summary

Fixes #265. The end-of-run cost summary silently summed only the priced subset of agents and presented it as the complete total, so spend on models without available pricing vanished with no signal. This addresses both parts of the issue.

Part 1 — surface unpriced agents (no more silent undercount)

WorkflowUsage gains unpriced_agents / unpriced_models / has_unpriced. total_cost_usd still returns the partial sum (backward-compatible).
Execution summary exposes unpriced_agent_count / unpriced_models.
CLI renders Total: ~$X (N agents unpriced: model-a, …) and Cost data unavailable (…) instead of a clean-looking wrong number.
Web dashboard tracks an unpricedCount and the StatusBar shows a ~$X (N unpriced) badge.

Part 2 — pricing as a provider hook (static table as fallback)

New AgentProvider.get_model_pricing(model) -> ModelPricing | None (base default None), mirroring the existing get_max_prompt_tokens precedent.
Resolution order in engine/pricing.py: workflow cost.pricing override → provider hook → DEFAULT_PRICING → unpriced.
Copilot derives USD from the SDK billing.token_prices (AI Credits per batch; documented 100 credits = $1 constant; guarded — never raises, falls back to the table). Claude / hermes inherit the None default.
WorkflowEngine._ensure_pricing_resolved bridges the async hook to the sync UsageTracker.record (called before all 5 record sites), resolving each model once and caching on the tracker.
Refreshed DEFAULT_PRICING with current Claude (Opus 4.7/4.8 +1m, Sonnet 5) and GPT-5.x (5.3/5.4/5.5, GPT-5 mini) models.

Concurrency fix (caught in review)

The first cut marked a model resolved before awaiting the hook, so parallel/for-each siblings sharing a model raced ahead and recorded as unpriced — reintroducing the exact undercount on the concurrent path. Fixed with a per-model asyncio.Lock and mark-resolved-after-cache. Regression test test_pricing_hook.py::test_concurrent_same_model_agents_all_priced was verified to fail on the old logic ([40.0, None, None, None]) and pass on the fix, with the hook still called exactly once.

Testing

uv run pytest -m "not real_api and not install_scripts" → 3729 passed, 13 skipped.
ruff check + ruff format --check clean; ty check src clean (one pre-existing unrelated dialog_evaluator.py diagnostic).
make validate-examples → all 30 examples validate.
Frontend rebuilt via make build-frontend.

New tests: unpriced surfacing (test_usage.py), pricing precedence (test_pricing.py), base hook default (test_base.py), Copilot derivation + guards (test_copilot.py), engine end-to-end + concurrency (test_pricing_hook.py), CLI rendering (test_logging.py).

🤖 Draft PR opened via git-workflow skill.

Cost summaries silently summed only the priced subset of agents and presented it as the complete total, so spend on models without available pricing vanished with no signal (#265). - Surface unpriced agents: WorkflowUsage gains unpriced_agents / unpriced_models / has_unpriced (total_cost_usd stays a partial sum). The CLI summary and dashboard status bar now render "~$X (N agents unpriced: ...)" instead of a clean-looking wrong number. - Add AgentProvider.get_model_pricing hook (base default None). Pricing resolves override -> provider hook -> DEFAULT_PRICING -> unpriced. Copilot derives USD from the SDK billing.token_prices (AI Credits, 100 credits = $1; guarded, never raises); other providers fall back to the table. Refresh DEFAULT_PRICING with current Claude/GPT-5.x models. - Bridge the async hook to the sync UsageTracker.record via WorkflowEngine._ensure_pricing_resolved (per-model asyncio.Lock so concurrent parallel/for-each siblings sharing a model are all priced). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Addresses PR review findings on the #265 provider pricing hook. - copilot: reject negative / NaN / inf token prices (a malformed SDK price would supersede the static table and emit a confident-wrong cost); a genuine 0.0 is still priced as free. Move model-dict building inside the try and validate via a non-raising TypeGuard so get_model_pricing truly honors its never-raises contract. Ignore billing.multiplier and the long_context tier (documented). - engine: surface a systemic pricing-hook failure once per run via a warning latch (mirrors _budget_unpriced_warned) so a broken live-pricing integration is diagnosable instead of silently degrading every model. - pricing: make get_pricing's overrides / provider_pricing keyword-only so the two same-typed dicts can't be transposed; fix the gpt-5-mini table comment and several docstrings to match actual behavior. - tests: add real-path parallel-group and for-each E2E coverage of the pricing wiring (previously only the single-agent path ran end-to-end); add negative/NaN/inf/zero-price, multiplier-ignored, malformed-list, and warning-latch cases; cover the CLI no-model-list and fully-priced branches. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

_is_finite_nonneg sat outside the SDK try/except, so a pathologically large int price (unconvertible to float, e.g. 10**400) would make math.isfinite raise OverflowError -- a latent hole in get_model_pricing's never-raises contract. Catch it in the guard so such a value is rejected (falls back to the static table) rather than raised. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The for-each E2E test passed for the wrong reason: finder shared _UNPRICED_MODEL, so it resolved (and cached) the price at its main-loop record site before the items ran, making the for-each site's resolution an untested no-op fast path. Give finder a table-priced model (gpt-4o) so the for-each site is the sole resolver of _UNPRICED_MODEL; dropping _ensure_pricing_resolved from _execute_for_each_group now fails the test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

jrob5756 and others added 4 commits July 2, 2026 12:59

jrob5756 force-pushed the fix/265-cost-summary-unpriced-pricing branch from 901dc75 to 3a2ca92 Compare July 2, 2026 17:03

jrob5756 marked this pull request as ready for review July 2, 2026 17:11

jrob5756 merged commit 3ecdfea into main Jul 2, 2026
9 checks passed

jrob5756 deleted the fix/265-cost-summary-unpriced-pricing branch July 2, 2026 17:12

Sunbelt Computer Software

PL/B Language Development and Support

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(cost): surface unpriced models and add provider pricing hook#279

fix(cost): surface unpriced models and add provider pricing hook#279
jrob5756 merged 4 commits into
mainfrom
fix/265-cost-summary-unpriced-pricing

jrob5756 commented Jul 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

jrob5756 commented Jul 2, 2026

Summary

Part 1 — surface unpriced agents (no more silent undercount)

Part 2 — pricing as a provider hook (static table as fallback)

Concurrency fix (caught in review)

Testing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant