feat: add plan and computer-use subagent model overrides by ibetitsmike · Pull Request #24679 · coder/coder · GitHub
Skip to content

feat: add plan and computer-use subagent model overrides#24679

Draft
ibetitsmike wants to merge 2 commits intomike/subagent-config-t9gffrom
mike/subagent-config-t9gf-plan-stack-build
Draft

feat: add plan and computer-use subagent model overrides#24679
ibetitsmike wants to merge 2 commits intomike/subagent-config-t9gffrom
mike/subagent-config-t9gf-plan-stack-build

Conversation

@ibetitsmike
Copy link
Copy Markdown
Collaborator

Stacks on #24610.

What changed

  • add plan_subagent and computer_use override contexts to the chat config API and site_configs
  • apply the plan subagent override only when delegated child chats continue running in plan mode
  • add computer-use compatibility reporting, child metadata handling, and runtime fallback to the Anthropic default when the saved override cannot drive the computer tool
  • expand the Agents settings UI and tests for the new override contexts

Validation

  • make gen
  • go test ./coderd -run 'TestChatModelOverrides'
  • go test ./coderd/x/chatd -run 'TestSpawnAgent_(GeneralFromPlanParentUsesPlanSubagentOverride|GeneralFromPlanParentFallsBackWhenPlanSubagentOverrideUnset|ExploreFromPlanParentUsesExploreOverrideAndClearsPlanMode|PlanSubagentOverrideLogsAndFallsBackWhenCredentialsUnavailable|ComputerUseUsesConfiguredOverrideWhenCompatible|ComputerUseLeavesChildMetadataOnParentModelWhenOverrideIncompatible|ComputerUseSubagentToolsAndModel)'
  • pnpm -C site lint:types
  • pnpm -C site test:storybook -- AgentSettingsAgentsPageView.stories.tsx
  • make lint
  • make pre-commit

Mux is acting on Mike's behalf.

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good layer completeness: every new override context (plan_subagent, computer_use) is propagated through all necessary layers (SQL, dbauthz, dbmetrics, dbmock, querier, SDK, API handlers, subagent catalog, subagent resolution, frontend UI/stories). Test line ratio is above 1:1 against non-generated production lines. The plan subagent test matrix (override set, unset fallback, explore clears plan mode, credentials unavailable) is well-decomposed with clear negative assertions.

1 P1, 1 P2, 7 P3, 2 Nit.

The P1 (DEREM-3) is the direct cause of all three CI test failures. The P2 (DEREM-4) is a behavioral change for existing deployments that should be either a deliberate choice with updated UI descriptions or a fallback chain.

"Shall I show you? ♥" (Hisoka, on the spawn-vs-runtime dual resolution divergence)

🤖 This review was automatically generated with Coder Agents.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 [DEREM-3] Four new querier methods (GetChatPlanSubagentModelOverride, UpsertChatPlanSubagentModelOverride, GetChatComputerUseModelOverride, UpsertChatComputerUseModelOverride) have dbauthz.go implementations but no corresponding test entries in dbauthz_test.go. TestMethodTestSuite.TearDownSuite enforces that every querier method is covered by at least one dbauthz RBAC assertion test.

"All six appear in the TestMethodTestSuite/Accounting failure output" (Kite)

Two pre-existing methods (GetChatGeneralModelOverride, UpsertChatGeneralModelOverride) are also missing, likely from the stacked PR #24610. This is the direct cause of the three CI failures (test-go-pg, test-go-pg-17, test-go-race-pg). Add s.Run(...) entries in TestChats() following the existing GetChatExploreModelOverride / UpsertChatExploreModelOverride pattern at lines 893 and 1208, with .Asserts(rbac.ResourceDeploymentConfig, policy.ActionRead) for gets and .Asserts(rbac.ResourceDeploymentConfig, policy.ActionUpdate) for upserts. (Kite P1, Razor Nit)

🤖

id: subagentTypeGeneral,
description: "delegated work that may inspect or modify workspace files",
buildOptions: func(ctx context.Context, p *Server, parent database.Chat, _ uuid.UUID, _ string) (childSubagentChatOptions, error) {
overrideContext := codersdk.ChatAgentModelOverrideContextGeneral
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 [DEREM-4] When a plan-mode parent spawns a general child and plan_subagent is unset (the default for every existing deployment), the child inherits the parent's user-selected model. The general override is never consulted.

Before this PR, buildOptions always resolved ChatAgentModelOverrideContextGeneral. After, when the parent is in plan mode, the override context switches to ChatAgentModelOverrideContextPlanSubagent. If unset, resolveSubagentModelConfigID returns uuid.Nil, no override is applied, and the child falls back to the parent model.

"An admin who configured the general override to enforce a cheaper or approved model across all delegated work will see plan-mode children quietly bypass it" (Mafuuu)

Concrete scenario: admin sets general override to claude-sonnet-4 to control costs. User creates a plan-mode chat selecting gpt-4o. User spawns a general subagent. Pre-PR: child uses claude-sonnet-4 (general override). Post-PR: child uses gpt-4o (parent model). No log, no UI hint, no documentation.

Two viable fixes:

  1. Fall back to general override when plan_subagent is unset: if modelConfigID == uuid.Nil && overrideContext == ChatAgentModelOverrideContextPlanSubagent { modelConfigID, err = p.resolveSubagentModelConfigID(ctx, parent.OwnerID, ChatAgentModelOverrideContextGeneral) }
  2. If the no-fallback behavior is intentional, update the plan subagent description to say "When unset, plan-mode subagents use the parent chat's model (the general override does not apply)" and narrow the general model description to exclude plan-mode children.

(Mafuuu P2, Pariston P2, Knov P2, Meruem P2, Luffy P2, Kite P3, Razor P4)

🤖

Comment on lines +322 to +328
providerName, _, err := chatprovider.ResolveModelWithProviderHint(
modelConfig.Model,
modelConfig.Provider,
)
if err != nil {
p.logger.Info(ctx,
"computer use model override metadata is invalid, ignoring",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-5] resolveComputerUseOverrideModelConfig calls ResolveModelWithProviderHint a second time on a model config whose provider was already resolved and validated inside resolveConfiguredModelOverride (via resolveModelConfigAndNormalizedProvider -> validateModelConfigAndResolveProvider). The second call cannot fail because it uses the same immutable struct fields that already passed the identical check. The error handler at lines 326-337 ("computer use model override metadata is invalid, ignoring") is dead code.

"The root cause is structural: resolveConfiguredModelOverride discards useful information (the resolved provider name)" (Meruem)

Fix: widen resolveConfiguredModelOverride to return the provider name alongside the model config, then resolveComputerUseOverrideModelConfig can check isComputerUseProviderCompatible(providerName) directly without re-resolving, and the dead error path is removed. (Hisoka P3, Mafuuu P3, Pariston P3, Chopper P3, Knov P3, Meruem P3, Zoro P3)

🤖

Comment thread coderd/exp_chats.go
Comment on lines +614 to +622
if xerrors.Is(err, sql.ErrNoRows) {
return chatAgentModelOverrideEffectiveness{}, nil
}
return chatAgentModelOverrideEffectiveness{}, xerrors.Errorf(
"get chat model config: %w",
err,
)
}
if !modelConfig.Enabled {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-6] getChatAgentModelOverrideEffectiveness returns empty (isEffective = nil) when the saved model config is deleted (line 614), disabled (line 622), or has invalid metadata (line 630). The frontend renders a warning only when is_effective === false. So an admin who saves a computer use override for a model that later gets disabled sees a saved config ID, no malformed flag, no effectiveness warning, while every runtime invocation silently falls back to the Anthropic default.

"The isEffective field establishes a contract: 'I will tell you whether your override works for this context.' It checks provider compatibility but misses model availability. An incomplete promise is worse than no promise." (Hisoka)

Return isEffective: ptr.Ref(false) with an appropriate ignoredReason for the disabled-model and deleted-model cases. The disabled-model case is the most likely to occur in practice (admin disables a model after setting the override). (Hisoka P3, Ryosuke P3, Razor P3, Chopper P3, Bisky P4)

🤖

Comment thread coderd/exp_chats.go
ignoredReason string
}

func chatEnabledProvidersContain(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-1] chatEnabledProvidersContain is a byte-for-byte duplicate of the existing enabledProviderContainsName in coderd/x/chatd/subagent.go:165. Both normalize provider names with chatprovider.NormalizeProvider and do an identical linear scan. The existing function is unexported in chatd, so coderd cannot call it directly, but both packages already import chatprovider. Extract to a shared chatprovider.EnabledProvidersContain(providers []database.ChatProvider, name string) bool to centralize the normalization contract. Ging-Go also notes this could use slices.ContainsFunc (stdlib since 1.21). (Netero, Robin P3, Zoro P3, Gon P3, Ryosuke P3, Razor P3)

🤖

require.Equal(t, computerUseOverride.ID, childChat.LastModelConfigID)
}

func TestSpawnAgent_ComputerUseLeavesChildMetadataOnParentModelWhenOverrideIncompatible(t *testing.T) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-8] TestSpawnAgent_ComputerUseLeavesChildMetadataOnParentModelWhenOverrideIncompatible uses newInternalTestServer (default logger) and cannot assert log output. The analogous TestSpawnAgent_PlanSubagentOverrideLogsAndFallsBackWhenCredentialsUnavailable (line 808) uses newInternalTestServerWithLogger with a subagentTestLogSink and asserts exactly one "model override credentials are unavailable, ignoring" entry. The production code at subagent.go:339 emits "computer use model override is incompatible, ignoring" when the provider compatibility check fails, but nothing verifies that log fires. Wire a subagentTestLogSink and assert the "incompatible" log message. (Bisky P3)

🤖

Comment thread coderd/exp_chats.go
overrideContext codersdk.ChatAgentModelOverrideContext,
modelConfigID *uuid.UUID,
) (chatAgentModelOverrideEffectiveness, error) {
if overrideContext != codersdk.ChatAgentModelOverrideContextComputerUse ||
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-9] The admin effectiveness check at getChatAgentModelOverrideEffectiveness evaluates provider compatibility using GetEnabledChatProviders (deployment-level enabled state). The runtime resolver at chatd.go:6301 evaluates compatibility using pre-resolved providerKeys (user-level credential availability). These two paths can disagree: a provider may be listed as enabled at the deployment level but have no credentials available at runtime (e.g., admin removed the central API key but the provider config entry remains). The admin API reports is_effective: true while the runtime silently falls back to the Anthropic default. The is_effective field is supposed to give the admin confidence their override works; a false positive undermines that signal. (Hisoka P2, Chopper P2)

🤖

@@ -1507,17 +1715,28 @@ func TestSpawnAgent_ComputerUseUsesComputerUseModelNotParent(t *testing.T) {
ctx := chatdTestContext(t)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 [DEREM-10] TestSpawnAgent_ComputerUseUsesConfiguredOverrideWhenCompatible sets an override and checks childChat.LastModelConfigID. It never runs the child chat. TestComputerUseSubagentToolsAndModel in chatd_test.go runs the full spawn-to-LLM flow and asserts the Anthropic API receives the expected model name, but it uses the hardcoded default, not the override. Given the dual resolution architecture (spawn-time DB write vs runtime fresh resolve), the gap between "the DB says this model" and "the LLM received this model" is exactly where a bug would hide, and neither test proves the override flow works end-to-end. (Hisoka P2, Chopper Note)

🤖

Comment on lines +88 to +93
description="Deployment-wide model override for delegated subagents with write capabilities, such as editing files or running commands in the workspace."
level="section"
/>
<SubagentModelOverrideSettings
title="General model"
description={generalModelDescription}
description="Deployment-wide model override for delegated subagents with write capabilities, such as editing files or running commands in the workspace."
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit [DEREM-11] Each section description is duplicated as an inline string literal between SectionHeader and SubagentModelOverrideSettings (lines 88/93, 113/118, 167/172). The SubagentModelOverrideSettings copy is dead code when showHeader={false}. Extract each description into a const (as the pre-existing code did with generalModelDescription) or stop passing description to SubagentModelOverrideSettings when showHeader is false. (Nami, Robin, Zoro, Gon, Leorio)

🤖

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit [DEREM-2] The for loop at line 342 correctly checks all 4 sections display the malformed warning, but only plan_subagent and computer_use get the click-save-verify cycle. General and explore malformed sections are not save-tested. The story name claims "Clearable and Saveable" for all sections but only demonstrates it for two. (Netero, Chopper Nit)

🤖

@ibetitsmike ibetitsmike force-pushed the mike/subagent-config-t9gf branch from a9f807e to 5f1d7aa Compare April 23, 2026 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant