<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Automotive’s AI Problem Isn’t Speed. It’s Proof. — Clay Nelson</title>
<meta name="description" content="The industry's AI conversation is pinned to the wrong axis. Code generation throughput isn't the bottleneck — continuous attestation is. Moving the trust boundary from the developer's laptop to the governed platform is what turns agentic work from a productivity story into a compliance story.">
<link rel="canonical" href="https://medium.com/@claynelson/automotives-ai-problem-isn-t-speed-it-s-proof-15a1d3cc9cee">
<link rel="stylesheet" href="/site.css">
</head>
<body>
<div class="container">
<a href="/" class="back-link">← Clay Nelson</a>
<article>
<header class="essay-header">
<div class="eyebrow">Essay · Medium · April 2026</div>
<h1 class="essay-page-title">Automotive’s AI Problem Isn’t Speed. It’s Proof.</h1>
<p class="deck">Moving the trust boundary from the developer’s laptop to the governed platform is what turns agentic work from a productivity story into a compliance story.</p>
</header>
<div class="essay-body">
<p>Automotive teams know how to write the code. What they can’t do — at the scale the software-defined vehicle now demands — is continuously demonstrate that they did. That is the proof problem, and it is the real bottleneck.</p>
<p>Verification and validation (V&V) already eats more than forty percent of automotive software R&D spend. That number is sometimes read as inefficiency. It isn’t; it’s the cost of proof. And proof is exactly the work that AI has, until now, made harder rather than easier: productivity tools running on individual machines add code faster than any review board can attest to it.</p>
<p>The proof problem has a simple diagnostic question: <em>where does your AI run?</em></p>
<h2>The proof problem</h2>
<p>The standards that govern this work, chiefly ISO 26262 with its ASIL classifications (up to ASIL-D) and process models such as Automotive SPICE (ASPICE), do not actually require that your software be perfect. They require that you be able to show your work. Every requirement traced to a test. Every change linked to a rationale. Every tool in the chain qualified. Every reviewer identified. For a team building an infotainment stack, this is administrative friction. For a team building steering, braking, or ADAS, it is the entire job.</p>
<p>Here’s what changed when LLMs arrived in the IDE. A developer on a Tuesday afternoon can now generate, refactor, and commit more code in that afternoon than their predecessors could in a week. The code is often good. Sometimes it is excellent. None of it is attested. The machine that produced it is a personal endpoint. The prompt history is in a SaaS product the CISO has never seen. The tools called during generation ran on hardware outside the quality management envelope. For an infotainment team, fine. For ASIL-D, the output may as well not exist.</p>
<p>This is not a hypothetical objection. It is the conversation that stops every serious automotive customer’s AI strategy cold, and it is the conversation German OEM safety engineers will walk directly into within ninety seconds of any technical discussion.</p>
<h2>The finishing problem</h2>
<p>Software work has a shape, and it isn’t uniform. The first cut of a feature is fast; the last cut — integration, verification, traceability, sign-off — is slow. In most industries that long tail is overhead you manage. In safety-critical automotive, it <em>is</em> the deliverable. A line of C that runs correctly on a developer’s laptop is worth essentially nothing until it’s been traced, tested, qualified, reviewed, and attested.</p>
<p>This is why editor-resident AI has, perversely, made things worse rather than better. It attacks the part of the curve that was already fast — authoring — and leaves the expensive part untouched. At scale, it doesn’t just fail to help. It adds load to the verification pipeline, because more generated code means more ungoverned, un-attested artifacts flowing into the bottleneck.</p>
<figure>
<img src="/assets/diagrams/finishing-curve.png" alt="Two horizontal bars comparing traditional software work to work with editor-resident AI. Both show a short authoring segment followed by a longer finishing segment covering integration, verification, traceability, and sign-off. AI compresses only the authoring segment, leaving the finishing segment unchanged. Caption: Productivity tools compressed the wrong part of the curve.">
</figure>
<p>Call it the finishing problem. Productivity tools compressed the wrong part of the curve. The forty percent of R&D spend going to V&V doesn’t shrink when authoring gets faster — it grows. That’s the real math behind the frustration: the dashboards look better, the release gates don’t move, and the senior safety engineer is still reviewing the same trace matrix for the fourth time.</p>
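<p>An Amdahl-style calculation gives a rough sense of why compressing authoring barely moves the release date. The sketch below is illustrative only: the 20/80 split between authoring and finishing and the 5x authoring speedup are hypothetical numbers, not figures from any program, and <code>overall_speedup</code> is a name invented for this example. It also holds finishing effort constant, which is generous; if more generated code adds verification load, the real result is worse.</p>

```python
def overall_speedup(authoring_share: float, authoring_speedup: float) -> float:
    """Amdahl-style estimate: only the authoring share of the cycle gets faster."""
    if not (1.0 >= authoring_share >= 0.0):
        raise ValueError("authoring_share must be between 0 and 1")
    if not authoring_speedup > 0:
        raise ValueError("authoring_speedup must be positive")
    finishing_share = 1.0 - authoring_share  # integration, verification, sign-off
    return 1.0 / (finishing_share + authoring_share / authoring_speedup)


# Hypothetical split: 20% authoring, 80% finishing; authoring becomes 5x faster.
speedup = overall_speedup(authoring_share=0.2, authoring_speedup=5.0)
print(f"end-to-end speedup: {speedup:.2f}x")  # prints "end-to-end speedup: 1.19x"
```

<p>Under these assumed numbers, a fivefold gain in authoring yields roughly a 19 percent gain end to end, because the finishing segment still dominates the cycle.</p>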
<h2>The axis shift</h2>
<p>Clayton Christensen’s observation in <em>The Innovator’s Dilemma</em> was that incumbents don’t lose by failing to innovate. They lose because the axis of competition quietly shifts beneath them, and by the time the market has moved, the architecture of the incumbent’s business is wrong for the game being played.</p>
<p>Automotive AI is sitting on exactly this kind of shift. The axis everyone is optimizing on is code generation throughput: how fast can the assistant write, refactor, complete. The axis the market actually cares about — and is quietly moving to — is attestation throughput: how fast, and how defensibly, can you prove the code was produced under a regime you control.</p>
<p>On the first axis, every editor-resident assistant looks similar. On the second, they are not comparable at all. A tool that runs inside the developer’s editor, calling external services with keys managed by the user, produces artifacts whose lineage cannot be reconstructed by the organization. A platform-resident agent — one whose execution happens inside the same governed environment where the code, the reviews, the policies, and the identities already live — produces artifacts that are attestable by construction.</p>
<p>This is not a marketing distinction. It is a property of where the trust boundary sits.</p>
<h2>The trust boundary</h2>
<p>The trust boundary is the line between the systems your organization can attest to and the systems it cannot. In a traditional SDLC, that line sat at the repository: what’s in the repo is yours, what’s on the laptop is the developer’s. That model worked because the laptop was, in effect, a keyboard. It produced a few kilobytes of committed text a day, reviewable by a human.</p>
<p>Agentic AI changes the math. A single agent session can produce thousands of lines, call dozens of external services, read and write files across a project, and invoke tools whose outputs feed back into the generated code. If any of that happens outside the governed platform, none of it can be attested to. You cannot sign what you did not observe.</p>
<blockquote>You cannot attest to what you did not observe. Moving the trust boundary outward — from each developer’s machine to the governed platform — is what turns agentic work from a productivity story into a compliance story.</blockquote>
<figure>
<img src="/assets/diagrams/trust-boundary.png" alt="Architectural diagram showing two zones separated by a trust boundary. Left zone contains ungoverned tools outside the boundary — developer workstation, editor-resident AI, third-party APIs. Right zone contains the governed platform — repository, agents, policy engine, identity provider, audit log, reviews — inside the boundary.">
</figure>
<p>“You cannot attest to what you did not observe” is not a slogan. It is an architectural requirement. It tells you exactly what to look for when evaluating an AI strategy for a regulated context: does the execution happen where your identity provider can see it, where your policy engine can gate it, where your audit log can capture it? If the answer is no, if the boundary is still at the laptop, you don’t have a compliance story. You have a productivity anecdote, and when the program review asks for the evidence chain, the release slips a quarter.</p>
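<p>Those three questions (identity, policy, audit) can be made concrete with a small sketch. Everything here is an assumption made for illustration: the <code>POLICY</code> table, the agent names, and the <code>AuditLog</code> class are hypothetical, and a real platform would add cryptographic signing, durable storage, and a proper identity provider. What it shows is the shape of the idea: every action is gated on policy, every decision is recorded, and the record is hash-chained so that later edits are detectable.</p>

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Append-only log; each entry commits to the hash of the previous one."""
    entries: list = field(default_factory=list)

    def append(self, actor: str, action: str, allowed: bool) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"actor": actor, "action": action, "allowed": allowed, "prev": prev_hash}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        entry = {**body, "hash": digest}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; an edited or reordered entry breaks the chain."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical policy: which agent identities may invoke which tools.
POLICY = {
    "agent:misra-checker": {"static_analysis"},
    "agent:test-writer": {"test_generation"},
}


def run_governed(log: AuditLog, actor: str, action: str) -> bool:
    """Gate the action on policy and record the decision either way."""
    allowed = action in POLICY.get(actor, set())
    log.append(actor, action, allowed)
    return allowed


log = AuditLog()
run_governed(log, "agent:misra-checker", "static_analysis")  # permitted
run_governed(log, "agent:misra-checker", "test_generation")  # denied, still logged
print(log.verify())                # True: chain is intact
log.entries[0]["allowed"] = False  # simulate tampering with history
print(log.verify())                # False: the stored hash no longer matches
```

<p>Denied actions are logged as well as permitted ones, since a refusal is also evidence. Work done on a laptop outside this loop would leave no entry at all, which is the gap the essay is describing.</p>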
<h2>Platform as product</h2>
<p>Matthew Skelton and Manuel Pais, in <em>Team Topologies</em>, describe a failure mode that is almost universal in large automotive software organizations: the specialist team becomes a handoff queue. Safety engineering, security review, MISRA compliance, tool qualification — these exist as functional silos that stream-aligned teams must wait on. The cognitive load of navigating them is enormous. The time-to-market consequence is brutal.</p>
<p>Their prescription is to convert those silos into platforms — more precisely, to treat compliance expertise as a capability consumed via a platform, not a queue. The compliance review becomes an inline check. The qualification becomes an automated attestation. The specialist’s knowledge is embedded into tooling the stream-aligned team can consume without a handoff.</p>
<p>Agentic compliance is this pattern, applied. A policy-aware agent that runs MISRA or CERT checks inline, attributes findings to requirements, and produces a reviewable artifact is a compliance specialist converted into a service. The review board does not disappear. It stops being a bottleneck. It starts being an API.</p>
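<p>A minimal sketch of what “a compliance specialist converted into a service” might look like follows. It is deliberately a toy: <code>RULE-NO-GOTO</code> is a stand-in for a real MISRA or CERT rule, the requirement ID and file name are hypothetical, and a regular expression is no substitute for a qualified static analyzer. The point is only the data flow: a check runs inline, each finding is attributed to the requirement that mandates it, and the result is a machine-readable artifact a reviewer or release gate can consume.</p>

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    rule_id: str
    requirement_id: str
    line: int
    text: str


# Toy stand-in for a real coding-standard rule; a qualified analyzer would do this work.
RULES = {"RULE-NO-GOTO": re.compile(r"\bgoto\b")}

# Hypothetical mapping from each rule to the requirement that mandates it.
RULE_TO_REQUIREMENT = {"RULE-NO-GOTO": "REQ-CODING-007"}


def check_source(source: str) -> list[Finding]:
    """Run every rule over every line and attribute each hit to a requirement."""
    findings = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        for rule_id, pattern in RULES.items():
            if pattern.search(text):
                findings.append(
                    Finding(rule_id, RULE_TO_REQUIREMENT[rule_id], line_no, text.strip())
                )
    return findings


def review_artifact(path: str, source: str) -> str:
    """Serialize findings as JSON so a reviewer or downstream gate can consume them."""
    findings = check_source(source)
    return json.dumps(
        {"file": path, "passed": not findings, "findings": [asdict(f) for f in findings]},
        indent=2,
    )


SAMPLE = """int apply(int cmd) {
    if (cmd == 0) goto fail;
    return cmd;
fail:
    return -1;
}"""

print(review_artifact("brake_ctrl.c", SAMPLE))
```

<p>Run against the sample, this reports a single finding on line 2, linked to the hypothetical requirement <code>REQ-CODING-007</code>. Because the output is structured rather than a comment in a review thread, the same artifact can feed a merge gate and an audit trail.</p>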
<h2>Authority to the edge, evidence at the center</h2>
<p>Stanley McChrystal’s account in <em>Team of Teams</em> describes the reorganization that made distributed decision-making work in a fast-moving operational environment: push authority to the edge so decisions happen at the speed of the situation; hold evidence at the center so the whole organization can see what’s happening in real time.</p>
<p>This is the architecture the trust boundary enables. Developers — and the agents working alongside them — retain authority to make local decisions at speed. The platform, sitting at the center, captures every decision, every tool invocation, every artifact, as evidence. The organization gets the speed of edge autonomy and the accountability of central visibility at the same time.</p>
<p>Without the trust boundary in place, you get one or the other. Tight central control kills speed. Loose edge autonomy kills provability. The trust boundary is what resolves the tension.</p>
<h2>What this looks like in practice</h2>
<p>The architecture that emerges from this reframe has a shape. Requirements live in a queryable ontology: not as Word documents, but as graph-aware artifacts an agent can reason over. Agents orchestrate work against that ontology, invoking specialty tools for specific checks: static analysis, coding standard conformance, test generation, SBOM production, traceability reporting. Execution happens on infrastructure the enterprise already governs: its identity plane, its policy plane, its logging plane. A compliance rail spanning all three planes captures the evidence continuously, in a form auditors can read without a month of forensic reconstruction.</p>
<p>This isn’t hypothetical. On an Arm functional safety project running today, safety documentation is authored as Sphinx-Needs directives — structured, machine-readable requirements, test cases, and evidence items that live in the repository alongside the source code. UseBlocks’ ubCode extension exposes that documentation as a live traceability graph through an MCP server, and a custom Copilot agent built by InfoMagnus queries it in real time.</p>
<p>The sharpest moment: ask the agent whether a specific test case adequately covers its linked requirement. A structural checker — the kind most compliance tools run today — confirms the link exists and stops there. This agent reads the requirement, decomposes it into testable elements, reads the test case, and reports back that the test covers the obvious rejection condition but never exercises the configurable tolerance boundary — which is the whole point of the requirement. For an ASIL-rated function, that isn’t a note. It’s a gap the structural audit would have missed and the field would have found. None of that work happened on the developer’s laptop. All of it happened inside the trust boundary — captured, attributable, attested.</p>
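<p>The difference between the structural check and the deeper one can be sketched in a few lines. This is a simplified model, not the agent described above: there, the decomposition of a requirement into testable elements comes from reading the requirement and test text, whereas here the elements are hand-tagged, and the IDs and element names are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    req_id: str
    elements: frozenset  # testable elements the requirement decomposes into


@dataclass(frozen=True)
class VerificationCase:
    case_id: str
    covers: str           # ID of the linked requirement
    exercised: frozenset  # elements this test actually exercises


def structural_check(req: Requirement, cases: list) -> bool:
    """What a link checker sees: does any test case point at the requirement?"""
    return any(c.covers == req.req_id for c in cases)


def uncovered_elements(req: Requirement, cases: list) -> set:
    """Element-level view: which parts of the requirement does no linked test exercise?"""
    exercised = set()
    for c in cases:
        if c.covers == req.req_id:
            exercised |= c.exercised
    return set(req.elements) - exercised


# Hypothetical requirement and test case, loosely modeled on the case described above.
req = Requirement("REQ_042", frozenset({"reject_out_of_range", "tolerance_boundary"}))
cases = [VerificationCase("TC_042_1", "REQ_042", frozenset({"reject_out_of_range"}))]

print(structural_check(req, cases))    # True: the link exists
print(uncovered_elements(req, cases))  # {'tolerance_boundary'}
```

<p>The structural check passes while the element-level check surfaces the untested tolerance boundary. The hard part, which this sketch assumes away, is producing the element sets from prose requirements and test code.</p>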
<p>None of the components are new. What is new is the assembly — and more specifically, that the assembly sits inside the trust boundary rather than crossing it.</p>
<h2>What it does not do</h2>
<p>This reframe does not replace the safety engineer. It does not obviate ISO 26262. It does not mean the review board was wrong. It means the review board finally has a system that gives them evidence in the form they need, at the cadence the product demands.</p>
<p>It also does not pick a winner in the broader AI market. Editor-resident tools have their place — in exploratory work, in non-regulated code paths, in hobby projects. The argument is narrower and more specific: in safety-critical automotive development, the trust boundary determines whether your AI strategy is a compliance story or a liability story. Everything downstream follows from that choice.</p>
<h2>The unburdening</h2>
<p>The word I’ve landed on for this is <em>unburden</em>. Not “accelerate,” not “transform” — unburden. The burden is real. It is the forty percent of R&D spend going to V&V. It is the quarters of delay between feature-complete and release-candidate. It is the senior safety engineer reviewing the same trace matrix for the fourth time because the toolchain cannot produce it cleanly.</p>
<p>That burden doesn’t go away because we wrote faster code. It goes away because we built a system that produces evidence as a natural byproduct of work — and that is only possible when the work happens inside a trust boundary the organization controls.</p>
<p>The software-defined vehicle is a real thing. So is the proof problem. The teams that solve the second one first will ship the first one faster than anyone currently thinks possible.</p>
<hr>
<p><em>A fuller technical walkthrough — the end-to-end loop from code change through autonomous remediation on a real automotive safety project — will follow as a separate piece.</em></p>
</div>
<footer class="essay-footer">
<p>Originally published on <a href="https://medium.com/@claynelson/automotives-ai-problem-isn-t-speed-it-s-proof-15a1d3cc9cee">Medium</a>, April 2026.</p>
<p>Adapted from the keynote “Unburden: AI and the Future of Automotive Verification,” delivered at GitHub Shift: Automotive, Frankfurt, April 21, 2026.</p>
</footer>
</article>
</div>
<script src="/nav.js" defer></script>
</body>
</html>
