A definition

Cost per accepted change

The fully-loaded cost of producing software that reaches production and stays there, divided by the number of changes that do.

The clause that matters

An accepted change is one that reached production and stayed there for a defined survival window: by default, 30 days post-merge. A change that is reverted, repaired, or feature-flagged off within that window is not counted in the denominator; the cost incurred to produce it is counted in the numerator. (Teams may use 14 days for faster cadence or 60–90 days for higher-confidence acceptance; see the FAQ.)
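The survival-window test can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Change` record and its field names are hypothetical, merge time stands in for the start of the window, and two choices are assumptions not stated in the definition (a change whose window is still open is not yet counted, and a failure exactly at the window boundary is treated as outside the window).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Change:
    """One merged change. Field names are hypothetical."""
    merged_at: datetime
    # Earliest revert, repair, or feature-flag-off event, if there was one.
    first_failure_at: Optional[datetime] = None


def is_accepted(change: Change, now: datetime, window_days: int = 30) -> bool:
    """Return True if the change survived the full window after merge."""
    window_end = change.merged_at + timedelta(days=window_days)
    if now < window_end:
        # Assumption: a change whose window is still open is not counted yet.
        return False
    failed_in_window = (
        change.first_failure_at is not None
        and change.first_failure_at < window_end
    )
    return not failed_in_window


now = datetime(2026, 3, 1)
survived = Change(merged_at=datetime(2026, 1, 10))
reverted = Change(merged_at=datetime(2026, 1, 10),
                  first_failure_at=datetime(2026, 1, 13))
print(is_accepted(survived, now))  # True
print(is_accepted(reverted, now))  # False
```

Passing `window_days=14` or `window_days=60` applies the alternative windows mentioned above without changing the logic.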

Why this metric

AI did not make software delivery faster. It made code generation faster. The distance between the two — between what organizations expect from AI and what they actually deliver — is the delivery gap.

Token cost, pull-request count, and "AI code share" are activity metrics. They count what was produced, not what was kept. Cost per accepted change is the outcome metric: the number to report to the board.

It is FinOps cost-to-serve moved one layer upstream — the cost to produce delivered software, not the cost to run it.

The formula, expanded

The numerator is a fully-loaded production cost over a measurement window (typically a sprint, a month, or a quarter):

Component	What it captures
Model cost	LLM, API, and inference spend attributable to the changes produced.
Infrastructure cost	Compute, storage, observability, and tooling overhead for the production loop.
Engineering time	Time spent specifying, prompting, integrating, and steering AI work — converted to currency.
Review cost	Time spent reviewing, validating, and gating AI-generated work — converted to currency.
Rework cost	Cost of fixing, reverting, or repairing changes that failed the "stayed there" test.

The denominator is the count of accepted change units: changes that reached production and stayed there during the same window, size-normalized so one unit represents a comparable amount of substantive work. A PR of 1–500 lines counts as 1 unit; a larger PR of N lines counts as ⌈ N / 500 ⌉ units. See the FAQ for the rule in detail.
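As a sketch of the size-normalization rule, the function below maps a PR's line count to accepted change units using integer ceiling division. The function name and the decision to reject PRs with fewer than one changed line are assumptions made for illustration; the 500-line unit size comes from the rule above.

```python
UNIT_SIZE_LINES = 500  # unit size from the normalization rule


def change_units(lines_changed: int, unit_size: int = UNIT_SIZE_LINES) -> int:
    """Units for one accepted PR: ceil(lines_changed / unit_size)."""
    if lines_changed < 1:
        # Assumption: the rule starts at 1 line, so smaller inputs are rejected.
        raise ValueError("lines_changed must be at least 1")
    # Integer ceiling division avoids floating-point rounding.
    return (lines_changed + unit_size - 1) // unit_size


for lines in (120, 500, 501, 1_800):
    print(lines, change_units(lines))
# 120 1
# 500 1
# 501 2
# 1800 4
```

The denominator for a window is then the sum of `change_units` over the PRs that passed the survival test.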

A worked example

A 10-engineer team measures a four-week window. Loaded engineering time is $150/hour.

Total cost	$28,000
Model cost (LLM API spend)	$1,200
Infrastructure cost	$400
Engineering time (120 hours × $150)	$18,000
Review cost (40 hours × $150)	$6,000
Rework cost (16 hours × $150)	$2,400
Merged PRs that stayed in production	38
After 500-LOC normalization (one PR was 1,800 lines = 4 units; three were 600–1,000 lines = 2 units each; the other 34 were 1 unit each)	44 accepted change units
Cost per accepted change	$636.36

That number, reported alongside the volume of activity, is what reveals whether AI is paying for itself. A team shipping 200 PRs but accepting only 44 units has a very different result than a team shipping 50 PRs and accepting 44 units, and the difference is invisible to traditional velocity metrics.
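The worked example can be recomputed from its stated inputs with a short script. This is a sketch under assumptions: the individual line counts are hypothetical stand-ins chosen to match the description (800 lines for each PR in the 600–1,000 range, 200 lines for each PR at or under 500), and the variable names are illustrative.

```python
HOURLY_RATE = 150  # loaded engineering rate from the example, USD per hour

# Numerator: the five cost components from the table, in USD.
costs = {
    "model": 1_200,
    "infrastructure": 400,
    "engineering_time": 120 * HOURLY_RATE,
    "review": 40 * HOURLY_RATE,
    "rework": 16 * HOURLY_RATE,
}

# Denominator inputs: 38 surviving PRs. Line counts are hypothetical
# placeholders that match the size bands described in the example.
pr_lines = [200] * 34 + [800] * 3 + [1_800]


def change_units(lines_changed: int, unit_size: int = 500) -> int:
    """Ceiling of lines_changed / unit_size, per the normalization rule."""
    return (lines_changed + unit_size - 1) // unit_size


total_cost = sum(costs.values())
accepted_units = sum(change_units(n) for n in pr_lines)
cost_per_accepted_change = total_cost / accepted_units

print(f"PRs: {len(pr_lines)}")
print(f"Total cost: ${total_cost:,}")
print(f"Accepted change units: {accepted_units}")
print(f"Cost per accepted change: ${cost_per_accepted_change:,.2f}")
```

Keeping the components in a dictionary makes it easy to report the breakdown alongside the headline figure, which is where shifts between model, review, and rework cost become visible.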


What this metric is designed to catch

Reviewer load shifted onto your most expensive engineers. If senior engineers are absorbing the verification cost of AI generation, cost per accepted change sees it. PR count does not.
Silent rework. A change that is "merged" but quietly fixed three days later does not stay in production. It is excluded from the denominator and the fix is counted in the numerator.
Optimization theater. Cutting model cost while increasing rework cost is not progress. Cost per accepted change is the bottom line that catches the trade-off.
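To make the optimization-theater case concrete, the sketch below compares two measurement windows. Every figure is hypothetical and chosen only to illustrate the trade-off: model spend is halved in the second window, but rework doubles and fewer units survive.

```python
def cost_per_accepted_change(costs: dict[str, float], accepted_units: int) -> float:
    """Fully-loaded cost divided by accepted change units."""
    if accepted_units <= 0:
        raise ValueError("accepted_units must be positive")
    return sum(costs.values()) / accepted_units


# Hypothetical window 1: baseline spend (USD).
before = {"model": 1_200, "infrastructure": 400,
          "engineering_time": 18_000, "review": 6_000, "rework": 2_400}
# Hypothetical window 2: cheaper model, more rework (USD).
after = {"model": 600, "infrastructure": 400,
         "engineering_time": 18_000, "review": 6_000, "rework": 4_800}

before_cpac = cost_per_accepted_change(before, accepted_units=40)
after_cpac = cost_per_accepted_change(after, accepted_units=36)

print(f"Model cost change: {after['model'] - before['model']:+,}")
print(f"Before: ${before_cpac:,.2f} per accepted change")
print(f"After:  ${after_cpac:,.2f} per accepted change")
```

In this made-up scenario the model line item improves while the cost per accepted change gets worse, which is the pattern the metric is meant to surface.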

What it is not

Not a vendor benchmark. It measures your team's full production cost, not the unit cost of an LLM provider.
Not a productivity score for individuals. It is an organizational outcome metric.
Not a substitute for safety, security, or correctness gates. It pairs with them; it does not replace them.

Origin

Cost is the third vertex of the Verification Triangle. Intent clarity and eval quality describe whether the team is building the right thing well; cost describes what it took to get there.

Cost per accepted change was defined in The Delivery Gap (Brenn Hill, 2026) as the cost vertex of that framework. See how to cite, or subscribe at brennhill.substack.com.

This page is the canonical definition. The metric is free to use, adopt, and cite. Refinements and worked examples are welcome via GitHub.