Cost Per Accepted Change

For the engineer doing the setup

Instrumentation guide

How to actually wire your stack to produce cost per accepted change every window without rebuilding observability from scratch. Organized by numerator component, with a minimum-viable pipeline at the end.

1. Model cost

The single most leveraged instrumentation choice. Without per-team or per-commit attribution, you can only allocate aggregate model spend by headcount or guesswork.

Minimum viable

Production

Run an LLM proxy in front of every provider call. The proxy adds: per-request logging, custom tags, retries, budgets, and cache analytics — the data you need to debug a moving model-cost component.

Whichever you pick, ensure each request carries tags for: team, project or repo, actor (user or agent), and purpose (e.g., code-gen, review, chat). Without tags, the proxy gives you nicer aggregate data but the same attribution problem.
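To illustrate why the tags matter, here is a minimal sketch that rolls a per-request log up into cost per team and purpose. The tab-separated layout (team, repo, actor, purpose, cost in USD), the function names, and the sample rows are all assumptions for illustration; a real proxy will have its own export format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum request cost by team and purpose.
# Input (stdin): tab-separated rows of team, repo, actor, purpose, cost_usd.
cost_by_team_and_purpose() {
  awk -F'\t' '
    NF == 5 { total[$1 "\t" $4] += $5 }   # key on team + purpose
    END { for (k in total) printf "%s\t%.2f\n", k, total[k] }
  ' | sort
}

# Made-up sample rows standing in for a proxy log export.
sample_log() {
  printf 'payments\tapi\tagent\tcode-gen\t1.25\n'
  printf 'payments\tapi\talice\treview\t0.40\n'
  printf 'payments\tweb\tagent\tcode-gen\t2.10\n'
  printf 'search\tindexer\tbob\tchat\t0.15\n'
}

sample_log | cost_by_team_and_purpose
```

Without the team and purpose columns, the same log could only produce one aggregate total, which is the attribution problem described above.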

Per-commit attribution

If you want model cost attributed to specific accepted changes, you need to know which commits an LLM touched. git-ai is the maturing tool here — it links AI-written lines to the agent, model, and transcripts that generated them via a git extension. Combined with provider pricing, this lets you compute the model cost per accepted change unit rather than spread total spend evenly.

2. Infrastructure cost

The compute, storage, observability, CI/CD, and agent-runner overhead attributable to producing changes.

Minimum viable

Production

3. Engineering time

The time team members spend specifying, prompting, integrating, and steering AI work, converted to currency at a loaded hourly rate.

Minimum viable

Use a blended fully-loaded rate × planned capacity. Most organizations already have these numbers for capacity planning:

This overestimates active delivery time slightly, which is the right direction — it absorbs meetings, interruptions, and the real cost of context-switching.
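As a worked example of the blended-rate calculation, the sketch below multiplies headcount, weeks, planned hours per week, and the loaded rate. The function name is hypothetical; the sample figures (10 engineers, 4 weeks, 32 planned hours, $150 per hour) are the ones used in the monthly pipeline script later in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: engineers × weeks × planned hours/week × loaded hourly rate.
eng_cost() {
  local engineers=$1 weeks=$2 hours_per_week=$3 hourly_rate=$4
  echo $(( engineers * weeks * hours_per_week * hourly_rate ))
}

# 10 engineers, 4-week window, 32 planned hours per week, $150/hour.
eng_cost 10 4 32 150   # prints 192000
```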

Production

4. Review cost

Time spent reviewing and gating AI-generated work, converted to currency.

Minimum viable

Sample a representative week. Ask reviewers to track time spent on PR reviews for one week, then multiply by 4 (or by however many weeks are in your window). Adjust for known seasonality and convert to currency at the team's blended hourly rate.
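The extrapolation can be sketched as below. The function name is hypothetical, and modelling seasonality as a single multiplier (1 meaning no adjustment) is an assumption of this sketch, not something the method prescribes; the sample hours are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sampled review hours × weeks × seasonality factor × hourly rate.
# The fourth argument is optional and defaults to 1 (no seasonal adjustment).
review_cost() {
  local sampled_hours=$1 weeks=$2 hourly_rate=$3 seasonality=${4:-1}
  awk -v h="$sampled_hours" -v w="$weeks" -v r="$hourly_rate" -v s="$seasonality" \
    'BEGIN { printf "%.2f\n", h * w * s * r }'
}

review_cost 10 4 150       # 10 sampled hours, no adjustment: prints 6000.00
review_cost 10 4 150 0.9   # scaled down 10% for a heavier-than-usual sample week: prints 5400.00
```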

Production

5. Rework cost

The trickiest component to instrument, and the one most teams ignore — which is exactly why catching it matters.

Mining reverts from git

Three signals to capture, ordered from most reliable to most subjective:

(a) Explicit git revert commits. Built-in syntax; commit messages are prefixed Revert "...".

# All revert commits in a window
git log --grep='^Revert "' \
  --since=2026-04-01 --until=2026-04-30 \
  --pretty=format:'%h %s'

# Or via gh, including the original PR
gh pr list --state merged \
  --search 'merged:2026-04-01..2026-04-30 "reverts #" in:body' \
  --json number,title,body,additions,deletions

(b) PRs labeled as fixes or hotfixes. Requires team discipline on labels, but very cheap once in place:

gh pr list --state merged \
  --search 'merged:2026-04-01..2026-04-30 label:fix,hotfix,bug' \
  --json number,title,additions,deletions

(c) Conventional Commits. If your team uses Conventional Commits, you get free typing of every commit (fix:, revert:, feat:). Parse the merged commit messages:

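# Note: --merges only helps if your merge commits carry the PR title.
# With squash merges, or with default "Merge pull request ..." titles,
# use --no-merges instead. The optional "!" covers breaking-change markers (fix!:).
git log --since=2026-04-01 --until=2026-04-30 --merges \
  --pretty=format:'%s' | grep -E '^(fix|revert)(\(.+\))?!?: '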
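Before trusting the count, it helps to check what the filter does and does not match. The sketch below runs a pattern close to the one above (scope limited to one parenthesised group, optional `!` marker accepted) against a handful of made-up commit subjects; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only subjects typed as fix or revert.
rework_subjects() {
  # grep exits 1 when nothing matches; that is not an error for this filter.
  grep -E '^(fix|revert)(\([^)]+\))?!?: ' || true
}

# Made-up subjects: three should pass, three should be dropped.
printf '%s\n' \
  'fix(auth): handle expired refresh token' \
  'feat: add export button' \
  'revert: feat: add export button' \
  'fix!: drop legacy config key' \
  'fixup typo in README' \
  'Merge pull request #12 from org/branch' \
  | rework_subjects
```

Note that untyped subjects such as `fixup typo in README` are dropped, so this signal is only as complete as the team's commit discipline.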

For each identified revert or fix, capture the hours spent. The cheapest approach is a manual hour estimate per ticket, assigned during the quarterly review that covers the window. The most rigorous is to attach a time-spent field to each fix ticket and roll it up automatically.

Mining fix tickets from your issue tracker

If your team uses Jira, Linear, or GitHub Issues, fix tickets are usually well-typed:

Tying fixes back to the original change

For rigorous attribution, link each fix back to the PR or commit that introduced the defect:

Tied fixes let you compute the more rigorous "stayed there" check: each merged PR is examined N days after it merged; if a tied fix was merged within those N days (the survival window), the original is excluded from the denominator and the fix's cost lands in the numerator.
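The denominator half of that check can be sketched as follows. The two input files, their space-separated layout, and the function name are assumptions for illustration; how you produce the tied-fix list depends on your tracker. Dates are converted to Julian day numbers inside awk so the sketch does not rely on GNU-only `date` flags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PR numbers that survived the window.
#   $1 = survival window in days
#   $2 = merges file, one "<pr> <merged YYYY-MM-DD>" per line
#   $3 = tied-fixes file, one "<original pr> <fix merged YYYY-MM-DD>" per line
surviving_prs() {
  awk -v n="$1" -v merges="$2" '
    # Julian day number, so date differences are plain subtraction.
    function jdn(date,   p, y, m, d, a) {
      split(date, p, "-"); y = p[1] + 0; m = p[2] + 0; d = p[3] + 0
      a = int((14 - m) / 12); y = y + 4800 - a; m = m + 12 * a - 3
      return d + int((153 * m + 2) / 5) + 365 * y + int(y / 4) - int(y / 100) + int(y / 400) - 32045
    }
    FILENAME != merges { fixes[$1] = fixes[$1] " " $2; next }   # collect fix dates per original PR
    {
      survived = 1
      k = split(fixes[$1], dates, " ")
      for (i = 1; i <= k; i++) {
        gap = jdn(dates[i]) - jdn($2)
        if (gap >= 0 && gap <= n) survived = 0   # a tied fix landed inside the window
      }
      if (survived) print $1
    }
  ' "$3" "$2"
}

# Made-up sample data: PR 102 is fixed after 10 days, PR 103 after 48 days.
workdir=$(mktemp -d)
printf '%s\n' '101 2026-04-03' '102 2026-04-10' '103 2026-04-28' > "$workdir/merges.txt"
printf '%s\n' '102 2026-04-20' '103 2026-06-15' > "$workdir/fixes.txt"
surviving_prs 30 "$workdir/merges.txt" "$workdir/fixes.txt"   # prints 101 and 103
```

This covers only the exclusion from the denominator; the cost of each excluded PR's fix still has to be added to the rework numerator separately.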

The minimum-viable approach

If you have nothing today, start by pulling the revert commits and the bug-labeled PRs in the window, eyeballing each, and assigning a rough hour estimate per fix. A team of 10–30 engineers typically has 5–25 such items in a four-week window; an hour of triage produces a defensible rework-cost number.

6. The denominator — accepted change units

The other half of the metric. Pull merged PRs, apply the 500-LOC normalization, filter by the survival window.

The recipe

# Step 1: list merged PRs in the window
gh pr list --state merged \
  --search 'merged:2026-04-01..2026-04-30' \
  --json number,additions,deletions,mergedAt,title --limit 1000 \
  > merges.json

# Step 2: for each PR, check it hasn't been reverted or fix-followed
#   within the 30-day survival window. Filter merges.json to surviving PRs.

# Step 3: normalize via the 500-LOC rule
jq '[.[] | (.additions + .deletions) | select(. > 0) | (. / 500) | ceil] | add' merges.json

The full recipe lives in the FAQ; the calculator library exports normalizeChanges() for the LOC step.
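For readers who want to see the 500-LOC rule in isolation, the sketch below mirrors the jq expression in step 3 using plain bash arithmetic. The function name is hypothetical and this is not the calculator library's normalizeChanges(); the line counts are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum ceil(loc / 500) over per-PR changed-line counts.
accepted_units() {
  local unit_size=500 total=0 loc
  for loc in "$@"; do
    (( loc > 0 )) || continue                                # empty PRs contribute nothing
    total=$(( total + (loc + unit_size - 1) / unit_size ))   # integer ceiling division
  done
  echo "$total"
}

# 120 -> 1 unit, 500 -> 1, 501 -> 2, 1400 -> 3, 0 -> skipped
accepted_units 120 500 501 1400 0   # prints 7
```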

Putting it together: the monthly pipeline

A minimum-viable monthly script. Cron it for the first of the month, fetching the prior month's data:

#!/usr/bin/env bash
# Run on the 1st of every month, computing the prior month's CPAC.
# Adjust dates and team key for your setup.
set -euo pipefail

WINDOW_START="2026-04-01"
WINDOW_END="2026-04-30"
TEAM_KEY="my-team"
HOURLY_RATE=150

# 1. Model cost: pull from your provider's billing API for $TEAM_KEY
#    (jq -r strips JSON quoting so the value can be used in arithmetic)
MODEL_COST=$(curl -s ".../usage?key=$TEAM_KEY&start=$WINDOW_START&end=$WINDOW_END" | jq -r '.cost_usd')

# 2. Infra cost: pull from cloud billing tagged with team=$TEAM_KEY
#    (Cost Explorer returns Amount as a JSON string, hence -r)
INFRA_COST=$(aws ce get-cost-and-usage ... | jq -r '.ResultsByTime[0].Total.AmortizedCost.Amount')

# 3. Engineering time: blended rate × planned capacity
ENG_HOURS=1280   # 10 engineers × 4 weeks × 32h
ENG_COST=$((ENG_HOURS * HOURLY_RATE))

# 4. Review cost: sample week × 4
REVIEW_HOURS=40   # window total, extrapolated from the sampled week
REVIEW_COST=$((REVIEW_HOURS * HOURLY_RATE))

# 5. Rework cost: fix and revert PRs × rough estimate
#    (--state merged is required because gh pr list defaults to open PRs;
#     --limit lifts the default cap of 30 results)
REWORK_HOURS=$(gh pr list --state merged --limit 1000 \
  --search "merged:$WINDOW_START..$WINDOW_END label:fix,hotfix,revert" \
  --json number | jq 'length * 2')   # rough: 2h per fix on average
REWORK_COST=$((REWORK_HOURS * HOURLY_RATE))

# 6. Accepted change units: gh + jq + ceil ("// 0" covers an empty window).
#    No survival filter is applied here; filter first, as in section 6 step 2,
#    for the stricter count.
UNITS=$(gh pr list --state merged --search "merged:$WINDOW_START..$WINDOW_END" \
  --json additions,deletions --limit 1000 \
  | jq '[.[] | (.additions + .deletions) | select(. > 0) | (. / 500) | ceil] | add // 0')

# 7. Compute and report. Billing amounts are usually decimals, which bash
#    integer arithmetic cannot handle, so the sum and the division go through awk.
TOTAL=$(awk -v m="$MODEL_COST" -v i="$INFRA_COST" -v e="$ENG_COST" \
  -v r="$REVIEW_COST" -v w="$REWORK_COST" 'BEGIN { printf "%.2f", m + i + e + r + w }')
echo "Window: $WINDOW_START to $WINDOW_END"
echo "Model:    \$$MODEL_COST"
echo "Infra:    \$$INFRA_COST"
echo "Eng:      \$$ENG_COST"
echo "Review:   \$$REVIEW_COST"
echo "Rework:   \$$REWORK_COST"
echo "Total:    \$$TOTAL"
echo "Units:    $UNITS"
if [ "$UNITS" -gt 0 ]; then
  CPAC=$(awk -v t="$TOTAL" -v u="$UNITS" 'BEGIN { printf "%.2f", t / u }')
  echo "CPAC:     \$$CPAC"
else
  echo "CPAC:     n/a (no accepted change units in the window)"
fi

Append the output as a row in the tracker spreadsheet and you have a defensible monthly time series. Refine each step over time as the metric proves its value.
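If the tracker is a plain CSV file rather than a hosted spreadsheet, appending the row can be sketched as below. The function name, the column names, and the sample figures are assumptions for illustration, not part of the pipeline script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append one window's result to a CSV tracker,
# writing the header first if the file is missing or empty.
append_cpac_row() {
  local file=$1 start=$2 end=$3 total=$4 units=$5 cpac=$6
  [ -s "$file" ] || echo "window_start,window_end,total_usd,units,cpac_usd" > "$file"
  printf '%s,%s,%s,%s,%s\n' "$start" "$end" "$total" "$units" "$cpac" >> "$file"
}

tracker=$(mktemp)   # stand-in for your real tracker file
append_cpac_row "$tracker" 2026-04-01 2026-04-30 250000.00 180 1388.89   # made-up figures
cat "$tracker"
```

Keeping one row per window in a fixed column order makes the file easy to chart or import later.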

Honest caveats

For the broader operational guidance (who runs the measurement, how often, what to report), see the quick-start playbook. For where to push back on common critiques of the metric, see the measurement comparison page.


Found a better tool or a sharper script for any of these components? Open an issue at the repo. The most useful updates to this page come from teams sharing what they built.