RESEARCH · OPERATING MODEL

How to build an LLM CFO function.

20 June 2026

By the LLM CFO team

Someone has to own the AI bill — not just watch it. Owning it means setting budgets, attributing spend to teams, approving the model changes that move cost, and reconciling the dashboard against the invoice. This is the blueprint for standing that function up, whether it ends up being a person, a virtual team, or an outside service.

What an "LLM CFO function" actually is

It is the standing responsibility for the economics of your LLM usage: a named owner, a recurring cadence, and a small set of metrics that decisions are made against. It is the AI-spend analogue of what cloud FinOps became for infrastructure, except that token pricing, cache discounts, and reasoning costs make the unit economics stranger, so the function needs its own playbook rather than a reused cloud one. We lay out the discipline itself in AI FinOps; this page is about turning that discipline into an org function.

A dashboard is not a function

Most teams start with visibility: a cost dashboard wired up from gateway logs. Visibility is necessary but not sufficient. A dashboard shows the number; a function changes it. The gap is ownership: who is accountable when spend per request creeps up 30% after a model swap, who signs off on routing changes, who tells a team its feature is now the most expensive surface in the product. Without that owner, the dashboard becomes wallpaper everyone has stopped looking at.

The four responsibilities

Responsibility | What it means in practice
Attribute | Every dollar of spend maps to a team, product surface, environment, and model. Below ~90% attribution coverage, every other number is suspect.
Optimize | Own the levers (routing, caching, batch) and the decision to pull them, with quality regression checks attached.
Govern | Set budgets and guardrails, approve model and prompt changes that move cost, and define escalation policy.
Reconcile | Compare derived spend (token counts × price table) against the provider invoice monthly, with a documented tolerance.
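
Two of these responsibilities, Attribute and Reconcile, reduce to simple arithmetic over usage records. The sketch below is a minimal illustration, not a reference implementation: the price table, model names, record fields, and the 2% tolerance are all hypothetical placeholders, and attribution is simplified to a single team tag rather than the full team, surface, environment, and model breakdown.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical price table (USD per 1M tokens). Real prices come from your provider.
PRICE_PER_MTOK = {
    "model-small": {"input": 0.50, "output": 1.50},
    "model-large": {"input": 5.00, "output": 15.00},
}


@dataclass
class UsageRecord:
    model: str
    input_tokens: int
    output_tokens: int
    team: Optional[str] = None  # None means the request could not be attributed


def derived_cost(rec: UsageRecord) -> float:
    """Derived spend for one request: token counts x price table."""
    price = PRICE_PER_MTOK[rec.model]
    return (rec.input_tokens * price["input"]
            + rec.output_tokens * price["output"]) / 1_000_000


def attribution_coverage(records: Iterable[UsageRecord]) -> float:
    """Share of derived spend that carries a team tag (1.0 if there is no spend)."""
    records = list(records)
    total = sum(derived_cost(r) for r in records)
    attributed = sum(derived_cost(r) for r in records if r.team)
    return attributed / total if total else 1.0


def reconcile(records: Iterable[UsageRecord], invoice_usd: float,
              tolerance: float = 0.02) -> dict:
    """Compare derived spend with the invoice; tolerance is an assumed 2%."""
    derived = sum(derived_cost(r) for r in records)
    drift = (derived - invoice_usd) / invoice_usd
    return {
        "derived_usd": derived,
        "drift": drift,
        "within_tolerance": abs(drift) <= tolerance,
    }
```

A monthly run would pass the month's records and the invoice total to `reconcile`, then investigate whenever `within_tolerance` is false. Measuring coverage by spend rather than by request count is a design choice here: it keeps a handful of expensive untagged requests from hiding behind many cheap tagged ones.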

Who runs it

The function lives at the seam between platform engineering and finance, and it fails when it sits entirely on one side. Engineering owns the instrumentation, the gateway, and the levers. Finance owns the budgets, the allocation model, and the reconciliation against the real bill. In a small org this is one engineer with a finance partner and an hour a week; in a larger one it is a virtual team with a clear RACI. What matters is that one named person is accountable for the number, even if many are responsible for moving it. Allocation mechanics — when showback becomes chargeback — are covered in AI chargeback and showback.

The metrics it lives by

The mechanics of capturing these metrics live in LLM cost monitoring and token usage tracking.
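
Two metrics this page names, cost per successful task and cache-read ratio, can be sketched as follows. The definitions are assumptions made for illustration: cost per successful task divides all spend, including failed runs, by the number of successes, and cache-read ratio is the share of input tokens served from the prompt cache. The `TaskRun` fields are hypothetical and would map onto whatever your gateway logs actually record.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TaskRun:
    cost_usd: float         # total spend across every request in the task
    succeeded: bool         # result of whatever success check the team uses
    input_tokens: int       # all input tokens, cached or not
    cache_read_tokens: int  # input tokens served from the prompt cache


def cost_per_successful_task(runs: Sequence[TaskRun]) -> Optional[float]:
    """All spend, failures included, divided by the number of successes."""
    successes = sum(1 for r in runs if r.succeeded)
    if successes == 0:
        return None  # undefined when nothing succeeded
    return sum(r.cost_usd for r in runs) / successes


def cache_read_ratio(runs: Sequence[TaskRun]) -> float:
    """Fraction of input tokens that were cache reads."""
    total_in = sum(r.input_tokens for r in runs)
    if total_in == 0:
        return 0.0
    return sum(r.cache_read_tokens for r in runs) / total_in
```

Charging failed runs to the numerator is deliberate in this sketch: retries and abandoned agent loops cost money, so a metric that ignored them would look healthier than the bill.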

The operating cadence

The first 90 days

  1. Weeks 1–2 — Instrument and attribute. Get per-request token data tagged by team, product, environment, and model. Drive attribution coverage above 90% before optimizing anything.
  2. Weeks 3–4 — Baseline and reconcile. Establish cost per successful task and cache-read ratio, then reconcile the derived total against last month's invoice so the numbers are trusted.
  3. Weeks 5–8 — Set budgets and guardrails. Give each team a budget and an alert; add per-request and per-agent ceilings so runaway loops can't surprise you.
  4. Weeks 9–12 — Optimize and institutionalize. Pull the first high-confidence lever (usually caching or routing), measure it against quality, and lock in the weekly/monthly cadence with a named owner.
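
The per-request and per-agent ceilings from weeks 5–8 can be sketched as a small guard that an agent loop consults before each model call. This is a minimal illustration under assumptions: `SpendCeiling`, `BudgetExceeded`, and the dollar limits are hypothetical names and values, and the cost passed in is an estimate the caller computes before sending the request.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a request would break a configured ceiling."""


class SpendCeiling:
    """Tracks one agent run against per-request and per-agent limits (USD)."""

    def __init__(self, per_request_usd: float, per_agent_usd: float) -> None:
        self.per_request_usd = per_request_usd
        self.per_agent_usd = per_agent_usd
        self.spent_usd = 0.0

    def check_and_record(self, estimated_cost_usd: float) -> None:
        """Reject the request if it breaks either ceiling, else record it."""
        if estimated_cost_usd > self.per_request_usd:
            raise BudgetExceeded(
                f"request cost {estimated_cost_usd:.4f} exceeds "
                f"per-request ceiling {self.per_request_usd:.4f}"
            )
        if self.spent_usd + estimated_cost_usd > self.per_agent_usd:
            raise BudgetExceeded(
                f"agent spend would reach {self.spent_usd + estimated_cost_usd:.4f}, "
                f"above per-agent ceiling {self.per_agent_usd:.4f}"
            )
        self.spent_usd += estimated_cost_usd


# Usage: a runaway loop is stopped once the agent ceiling would be crossed.
ceiling = SpendCeiling(per_request_usd=0.50, per_agent_usd=1.00)
calls_made = 0
try:
    while True:  # stands in for an agent loop that never terminates on its own
        ceiling.check_and_record(0.25)
        calls_made += 1
except BudgetExceeded as exc:
    print(f"stopped after {calls_made} calls: {exc}")
```

Checking before the call, rather than after, means the ceiling is never overshot by the estimated amount; reconciling estimates against actual usage afterwards is left out of this sketch.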

Build, or buy the function

Standing this up in-house is the right call when LLM spend is large enough to justify a dedicated owner and the expertise exists internally. When it isn't — spend is real but nobody has the bandwidth, or the team would rather not build the instrumentation and reconciliation muscle from scratch — the function can be run as a managed service. That is precisely what LLM CFO does: the attribution, optimization, governance, and reconciliation, run for you. Either way, the test is the same: is there a named owner, a cadence, and a number that decisions are made against?

The one-line test: if no single person can tell you your cost per successful task this week — and what they are doing to move it — you have a dashboard, not an LLM CFO function.
