LLM cost attribution

LLM cost attribution is the bridge between provider invoices and operational decisions. Providers bill by account, model, token class, and sometimes region or feature. Businesses need to understand cost by team, product, customer, workflow, and unit of value. The attribution layer connects those two views while remaining reconcilable to the raw invoice.

The key is to tag requests before they reach the provider or gateway. Logs created after the fact are usually incomplete. At minimum, every production LLM call should carry a stable endpoint name, environment, team owner, product surface, workload class, and correlation ID. Where contracts allow it, add tenant or customer ID. For agent workflows, tag each tool call or model step separately.

Request count is not attribution

Allocating spend by request count is almost always wrong. One long-context reasoning request can cost more than hundreds of short classification calls. Attribution needs token classes and model pricing. It also needs retry and fallback data, because a “successful” response may hide three failed upstream calls.
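The point about token classes, pricing, and hidden retries can be sketched as follows. The price table and token counts are placeholders, not real provider rates; the key detail is that cost is summed over every attempt sharing a correlation ID, not just the attempt that returned a response.

```python
# Per-1M-token prices by (model, token class). Placeholder numbers,
# not real provider rates.
PRICES = {
    ("small-classifier", "input"): 0.10,
    ("small-classifier", "output"): 0.40,
    ("large-reasoner", "input"): 3.00,
    ("large-reasoner", "output"): 15.00,
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    # Cost of a single provider call, priced per token class.
    return (input_tokens * PRICES[(model, "input")]
            + output_tokens * PRICES[(model, "output")]) / 1_000_000

def logical_request_cost(attempts: list[dict]) -> float:
    # Every attempt is billed, including retries and failed fallbacks,
    # so the logical request's cost sums over all attempts.
    return sum(call_cost(a["model"], a["in"], a["out"]) for a in attempts)

# One "successful" long-context response that hides two failed attempts:
attempts = [
    {"model": "large-reasoner", "in": 40_000, "out": 0},      # timed out
    {"model": "large-reasoner", "in": 40_000, "out": 0},      # retry failed
    {"model": "large-reasoner", "in": 40_000, "out": 2_000},  # succeeded
]
reasoning_cost = logical_request_cost(attempts)
# ...versus one short classification call:
classify_cost = call_cost("small-classifier", 500, 10)
```

With these placeholder prices the single reasoning request costs several thousand times the classification call, which is why per-request allocation misleads.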

Recommended dimensions

A practical starting set, drawn from the tagging requirements above:

- Endpoint name: a stable logical name, not the raw URL
- Environment (production, staging)
- Team owner
- Product surface
- Workload class (interactive, batch, background)
- Model and token class, taken from the provider response
- Correlation ID, covering retries and fallbacks
- Tenant or customer ID, where contracts allow

Invoice reconciliation

Attribution must reconcile back to the provider invoice. Expect small differences from rounding, delayed invoice lines, and provider-specific discounts, but large gaps mean the log pipeline is missing traffic or using stale price tables. The finance view should show both allocated spend and unallocated spend so data-quality issues are visible.
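A reconciliation check like the one described can be a few lines. This is a sketch under assumptions: the 2% tolerance for rounding and discount noise is an arbitrary placeholder, and real pipelines reconcile per invoice line rather than one total.

```python
def reconcile(invoice_total: float, allocated: dict[str, float],
              tolerance: float = 0.02) -> dict:
    """Compare allocated spend (here keyed by team) against the invoice total.

    `tolerance` is the fraction of the invoice treated as normal rounding
    and discount noise; the 2% default is an assumption, not a standard.
    """
    allocated_total = sum(allocated.values())
    unallocated = invoice_total - allocated_total
    return {
        "allocated": allocated_total,
        "unallocated": unallocated,  # surfaced, never hidden
        "needs_investigation": abs(unallocated) > tolerance * invoice_total,
    }

report = reconcile(10_000.0, {"support-ml": 6_100.0, "search": 3_800.0})
```

Showing the unallocated line explicitly, instead of spreading it across teams, is what keeps data-quality gaps visible: a large unallocated balance means missing traffic or stale price tables, not a discount.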

From showback to action

The first output should be showback: which teams and product surfaces are driving spend, and which workloads changed since last month. Once teams trust that data, use it to rank optimization work. Attribution is valuable because it points directly at the owners who can approve routing, caching, or prompt changes.
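A showback report is essentially a roll-up of tagged per-request costs, ranked by spend. A minimal sketch, assuming per-request cost records already carry the team and surface tags:

```python
from collections import defaultdict

def showback(records: list[dict]) -> list[tuple[str, float]]:
    # Roll per-request costs up to (team, surface) and rank by spend,
    # so optimization work lands with the owners who can act on it.
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for r in records:
        totals[(r["team"], r["surface"])] += r["cost"]
    return sorted(((f"{team}/{surface}", cost)
                   for (team, surface), cost in totals.items()),
                  key=lambda kv: kv[1], reverse=True)

records = [
    {"team": "support-ml", "surface": "helpdesk", "cost": 420.0},
    {"team": "search", "surface": "autocomplete", "cost": 95.0},
    {"team": "support-ml", "surface": "helpdesk", "cost": 180.0},
]
ranking = showback(records)
```

The same roll-up, run month over month, is what surfaces the workloads that changed and tells you where routing, caching, or prompt changes will pay off first.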
