OpenAI cost attribution
OpenAI cost attribution starts before the request is sent. The API response can tell you token usage, but it cannot know which product surface, team, customer, or business workflow caused the call unless your application attaches that context.
Attribution should join three sources: request metadata, usage records, and invoice data. Request metadata identifies the owner. Usage records supply token counts and the model used. Invoice data supplies the authoritative financial total, so any cost estimated from usage records should be reconciled against it.
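As a rough illustration of that three-way join, the sketch below matches usage records to request metadata by a shared request ID, estimates cost per team, and compares the result with an invoice total. The field names (`request_id`, `input_tokens`, `output_tokens`), the model name, the prices, and the invoice figure are all hypothetical placeholders, not an OpenAI schema or real pricing.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical prices in USD per 1M tokens; real values come from your price sheet.
PRICES = {"model-a": {"input": Decimal("1.00"), "output": Decimal("2.00")}}

# Hypothetical request metadata, written by the application before each call.
request_metadata = {
    "req-1": {"feature": "search", "team": "discovery"},
    "req-2": {"feature": "support-bot", "team": "support"},
}

# Hypothetical usage records, one per upstream call.
usage_records = [
    {"request_id": "req-1", "model": "model-a", "input_tokens": 1000, "output_tokens": 200},
    {"request_id": "req-2", "model": "model-a", "input_tokens": 3000, "output_tokens": 600},
    {"request_id": "req-3", "model": "model-a", "input_tokens": 500, "output_tokens": 100},
]

# Hypothetical invoice total for the same period.
invoice_total = Decimal("0.0070")


def estimate_cost(record):
    """Estimate the USD cost of one usage record from token counts and prices."""
    price = PRICES[record["model"]]
    tokens_cost = (
        Decimal(record["input_tokens"]) * price["input"]
        + Decimal(record["output_tokens"]) * price["output"]
    )
    return tokens_cost / Decimal(1_000_000)


def attribute_by_team(usage, metadata):
    """Sum estimated cost per team; calls without metadata go to 'unattributed'."""
    totals = defaultdict(Decimal)
    for record in usage:
        team = metadata.get(record["request_id"], {}).get("team", "unattributed")
        totals[team] += estimate_cost(record)
    return dict(totals)


totals = attribute_by_team(usage_records, request_metadata)
# Difference between the invoice and the usage-based estimate.
reconciliation_gap = invoice_total - sum(totals.values(), Decimal("0"))
print(totals, reconciliation_gap)
```

Keeping an explicit `unattributed` bucket and a reconciliation gap makes missing metadata visible instead of silently spreading it across owners.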
Tags to capture
- Feature or endpoint name.
- Team owner and environment.
- Tenant or customer, where allowed.
- Workload class, such as realtime, batch, eval, support, or agent step.
- Prompt version, route policy, and fallback status.
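One way to keep these tags consistent is to define them once as a typed record and validate them before the request is sent. The sketch below is a minimal example under that assumption; the class name `AttributionTags`, its field names, and the allowed workload values are illustrative choices, not part of any OpenAI SDK.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Workload classes taken from the list above; adjust to your own taxonomy.
ALLOWED_WORKLOADS = {"realtime", "batch", "eval", "support", "agent_step"}


@dataclass(frozen=True)
class AttributionTags:
    """Hypothetical tag set attached to every upstream call."""

    feature: str
    team: str
    environment: str
    workload_class: str
    prompt_version: str
    route_policy: str
    is_fallback: bool = False
    tenant: Optional[str] = None  # set only where tenant tagging is allowed

    def __post_init__(self):
        # Reject calls that would otherwise land in an unowned bucket.
        if not self.feature or not self.team:
            raise ValueError("feature and team are required")
        if self.workload_class not in ALLOWED_WORKLOADS:
            raise ValueError(f"unknown workload class: {self.workload_class}")

    def to_record(self):
        """Return the tags as a flat dict for logging, omitting unset fields."""
        return {k: v for k, v in asdict(self).items() if v is not None}


tags = AttributionTags(
    feature="search",
    team="discovery",
    environment="prod",
    workload_class="realtime",
    prompt_version="v3",
    route_policy="default",
)
print(tags.to_record())
```

The record would typically be written to your own request log next to the request ID, so it can later be joined with usage data.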
Common attribution gaps
Teams often miss retries and fallbacks: a user sees one answer, but the system may have paid for several upstream calls to produce it. Another common gap is batch work. Evals and enrichment jobs should be tagged separately from user-facing traffic so that they can be identified and moved to lower-cost asynchronous lanes.
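A minimal sketch of closing both gaps, assuming each upstream call is logged with a shared logical ID, an attempt number, a fallback flag, and a workload class. All field names and values here are hypothetical.

```python
from collections import defaultdict

# Hypothetical call log: one row per upstream call, not per user-visible answer.
calls = [
    {"logical_id": "ans-1", "attempt": 1, "is_fallback": False, "workload": "realtime", "tokens": 1200},
    {"logical_id": "ans-1", "attempt": 2, "is_fallback": False, "workload": "realtime", "tokens": 1200},
    {"logical_id": "ans-1", "attempt": 3, "is_fallback": True, "workload": "realtime", "tokens": 900},
    {"logical_id": "ans-2", "attempt": 1, "is_fallback": False, "workload": "realtime", "tokens": 800},
    {"logical_id": "eval-7", "attempt": 1, "is_fallback": False, "workload": "eval", "tokens": 5000},
]


def summarize_by_answer(call_log):
    """Roll upstream calls up to the logical request that caused them."""
    summary = defaultdict(
        lambda: {"calls": 0, "tokens": 0, "retry_tokens": 0, "fallback_tokens": 0}
    )
    for call in call_log:
        row = summary[call["logical_id"]]
        row["calls"] += 1
        row["tokens"] += call["tokens"]
        if call["is_fallback"]:
            row["fallback_tokens"] += call["tokens"]
        elif call["attempt"] > 1:
            row["retry_tokens"] += call["tokens"]
    return dict(summary)


def tokens_by_workload(call_log):
    """Separate user-facing traffic from batch work such as evals."""
    totals = defaultdict(int)
    for call in call_log:
        totals[call["workload"]] += call["tokens"]
    return dict(totals)


per_answer = summarize_by_answer(calls)
per_workload = tokens_by_workload(calls)
print(per_answer["ans-1"], per_workload)
```

In this sample data, the single answer `ans-1` accounts for three upstream calls, and the eval job is reported apart from realtime traffic.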
The best OpenAI attribution reports do not stop at “which model cost the most.” They show which product decision caused that model to be used, and which owner can change it.
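A report of that kind can be approximated by grouping cost on the decision as well as the model. The sketch below assumes cost rows that already carry a route policy and an owning team; the row layout, names, and amounts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical attributed cost rows; amounts are illustrative only.
rows = [
    {"model": "model-large", "feature": "summarize", "route_policy": "always-large", "team": "docs", "cost_usd": 42.0},
    {"model": "model-large", "feature": "chat", "route_policy": "escalate-on-low-confidence", "team": "support", "cost_usd": 18.5},
    {"model": "model-small", "feature": "chat", "route_policy": "default-small", "team": "support", "cost_usd": 6.25},
    {"model": "model-large", "feature": "summarize", "route_policy": "always-large", "team": "docs", "cost_usd": 8.0},
]


def decision_report(cost_rows):
    """Rank (model, feature, route policy, owner) combinations by total cost."""
    totals = defaultdict(float)
    for row in cost_rows:
        key = (row["model"], row["feature"], row["route_policy"], row["team"])
        totals[key] += row["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


report = decision_report(rows)
for (model, feature, policy, team), cost in report:
    print(f"{model:12} {feature:10} {policy:28} {team:8} {cost:8.2f}")
```

Each line of the output names the route policy that selected the model and the team that can change it.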