
incident.io controls AI spending with per-prompt cost attribution and OpenAI project billing limits

incident.io hit its OpenAI billing limit, which temporarily broke all early-access AI features. As adoption expanded, particularly of Investigations, which costs 100x more than their other AI features, the team lacked visibility into which features and code paths were driving spend.

How it works

Common implementation structure

How this type of workflow is generally built, generalized across documented cases and not tied to any one vendor's stack. Specific products that implement these stages appear in “Tools commonly seen” below.

Stage 1 · Per-prompt token attribution

Token usage is attributed to a named prompt type via Go reflection and piped into observability tooling.
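A minimal sketch of what reflection-based attribution could look like follows. The source states only that Go reflection is used to name the prompt type; everything else here is an assumption for illustration, including the `Usage`, `Ledger`, and `PromptName` names and the two example prompt structs. The in-memory ledger stands in for whatever metrics or tracing client actually receives the numbers.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// Usage holds the token counts reported for a single model call.
// (Hypothetical type, not taken from the source.)
type Usage struct {
	InputTokens  int
	OutputTokens int
}

// PromptName derives a label from the prompt's Go type via reflection,
// so callers never have to pass a name by hand. Pointers are unwrapped.
func PromptName(prompt any) string {
	t := reflect.TypeOf(prompt)
	if t == nil {
		return "unknown"
	}
	for t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t.Name() == "" { // anonymous types have no name
		return "unknown"
	}
	return t.Name()
}

// Ledger accumulates usage per prompt name. In a real system this would
// likely be a metrics counter labelled by prompt name instead of a map.
type Ledger struct {
	mu     sync.Mutex
	totals map[string]Usage
}

func NewLedger() *Ledger {
	return &Ledger{totals: make(map[string]Usage)}
}

// Record adds one call's usage to the running total for the prompt's type.
func (l *Ledger) Record(prompt any, u Usage) {
	name := PromptName(prompt)
	l.mu.Lock()
	defer l.mu.Unlock()
	cur := l.totals[name]
	cur.InputTokens += u.InputTokens
	cur.OutputTokens += u.OutputTokens
	l.totals[name] = cur
}

// Total returns the accumulated usage for a prompt name.
func (l *Ledger) Total(name string) Usage {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.totals[name]
}

// Example prompt types (hypothetical names).
type PromptSummariseIncident struct{}
type PromptInvestigationStep struct{}

func main() {
	ledger := NewLedger()
	ledger.Record(PromptSummariseIncident{}, Usage{InputTokens: 1200, OutputTokens: 150})
	ledger.Record(&PromptInvestigationStep{}, Usage{InputTokens: 90000, OutputTokens: 4000})

	for _, name := range []string{"PromptSummariseIncident", "PromptInvestigationStep"} {
		fmt.Printf("%s: %+v\n", name, ledger.Total(name))
	}
}
```

Deriving the label from the type means a new prompt is tracked as soon as it is defined, with no registration step to forget. The trade-off is that renaming a type also renames its series in the observability tooling.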

Tools used

OpenAI, Slack, GitHub, Grafana, Google Secret Manager, Terraform

Outcome

incident.io now has control over production, development, and training AI costs, with per-prompt attribution, feature-level billing limits, daily Slack cost reports, and predictive backfill cost estimation, enabling teams to ship AI features confidently.
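The source does not describe how the backfill estimate is computed. One plausible approach, sketched here under that caveat, is to run the prompt over a small sample, average the token counts, and extrapolate to the full item count. The `Pricing`, `Sample`, and `EstimateBackfillCost` names are hypothetical, and the prices in `main` are placeholders, not real OpenAI rates.

```go
package main

import (
	"errors"
	"fmt"
)

// Pricing is the cost in dollars per one million tokens.
// Values used below are placeholders, not real provider rates.
type Pricing struct {
	InputPerMillion  float64
	OutputPerMillion float64
}

// Sample is the token usage observed for one already-processed item.
type Sample struct {
	InputTokens  int
	OutputTokens int
}

// EstimateBackfillCost extrapolates the cost of processing totalItems
// from the average token usage of a sample run.
func EstimateBackfillCost(samples []Sample, totalItems int, p Pricing) (float64, error) {
	if len(samples) == 0 {
		return 0, errors.New("need at least one sample")
	}
	var in, out int
	for _, s := range samples {
		in += s.InputTokens
		out += s.OutputTokens
	}
	n := float64(len(samples))
	perItem := (float64(in)/n)*p.InputPerMillion/1e6 +
		(float64(out)/n)*p.OutputPerMillion/1e6
	return perItem * float64(totalItems), nil
}

func main() {
	samples := []Sample{
		{InputTokens: 1000, OutputTokens: 100},
		{InputTokens: 3000, OutputTokens: 300},
	}
	pricing := Pricing{InputPerMillion: 2.0, OutputPerMillion: 8.0} // placeholder rates

	cost, err := EstimateBackfillCost(samples, 10000, pricing)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("estimated backfill cost: $%.2f\n", cost)
}
```

An estimate like this is only as good as the sample. If item sizes vary widely, a larger or stratified sample would give a more trustworthy number before the backfill is started.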

What failed first

Billing limits were breached and all early-access AI features broke. Without per-prompt tracking, the team had no way to attribute spend to features or code paths, and discovered weeks later they had been burning hundreds of dollars on bugs.

Results

Cost replaced: 100x

Source

https://incident.io/building-with-ai/controlling-costs


Grounding & classification

Source type: technical build writeup

30 fields verified against source quotes.

agentic workflow, conversational ai, speech to text, summarization, call recording, code diff pr, failure mode described, metric backed, named customer, production runtime claimed, tools described, workflow described, software, cost reduction, employee productivity, technical build writeup, it support, agentic task execution, monitor detect alert