A Support Agent Burned $10,000 in 72 Hours
8:17 a.m. Monday, San Francisco. Jake, CTO of a 12-person SaaS startup, opens his email. His stomach drops.
Over the weekend, while he was offline for 72 hours hiking with his partner, the company's customer support agent had burned $10,237 in API calls. Ten thousand. Three days. For a tool meant to cut support costs.
I sat in on the post-mortem that Wednesday. This wasn't negligence. They had a monthly budget. They picked a "cost-effective" model via OpenRouter. They added rate limits. The agent slipped past every control. By the time anyone looked, it was over.
If you're building or scaling agents, this isn't rare. It's normal. And without the right protections, it's a question of when, not if.
What actually happened
Jake's team built the support agent on AutoGPT, with OpenClaw for routing and OpenRouter for API access. It triaged tickets, pulled order data from Shopify, generated troubleshooting guides, and escalated complex issues.
They set a $500 monthly budget in OpenRouter and a 10-retry limit per tool call. On paper, covered.
They missed one gap: limits were per call, not per workflow.
A customer's order showed delivered but never arrived. The agent tried to pull order data from Shopify. The API timed out. The agent retried 10 times, as configured. All failed. Instead of escalating, it restarted the whole workflow.
It repeated that loop every 8 seconds for 72 hours. 1.2 million tokens. 147,000 API calls. $10k. 99.8% of those calls resolved nothing. The agent never closed a single ticket.
Why agent cost control is harder
A chatbot has a predictable token count per request and a clear start and end. Agents decide for themselves what to do next. They call tools, spin up nested workflows, retry without asking. That autonomy is the point. It's also the risk.
Three reasons standard budget tools fail for agents:
Retry logic bypasses rate limits
Agents retry failed tool calls. Good for reliability. Without hard limits per action, per workflow, and per hour, that logic turns into a runaway loop. Jake's team limited retries per call but not per workflow. After every 10 failed retries, the agent restarted the workflow and the retry counter started over. No one had put a cap on restarts.
Nested calls hide true cost
One user request can trigger many calls: classify → pull data → generate → check → reply. Add nested agents and each has its own calls and retries. Basic cost tools often can't see or control the full cascade.
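One way to make the cascade visible is to attribute every call to a parent and roll costs up to the root request. The sketch below is a minimal illustration, not any tool's actual API: the CostNode class, the step names, and the dollar figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CostNode:
    """One step in a workflow; children are nested tool or agent calls."""
    name: str
    own_cost_usd: float = 0.0
    children: list["CostNode"] = field(default_factory=list)

    def child(self, name: str, cost_usd: float = 0.0) -> "CostNode":
        # Attach a nested call so its spend is attributed to this step.
        node = CostNode(name, cost_usd)
        self.children.append(node)
        return node

    def total_cost_usd(self) -> float:
        # Own spend plus everything spent anywhere beneath this node.
        return self.own_cost_usd + sum(c.total_cost_usd() for c in self.children)

    def step_count(self) -> int:
        # This node plus every nested step, retries included.
        return 1 + sum(c.step_count() for c in self.children)


# Hypothetical ticket: the retries sit under pull_data, not at the top level.
ticket = CostNode("ticket")
ticket.child("classify", 0.002)
lookup = ticket.child("pull_data", 0.001)
lookup.child("retry_1", 0.001)
lookup.child("retry_2", 0.001)
ticket.child("generate", 0.02)
```

Calling ticket.total_cost_usd() includes the retries buried under pull_data, which a per-call view would report as separate, unrelated requests.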
Off-hours spend goes unchecked
Most blowups happen when no one's watching. Nights, weekends. Jake's team had no real-time alerts. Their only warning was an email at 50% of the monthly budget, and by then the loop had already burned $7k.
Four gaps most teams miss
1. Unbounded tool use
Every tool is a potential loop. Database lookup, API, file read: if the agent can call it an unlimited number of times, one error can spiral. Fix: hard limits per workflow, per hour, per day. No exceptions.
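A rough sketch of what those three limits could look like in code. ToolCallLimiter and its parameters are hypothetical names, and the caps are illustrative. The now argument exists only so the example can be driven with fake timestamps.

```python
import time
from collections import deque
from typing import Optional


class ToolCallLimiter:
    """Hard caps on tool calls per workflow, per hour, and per day."""

    def __init__(self, per_workflow: int, per_hour: int, per_day: int):
        self.per_workflow = per_workflow
        # (window length in seconds, max calls allowed inside that window)
        self.windows = [(3600.0, per_hour), (86400.0, per_day)]
        self.workflow_counts = {}
        self.timestamps = deque()

    def allow(self, workflow_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget calls older than the longest window (one day).
        while self.timestamps and now - self.timestamps[0] >= 86400.0:
            self.timestamps.popleft()
        # Per-workflow cap: counts never reset for a given workflow id.
        if self.workflow_counts.get(workflow_id, 0) >= self.per_workflow:
            return False
        # Sliding-window caps across all workflows.
        for seconds, cap in self.windows:
            recent = sum(1 for t in self.timestamps if now - t < seconds)
            if recent >= cap:
                return False
        self.workflow_counts[workflow_id] = self.workflow_counts.get(workflow_id, 0) + 1
        self.timestamps.append(now)
        return True
```

The agent would call allow() before every tool call and stop when it returns False. If a restart gets a fresh workflow id, the per-workflow count starts over, so keying on something stable such as the ticket id, or relying on the hourly cap, is what would catch a restart loop like the one in Jake's incident.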
2. No circuit breakers
A circuit breaker stops a workflow when it hits a threshold—errors, retries, spend. Example: 5 failed retries → workflow stops, escalate to human. Few teams use these for agents. They should.
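Here is a minimal sketch of that idea, assuming a breaker that counts total failures and spend for one workflow. CircuitBreaker, CircuitOpen, run_with_breaker, and flaky_shopify_lookup are hypothetical names invented for the example. The lookup function only simulates a timeout and does not touch Shopify.

```python
class CircuitOpen(Exception):
    """Raised when the breaker has tripped and the workflow must stop."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 5, max_spend_usd: float = 0.50):
        self.max_failures = max_failures
        self.max_spend_usd = max_spend_usd
        self.failures = 0
        self.spend_usd = 0.0
        self.open = False

    def call(self, tool, *args, cost_usd: float = 0.0):
        if self.open:
            raise CircuitOpen("breaker open: escalate to a human")
        self.spend_usd += cost_usd
        try:
            result = tool(*args)
        except Exception:
            self.failures += 1
            self._check()
            raise
        self._check()
        return result

    def _check(self) -> None:
        # Trip on either threshold; once open, the breaker stays open.
        if self.failures >= self.max_failures or self.spend_usd >= self.max_spend_usd:
            self.open = True


def flaky_shopify_lookup(order_id: str):
    # Stand-in for a tool call that always times out.
    raise TimeoutError(f"lookup for {order_id} timed out")


def run_with_breaker(breaker: CircuitBreaker, order_id: str, attempts: int) -> str:
    for _ in range(attempts):
        try:
            return breaker.call(flaky_shopify_lookup, order_id)
        except CircuitOpen:
            return "escalated"
        except TimeoutError:
            continue  # retry until the breaker says stop
    return "gave up"
```

With max_failures=5, run_with_breaker(CircuitBreaker(), "1042", attempts=100) makes five real attempts and then escalates instead of using the other 95. The breaker state has to outlive a workflow restart, for example by storing it per ticket, or a restart simply hands the agent a fresh breaker.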
3. No per-workflow budget limits
A single monthly cap is not enough by itself. One loop can drain it in hours. You need limits per workflow, per agent, per user. E.g., support agent: $50/day max. Lead gen: $100/week. Single ticket: $0.50 max. If one workflow goes wrong, it can't take the whole budget.
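As a sketch of layered caps, the ledger below charges each call against every scope it belongs to and rejects the charge if any cap would be exceeded. BudgetLedger, BudgetExceeded, and the scope names are hypothetical. Amounts are integer cents to avoid floating-point drift, and resetting the daily scope at midnight is left out for brevity.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push any scope past its cap."""


class BudgetLedger:
    """Tracks spend against several caps at once, in integer cents."""

    def __init__(self, caps_cents: dict):
        self.caps = dict(caps_cents)
        self.spent = {scope: 0 for scope in caps_cents}

    def charge(self, scopes: list, cents: int) -> None:
        # Check every scope first so a rejected charge changes nothing.
        for scope in scopes:
            if self.spent[scope] + cents > self.caps[scope]:
                raise BudgetExceeded(f"{scope} cap of {self.caps[scope]} cents reached")
        for scope in scopes:
            self.spent[scope] += cents


# Caps taken from the examples above: $50/day for the agent, $0.50 per ticket.
ledger = BudgetLedger({"support_agent/day": 5000, "ticket/1042": 50})
ledger.charge(["support_agent/day", "ticket/1042"], 20)
ledger.charge(["support_agent/day", "ticket/1042"], 20)
```

A third 20-cent charge on the same ticket raises BudgetExceeded, because the ticket would reach 60 cents against a 50-cent cap, while the daily agent budget is barely touched.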
4. No real-time anomaly detection
Stopping a loop means catching it as it starts. If you normally do 10 calls/minute and suddenly hit 100, the system should flag it, alert, and optionally pause. Monthly reports are too late.
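A simple version of that check, assuming a fixed baseline and a one-minute sliding window. RateAnomalyDetector is a hypothetical name and the 5x factor is an arbitrary illustration. A real baseline would likely vary by time of day.

```python
from collections import deque


class RateAnomalyDetector:
    """Flags when calls in the last 60 seconds exceed baseline * factor."""

    def __init__(self, baseline_per_minute: float, factor: float = 5.0):
        self.threshold = baseline_per_minute * factor
        self.calls = deque()

    def record(self, now: float) -> bool:
        """Record one call at time `now` (seconds); return True if the rate is anomalous."""
        self.calls.append(now)
        # Drop calls that have aged out of the one-minute window.
        while self.calls and now - self.calls[0] >= 60.0:
            self.calls.popleft()
        return len(self.calls) > self.threshold


detector = RateAnomalyDetector(baseline_per_minute=10)
# Normal traffic: one call every 6 seconds for a minute.
normal = [detector.record(t * 6.0) for t in range(10)]
# Burst: 100 calls in the next 10 seconds.
burst = [detector.record(60.0 + i * 0.1) for i in range(100)]
```

Every entry in normal is False, and burst turns True partway through, once more than 50 calls sit inside the window. Whatever receives that signal can alert, pause the agent, or both.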