# Shekel
The missing safety layer for production AI agents.
## The Story
I spent $47 debugging a LangGraph retry loop. The agent kept failing, LangGraph kept retrying, and OpenAI kept charging — all while I slept.
I built shekel so you don't have to learn that lesson yourself.
Every serious AI agent needs the same things: spending limits, loop detection, velocity guards, per-component circuit breakers. Shekel is the library that provides all of it, in one import, with zero setup.
## What Shekel Does
- **Hard Budget Caps.** Wrap any agent in a context manager. Shekel intercepts every LLM call, tracks exact spend, and raises `BudgetExceededError` the moment you cross the limit. No SDK changes. No config.
- **Automatic Model Fallback.** Don't crash — switch. Define a cheaper fallback model and shekel transparently rewrites the `model` parameter when the threshold is hit.
- **Nested Budgets.** Break multi-stage pipelines into per-stage budgets. Children auto-cap to the parent's remaining balance. `b.tree()` gives you a live visual breakdown.
- **Tool Call Budgets.** Cap agent tool dispatches before they spiral. Auto-intercepted for LangChain, MCP, CrewAI, and OpenAI Agents SDK. One decorator for plain Python.
- **Loop Guard.** Catches stuck agents before they drain your budget. A per-tool rolling-window counter fires `AgentLoopError` when the same tool repeats too many times — no matter what the total spend is.
- **Velocity Circuit Breaker.** Cap how fast you burn money, not just how much. A bursty agent can blow through `max_usd` before you can react — `max_velocity` stops it in seconds.
- **Temporal (Rolling-Window) Budgets.** Enforce `$5/hr` per user, per API tier, or per agent. The string DSL handles multi-cap windows. `BudgetExceededError` carries `retry_after` so callers know when the window resets.
- **Distributed Budgets.** Enforce shared limits atomically across multiple processes, workers, or containers. One Lua script per call — no race conditions.
- **Framework Circuit Breakers.** Per-node, per-chain, per-agent, and per-task caps. Shekel patches your framework transparently — zero changes to your LangGraph graphs, CrewAI crews, or LangChain chains.
- **CLI — Zero Code Changes.** Run any Python agent under a budget from the command line. CI-friendly exit codes. Works with Docker, GitHub Actions, and shell scripts.
- **OpenTelemetry Metrics.** 9 OTel instruments covering cost, utilization, spend rate, fallback activations, and loop events. Compatible with Prometheus, Grafana, Datadog, and any OTel backend.
- **Langfuse Integration.** Per-call cost streaming, circuit-break events, and budget hierarchy in Langfuse spans — see exactly where your budget breaks.
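The loop guard's rolling-window idea can be pictured in a few lines of plain Python. This is an illustrative sketch only — the `LoopGuard` class, its parameters, and the exception below are invented for the example and are not shekel's internals:

```python
import time
from collections import deque

class LoopGuard:
    """Toy rolling-window repeat counter (illustrative, not shekel's implementation)."""

    def __init__(self, max_repeats=5, window_s=60.0):
        self.max_repeats = max_repeats
        self.window_s = window_s
        self.calls = {}  # tool name -> deque of call timestamps

    def record(self, tool, now=None):
        now = time.monotonic() if now is None else now
        q = self.calls.setdefault(tool, deque())
        q.append(now)
        # Forget calls that have aged out of the window.
        while q and now - q[0] > self.window_s:
            q.popleft()
        if len(q) > self.max_repeats:
            raise RuntimeError(f"{tool!r} called {len(q)} times in {self.window_s:.0f}s")

guard = LoopGuard(max_repeats=3, window_s=60.0)
for t in (0, 1, 2):
    guard.record("web_search", now=t)   # three calls inside the window: allowed
try:
    guard.record("web_search", now=3)   # the fourth call trips the guard
except RuntimeError as err:
    print("tripped:", err)
```

Note that the trip condition depends only on the repeat count, not on dollars spent — which is exactly why a loop guard catches cheap-but-stuck agents that a budget cap alone would miss.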
## Works with Everything
If it calls OpenAI or Anthropic under the hood, shekel sees it — zero integration code needed.
| Provider | Framework | Integration |
|---|---|---|
| OpenAI · Anthropic · Gemini | LangChain · LangGraph | Auto-patched |
| HuggingFace · LiteLLM · Groq | CrewAI · OpenAI Agents SDK | Auto-patched |
| MCP · AutoGen · LlamaIndex | Any custom wrapper | Auto-patched |
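"Auto-patched" boils down to method interception: wrap the SDK's entry point so every call is observed before it runs. A minimal sketch of the pattern — `FakeClient`, `meter`, and the hook are made up for illustration; shekel's real patching is more involved:

```python
import functools

def meter(client, on_call):
    """Wrap a client's create() so every call is observed before it runs."""
    original = client.create

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        on_call(kwargs.get("model", "unknown"))  # e.g. record spend here
        return original(*args, **kwargs)

    client.create = wrapper
    return client

class FakeClient:
    """Stand-in for any SDK exposing a create(model=...) entry point."""
    def create(self, model, **kwargs):
        return {"model": model}

seen = []
client = meter(FakeClient(), seen.append)
client.create(model="gpt-4o")
print(seen)  # ['gpt-4o']
```

Because the wrapper preserves the original signature and return value, calling code never notices the interception — which is what makes "zero integration code" possible.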
## Quick Start
### Install
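The package is on PyPI (see Community below), so the standard install applies:

```shell
pip install shekel
```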
### Your first budget
```python
from shekel import budget, BudgetExceededError

try:
    with budget(max_usd=1.00, warn_at=0.8) as b:
        run_my_agent()
    print(f"Spent: ${b.spent:.4f}")
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.spent:.4f} / ${e.limit:.2f}")
```
No API keys. No external services. No background threads. Nothing leaves your machine.
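The semantics of that context manager can be pictured with a toy version. Everything below (`ToyBudget`, `BudgetExceeded`, the manual `charge` call) is invented for illustration — shekel's real `budget` intercepts live LLM calls, which this sketch omits:

```python
class BudgetExceeded(Exception):
    def __init__(self, spent, limit):
        super().__init__(f"${spent:.4f} / ${limit:.2f}")
        self.spent, self.limit = spent, limit

class ToyBudget:
    """Tracks spend, warns at a fraction of the cap, raises past the cap."""

    def __init__(self, max_usd, warn_at=None):
        self.limit, self.warn_at = max_usd, warn_at
        self.spent = 0.0

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # never swallow exceptions

    def charge(self, usd):
        self.spent += usd
        if self.spent > self.limit:
            raise BudgetExceeded(self.spent, self.limit)
        if self.warn_at and self.spent >= self.warn_at * self.limit:
            print(f"warning: ${self.spent:.2f} of ${self.limit:.2f} used")

with ToyBudget(max_usd=1.00, warn_at=0.8) as b:
    b.charge(0.75)  # under the warning threshold: silent
    b.charge(0.10)  # crosses 80% of the cap: prints a warning
print(f"Spent: ${b.spent:.4f}")
```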
## Why Shekel?
Production AI agents fail in predictable ways. Shekel is designed around those failure modes:
| Failure Mode | How Shekel Stops It |
|---|---|
| Retry loop runs overnight | `max_usd` hard cap |
| Same tool called 500 times in a loop | `loop_guard=True` |
| Agent bursts $40 in two minutes | `max_velocity="$1/min"` |
| One LangGraph node consumes the entire budget | `b.node("name", max_usd=X)` |
| Expensive model blows your budget mid-task | `fallback={"model": "gpt-4o-mini"}` |
| Multi-tenant API needs per-user rate limits | Redis backend + temporal budgets |
| CI pipeline needs cost enforcement | `shekel run agent.py --budget 5` |
All of these work together. `max_usd` + `loop_guard` + `max_velocity` are independent guards that fire on whichever condition triggers first.
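The velocity guard is just a spend rate computed over a short window. A sketch of the idea — the `VelocityBreaker` class and the numbers are invented for illustration, not shekel's internals:

```python
from collections import deque

class VelocityBreaker:
    """Toy $/minute cap: sum spend over a 60s window, trip when the rate is too high."""

    def __init__(self, max_usd_per_min):
        self.max_rate = max_usd_per_min
        self.events = deque()  # (timestamp_s, usd)

    def record(self, usd, now):
        self.events.append((now, usd))
        # Keep only the last 60 seconds of spend events.
        while self.events and now - self.events[0][0] > 60.0:
            self.events.popleft()
        rate = sum(u for _, u in self.events)
        if rate > self.max_rate:
            raise RuntimeError(f"${rate:.2f}/min exceeds ${self.max_rate:.2f}/min cap")

vb = VelocityBreaker(max_usd_per_min=1.00)
vb.record(0.40, now=0.0)
vb.record(0.40, now=10.0)      # $0.80 in the last minute: still fine
try:
    vb.record(0.40, now=20.0)  # $1.20/min: breaker trips
except RuntimeError as err:
    print("tripped:", err)
```

This is why a velocity cap fires in seconds where a total-spend cap would wait: the same $1.20 spread over an hour never trips the breaker.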
## The Spend Summary
```text
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
shekel spend summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total: $1.2450 / $5.00 (25%)
  gpt-4o: $1.1320 (5 calls)
    Input:  45.2k tokens → $0.1130
    Output: 11.3k tokens → $1.1320
  Tool spend: $0.1130 (9 tool calls)
    web_search $0.090 (9 calls) [langchain]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
Or machine-readable for log pipelines:
```shell
shekel run agent.py --budget 5 --output json
# {"spent": 1.245, "limit": 5.0, "calls": 5, "tool_calls": 9, "status": "ok"}
```
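A log pipeline can consume that JSON line directly. For example, using the field names from the sample output above:

```python
import json

line = '{"spent": 1.245, "limit": 5.0, "calls": 5, "tool_calls": 9, "status": "ok"}'
report = json.loads(line)

utilization = report["spent"] / report["limit"]
print(f"{report['status']}: {utilization:.0%} of budget used")  # ok: 25% of budget used
```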
## Dive Deeper
- Step-by-step examples — tracking, enforcement, fallback, nested budgets, async, streaming.
- Deep-dives into every feature: enforcement modes, tool budgets, loop guard, velocity, temporal budgets.
- Framework-specific guides for LangGraph, CrewAI, LangChain, OpenAI Agents SDK, Gemini, and more.
- Complete documentation of all parameters, properties, methods, and exceptions.
## Supported Models
Built-in pricing for GPT-4o, GPT-4o-mini, o1, o3, Claude 3.5/3/3.7 Sonnet, Claude 3 Haiku/Opus, Gemini 2.0/2.5 Flash/Pro, and more.
```shell
shekel models  # list all bundled models and pricing
shekel models --provider openai
shekel estimate --model gpt-4o --input-tokens 1000 --output-tokens 500
```
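Under the hood, an estimate is just tokens times per-token price. A back-of-envelope sketch — the prices below are illustrative assumptions, not shekel's bundled pricing table:

```python
# Hypothetical per-million-token prices, for illustration only.
PRICES_PER_M = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def estimate_usd(model, input_tokens, output_tokens):
    p = PRICES_PER_M[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 1000 input + 500 output tokens at the assumed gpt-4o prices:
print(f"${estimate_usd('gpt-4o', 1000, 500):.4f}")  # $0.0075
```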
Install `shekel[all-models]` for 400+ models via tokencost. See the full model list.
## What's New
See CHANGELOG for the full release history.
- v1.1.0 — Loop Guard, Spend Velocity, OpenAI Agents SDK per-agent circuit breaking
- v1.0.2 — LangGraph node caps, CrewAI agent/task caps, LangChain chain caps, Redis distributed budgets
- v0.2.9 — CLI `shekel run`; Docker support
- v0.2.8 — Tool budgets, temporal budgets, OpenTelemetry metrics
## Community
- GitHub: arieradle/shekel
- PyPI: pypi.org/project/shekel
- Issues & PRs: github.com/arieradle/shekel/issues
- Contributing: See the guide
MIT License