
Shekel

LLM budget enforcement and cost tracking for Python. One line. Zero config.

from shekel import budget

with budget(max_usd=1.00):
    run_my_agent()  # raises BudgetExceededError if spend exceeds $1.00

Or enforce from the command line — no code changes needed:

shekel run agent.py --budget 1

The Story

I spent $47 debugging a LangGraph retry loop. The agent kept failing, LangGraph kept retrying, and OpenAI kept charging — all while I slept.

I built shekel so you don't have to learn that lesson yourself.


Features

  • Zero Config


    One line of code. No API keys, no external services, no setup.

    with budget(max_usd=1.00):
        run_agent()
    # or: shekel run agent.py --budget 1
    
  • Budget Enforcement


    Hard caps, soft warnings, or track-only mode. You control the spend.

    with budget(max_usd=1.00, warn_at=0.8):
        run_agent()
    
  • Smart Fallback


    Automatically switch to cheaper models instead of crashing.

    with budget(max_usd=1.00, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}):
        run_agent()
    
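The fallback rule is easy to reason about: once spend crosses a threshold percentage of the cap, route calls to the cheaper model. A toy sketch of that decision (illustrative only, not shekel's internals; the model names are just examples):

```python
def pick_model(spent: float, cap: float, primary: str, cheap: str,
               at_pct: float = 0.8) -> str:
    """Route to the cheap model once spend crosses at_pct of the cap."""
    return cheap if spent >= at_pct * cap else primary

print(pick_model(0.50, 1.00, "gpt-4o", "gpt-4o-mini"))  # below threshold
print(pick_model(0.85, 1.00, "gpt-4o", "gpt-4o-mini"))  # past 80% -> fallback
```

shekel applies this rule for you via the `fallback={"at_pct": ..., "model": ...}` argument shown above.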
  • Nested Budgets


    Hierarchical tracking for multi-stage workflows.

    with budget(max_usd=10, name="workflow"):
        with budget(max_usd=2, name="research"):
            run_research()
        with budget(max_usd=5, name="analysis"):
            run_analysis()
    
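The idea behind nesting is simple: a charge inside a child budget also counts against every enclosing budget, so the outer cap is enforced across all stages. A minimal sketch of that accumulation (illustrative only, not shekel's implementation):

```python
class NestedBudget:
    """Toy hierarchical budget: charges propagate to all ancestors."""
    def __init__(self, cap: float, parent: "NestedBudget | None" = None):
        self.cap, self.parent, self.spent = cap, parent, 0.0

    def charge(self, usd: float) -> None:
        node = self
        while node is not None:
            node.spent += usd
            if node.spent > node.cap:
                raise RuntimeError(f"cap ${node.cap:.2f} exceeded")
            node = node.parent

workflow = NestedBudget(10.0)
research = NestedBudget(2.0, parent=workflow)
research.charge(1.5)                    # counts against research AND workflow
print(workflow.spent, research.spent)   # 1.5 1.5
```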
  • Langfuse Integration


    Circuit-break events, per-call spend streaming, and budget hierarchy in Langfuse — see exactly where your budget breaks.

    from shekel.integrations.langfuse import LangfuseAdapter
    
    adapter = LangfuseAdapter(client=lf)
    AdapterRegistry.register(adapter)
    # Automatic cost tracking and budget monitoring!
    
  • Temporal Budgets


    Rolling-window spend limits — enforce $5/hr per user, API tier, or agent. Hard stop with retry_after in the error.

    tb = budget("$5/hr", name="api-tier")
    async with tb:
        response = await client.chat(...)
    
  • Tool Budgets


    Cap agent tool calls before they bankrupt you. Auto-intercepts LangChain, MCP, CrewAI, OpenAI Agents. @tool decorator for plain Python.

    with budget(max_tool_calls=50):
        run_agent()  # ToolBudgetExceededError on call 51
    
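Mechanically, a tool-call cap is just a shared counter that raises once the limit is passed. A toy decorator illustrating the idea (not shekel's actual @tool):

```python
import functools

class ToolCallCap:
    """Toy tool-call budget: raise once calls exceed max_calls."""
    def __init__(self, max_calls: int):
        self.max_calls, self.calls = max_calls, 0

    def __call__(self, fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            self.calls += 1
            if self.calls > self.max_calls:
                raise RuntimeError(f"tool budget exceeded on call {self.calls}")
            return fn(*args, **kwargs)
        return wrapper

cap = ToolCallCap(max_calls=2)

@cap
def web_search(query: str) -> str:
    return f"results for {query}"

web_search("a"); web_search("b")   # fine
try:
    web_search("c")                # call 3 -> raises
except RuntimeError as e:
    print(e)
```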
  • OpenTelemetry Metrics


    Export cost, utilization, spend rate, and fallback data via OTel instruments — compatible with Prometheus, Grafana, Datadog, and any OTel backend.

    from shekel.otel import ShekelMeter
    meter = ShekelMeter()  # zero config
    
  • Framework Agnostic


    Works with LangGraph, CrewAI, AutoGen, LlamaIndex, Haystack, and any framework that calls OpenAI, Anthropic, or LiteLLM.
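The reason framework choice doesn't matter: metering happens at the provider-client layer, so any code path that ends in an OpenAI, Anthropic, or LiteLLM call gets counted. A toy illustration of wrapping at that layer (every name here is made up for the sketch, including the `cost_usd` field):

```python
def metered(provider_call, ledger):
    """Wrap a provider-level call so every caller is metered."""
    def wrapper(*args, **kwargs):
        result = provider_call(*args, **kwargs)
        ledger.append(result["cost_usd"])  # hypothetical per-call cost field
        return result
    return wrapper

def fake_provider_call(prompt: str) -> dict:
    return {"text": "hi", "cost_usd": 0.001}

ledger: list = []
call = metered(fake_provider_call, ledger)

call("direct call")           # called directly...
call("from some framework")   # ...or from inside any framework
print(round(sum(ledger), 3))  # 0.002
```

Because the wrapper sits below the framework, LangGraph, CrewAI, and plain scripts all flow through the same meter.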

  • Async & Streaming


    Full support for async/await patterns and streaming responses.

    async with budget(max_usd=1.00):
        await run_async_agent()
    

Quick Start

Installation

pip install shekel[openai]      # OpenAI support
pip install shekel[anthropic]   # Anthropic support
pip install shekel[litellm]     # 100+ providers via LiteLLM
pip install shekel[all]         # all provider extras
pip install shekel[all-models]  # 400+ model prices via tokencost

Basic Usage

from shekel import budget, BudgetExceededError

# Enforce a hard cap
try:
    with budget(max_usd=1.00, warn_at=0.8) as b:
        run_my_agent()
    print(f"Spent ${b.spent:.4f}")
except BudgetExceededError as e:
    print(e)

# Track spend without enforcing a limit
with budget() as b:
    run_my_agent()
print(f"Cost: ${b.spent:.4f}")

# Decorator
from shekel import with_budget

@with_budget(max_usd=0.10)
def call_llm():
    ...

See It In Action

import openai
from shekel import budget

client = openai.OpenAI()

with budget(max_usd=0.10) as b:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

print(f"Total cost: ${b.spent:.4f}")
print(f"Remaining: ${b.remaining:.4f}")

Why Shekel?

Problem                                    | Solution
-------------------------------------------|-------------------------------------------
Agent retry loops drain your wallet        | Hard budget caps stop runaway costs
No visibility into LLM spending            | Track every API call automatically
Expensive models blow your budget          | Automatic fallback to cheaper models
Need to enforce spend limits               | Context manager raises on budget exceeded
Multi-step workflows need session budgets  | Budgets accumulate across nested runs

What's New in v0.2.9

  • CLI Budget Enforcement


    Run any Python agent under a hard USD cap with zero code changes. Exits with code 1 on a budget breach, so it slots into CI pipelines, Docker, GitHub Actions, and shell scripts.

    shekel run agent.py --budget 5
    AGENT_BUDGET_USD=5 shekel run agent.py
    
  • Docker & Container Guardrails


    Use shekel run as a Docker entrypoint wrapper. Set AGENT_BUDGET_USD at runtime — no rebuild needed.

    ENTRYPOINT ["shekel", "run", "agent.py"]
    

Previous: v0.2.8

  • Tool Budgets


    Cap agent tool calls and charge per-tool USD. Auto-intercepts LangChain, MCP, CrewAI, and OpenAI Agents SDK. One decorator for plain Python tools.

    with budget(max_tool_calls=50):
        run_agent()
    
    @tool(price=0.005)
    def web_search(query): ...
    
  • Temporal Budgets (v0.2.8)


    Rolling-window spend limits with a string DSL. TemporalBudgetBackend Protocol for community-extensible backends (Redis, Postgres, etc.).

    tb = budget("$5/hr", name="api-tier")
    async with tb:
        await call_llm()
    
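Conceptually, a temporal budget is a rolling-window ledger: charges age out of the window, and "spend" is whatever remains inside it. A stdlib-only sketch of that accounting (not shekel's TemporalBudget implementation):

```python
from collections import deque

class RollingWindow:
    """Toy $-per-window tracker: keep (timestamp, usd) charges, expire old ones."""
    def __init__(self, limit_usd: float, window_s: float):
        self.limit, self.window = limit_usd, window_s
        self.charges: deque = deque()  # (timestamp, usd), oldest first

    def _expire(self, now: float) -> None:
        while self.charges and self.charges[0][0] <= now - self.window:
            self.charges.popleft()

    def spent(self, now: float) -> float:
        self._expire(now)
        return sum(usd for _, usd in self.charges)

    def charge(self, usd: float, now: float) -> None:
        if self.spent(now) + usd > self.limit:
            raise RuntimeError("window budget exceeded")
        self.charges.append((now, usd))

w = RollingWindow(limit_usd=5.0, window_s=3600)   # i.e. "$5/hr"
w.charge(3.0, now=0)
w.charge(1.5, now=1800)
print(w.spent(now=1800))   # 4.5 — both charges still inside the hour
print(w.spent(now=3700))   # 1.5 — the t=0 charge has aged out
```

A backend implementing the TemporalBudgetBackend Protocol would keep this ledger in Redis, Postgres, etc. instead of process memory.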
  • Window Reset Events


    New on_window_reset adapter event, fired lazily at __enter__ time when a TemporalBudget window has expired. New shekel.budget.window_resets_total OTel counter.
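"Lazily" means nothing fires on a timer: the expiry check happens the next time the budget context is entered. A toy sketch of that pattern (the callback name mirrors the event, but the code is illustrative, not shekel's):

```python
class LazyResetWindow:
    """Toy lazy window reset: detect expiry on entry and fire a callback."""
    def __init__(self, window_s: float, on_window_reset):
        self.window_s = window_s
        self.on_window_reset = on_window_reset
        self.window_start: float | None = None

    def enter(self, now: float) -> None:
        if self.window_start is None:
            self.window_start = now            # first window starts here
        elif now - self.window_start >= self.window_s:
            self.on_window_reset()             # fired lazily, only on entry
            self.window_start = now

events = []
w = LazyResetWindow(3600, on_window_reset=lambda: events.append("reset"))
w.enter(now=0)       # starts the window; no reset
w.enter(now=1800)    # still inside the window
w.enter(now=4000)    # window expired -> callback fires now
print(events)        # ['reset']
```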

  • BudgetExceededError enrichment


    retry_after (seconds until window reset) and window_spent now available on BudgetExceededError for temporal budgets.
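retry_after falls out of the same window math: it is the time until the oldest in-window charge ages out and frees headroom. A rough sketch of that calculation (illustrative; shekel computes this for you and attaches it to the error):

```python
def retry_after(oldest_charge_ts: float, window_s: float, now: float) -> float:
    """Seconds until the oldest in-window charge ages out of the window."""
    return max(0.0, oldest_charge_ts + window_s - now)

# Oldest charge at t=100 s in a 1-hour window; at t=1000 s, wait 2700 s.
print(retry_after(oldest_charge_ts=100.0, window_s=3600.0, now=1000.0))  # 2700.0
```

In practice you just read e.retry_after and e.window_spent off the caught BudgetExceededError.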

  • OpenTelemetry Metrics (v0.2.7)


    ShekelMeter emits 9 OTel instruments covering per-call cost, budget utilization, spend rate, fallback activations, auto-cap events, and window resets. Silent no-op when opentelemetry-api is absent.

    pip install shekel[otel]
    
  • Google Gemini (v0.2.6)


    Native adapter for the google-genai SDK — enforce budgets on generate_content and streaming. Pricing bundled for Gemini 2.0 Flash, 2.5 Flash, and 2.5 Pro.

    pip install shekel[gemini]
    
  • HuggingFace Inference API (v0.2.6)


    Native adapter for huggingface-hub — budget enforcement for any model on the HuggingFace Inference API, sync and streaming.

    pip install shekel[huggingface]
    

What's Next?

  • Quick Start Guide


    Get up and running in 5 minutes with step-by-step examples.

  • Usage Guide


    Learn about all the features: enforcement, fallbacks, streaming, and more.

  • API Reference


    Complete documentation of all parameters, properties, and methods.

  • Integrations


    See how to use shekel with LangGraph, CrewAI, and other frameworks.


Supported Models

Built-in pricing for GPT-4o, GPT-4o-mini, o1, Claude 3.5 Sonnet, Claude 3 Haiku, Gemini 1.5, and more.

Install shekel[litellm] to enforce hard spend limits across 100+ providers through LiteLLM's unified interface.

Install shekel[all-models] for 400+ models via tokencost.
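Whatever the pricing source, per-call cost is the same arithmetic: token counts times per-token rates. A sketch of that lookup (the gpt-4o-mini figures below are illustrative, not shekel's bundled tables):

```python
# (input_usd, output_usd) per 1M tokens — illustrative values only
PRICES = {"gpt-4o-mini": (0.15, 0.60)}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

print(f"{call_cost('gpt-4o-mini', 1000, 500):.6f}")  # 0.000450
```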

See full model list →



License

MIT License - see LICENSE for details.