Shekel¶

LLM budget enforcement and cost tracking for Python. One line. Zero config.

with budget(max_usd=1.00):
    run_my_agent()  # raises BudgetExceededError if spend exceeds $1.00

The Story¶

I spent $47 debugging a LangGraph retry loop. The agent kept failing, LangGraph kept retrying, and OpenAI kept charging — all while I slept.

I built shekel so you don't have to learn that lesson yourself.

Features¶

Zero Config

One line of code. No API keys, no external services, no setup.
```
with budget(max_usd=1.00):
    run_agent()
```
Budget Enforcement

Hard caps, soft warnings, or track-only mode. You control the spend.
```
with budget(max_usd=1.00, warn_at=0.8):
    run_agent()
```

Smart Fallback

Automatically switch to cheaper models instead of crashing.

with budget(max_usd=1.00, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}):
    run_agent()

Nested Budgets

Hierarchical tracking for multi-stage workflows.

with budget(max_usd=10, name="workflow"):
    with budget(max_usd=2, name="research"):
        run_research()
    with budget(max_usd=5, name="analysis"):
        run_analysis()

Langfuse Integration

Circuit-break events, per-call spend streaming, and budget hierarchy in Langfuse — see exactly where your budget breaks.

from shekel.integrations.langfuse import LangfuseAdapter

adapter = LangfuseAdapter(client=lf)
AdapterRegistry.register(adapter)
# Automatic cost tracking and budget monitoring!

Framework Agnostic

Works with LangGraph, CrewAI, AutoGen, LlamaIndex, Haystack, and any framework that calls OpenAI, Anthropic, or LiteLLM.
Async & Streaming

Full support for async/await patterns and streaming responses.
```
async with budget(max_usd=1.00):
    await run_async_agent()
```

Quick Start¶

Installation¶

OpenAIAnthropicLiteLLM (100+ providers)BothAll Models (400+)

pip install shekel[openai]

pip install shekel[anthropic]

pip install shekel[litellm]

pip install shekel[all]

pip install shekel[all-models]

Basic Usage¶

from shekel import budget, BudgetExceededError

# Enforce a hard cap
try:
    with budget(max_usd=1.00, warn_at=0.8) as b:
        run_my_agent()
    print(f"Spent ${b.spent:.4f}")
except BudgetExceededError as e:
    print(e)

# Track spend without enforcing a limit
with budget() as b:
    run_my_agent()
print(f"Cost: ${b.spent:.4f}")

# Decorator
from shekel import with_budget

@with_budget(max_usd=0.10)
def call_llm():
    ...

See It In Action¶

import openai
from shekel import budget

client = openai.OpenAI()

with budget(max_usd=0.10) as b:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

print(f"Total cost: ${b.spent:.4f}")
print(f"Remaining: ${b.remaining:.4f}")

Why Shekel?¶

Problem	Solution
Agent retry loops drain your wallet	Hard budget caps stop runaway costs
No visibility into LLM spending	Track every API call automatically
Expensive models blow your budget	Automatic fallback to cheaper models
Need to enforce spend limits	Context manager raises on budget exceeded
Multi-step workflows need session budgets	Budgets always accumulate across runs

What's New in v0.2.6¶

Smart Fallback

fallback={"at_pct": 0.8, "model": "gpt-4o-mini"} — automatically switch to a cheaper model instead of crashing. Fallback shares the same max_usd budget.
Early Warning Callbacks

on_warn callback fires at warn_at threshold before the budget is exhausted.
Call-Count Budgets

max_llm_calls=50 caps by number of LLM API calls, combinable with max_usd.
LiteLLM Support

Native adapter for LiteLLM — hard budget caps and circuit-breaking across 100+ providers (Gemini, Cohere, Ollama, Azure, Bedrock…). One limit, every provider.
```
pip install shekel[litellm]
```
Google Gemini

Native adapter for the google-genai SDK — enforce budgets on generate_content and streaming. Pricing bundled for Gemini 2.0 Flash, 2.5 Flash, and 2.5 Pro.
```
pip install shekel[gemini]
```
HuggingFace Inference API

Native adapter for huggingface-hub — budget enforcement for any model on the HuggingFace Inference API, sync and streaming.
```
pip install shekel[huggingface]
```

What's Next?¶

Quick Start Guide

Get up and running in 5 minutes with step-by-step examples.
Usage Guide

Learn about all the features: enforcement, fallbacks, streaming, and more.
API Reference

Complete documentation of all parameters, properties, and methods.
Integrations

See how to use shekel with LangGraph, CrewAI, and other frameworks.

Supported Models¶

Built-in pricing for GPT-4o, GPT-4o-mini, o1, Claude 3.5 Sonnet, Claude 3 Haiku, Gemini 1.5, and more.

Install shekel[litellm] to enforce hard spend limits across 100+ providers through LiteLLM's unified interface.

Install shekel[all-models] for 400+ models via tokencost.

See full model list →

Community¶

GitHub: arieradle/shekel
PyPI: pypi.org/project/shekel
Issues: github.com/arieradle/shekel/issues
Contributing: See our guide

License¶

MIT License - see LICENSE for details.