Shekel

LLM budget enforcement and cost tracking for Python. One line. Zero config.

with budget(max_usd=1.00):
    run_my_agent()  # raises BudgetExceededError if spend exceeds $1.00

The Story

I spent $47 debugging a LangGraph retry loop. The agent kept failing, LangGraph kept retrying, and OpenAI kept charging — all while I slept.

I built shekel so you don't have to learn that lesson yourself.


Features

  • Zero Config


    One line of code. No API keys, no external services, no setup.

    with budget(max_usd=1.00):
        run_agent()
    
  • Budget Enforcement


    Hard caps, soft warnings, or track-only mode. You control the spend.

    with budget(max_usd=1.00, warn_at=0.8):
        run_agent()
    
  • Smart Fallback


    Automatically switch to cheaper models instead of crashing.

    with budget(max_usd=1.00, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}):
        run_agent()
    
  • Nested Budgets


    Hierarchical tracking for multi-stage workflows.

    with budget(max_usd=10, name="workflow"):
        with budget(max_usd=2, name="research"):
            run_research()
        with budget(max_usd=5, name="analysis"):
            run_analysis()
    
  • Langfuse Integration


    Circuit-break events, per-call spend streaming, and budget hierarchy in Langfuse — see exactly where your budget breaks.

    from shekel.integrations.langfuse import LangfuseAdapter
    
    adapter = LangfuseAdapter(client=lf)  # lf is your configured Langfuse client
    AdapterRegistry.register(adapter)
    # Automatic cost tracking and budget monitoring!
    
  • Framework Agnostic


    Works with LangGraph, CrewAI, AutoGen, LlamaIndex, Haystack, and any framework that calls OpenAI, Anthropic, or LiteLLM.

  • Async & Streaming


    Full support for async/await patterns and streaming responses.

    async with budget(max_usd=1.00):
        await run_async_agent()
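The enforcement model behind the features above (hard cap, warn threshold, track-only mode) can be sketched in plain Python. This is an illustrative toy, not shekel's implementation; the names `SimpleBudget` and `BudgetExceeded` are hypothetical stand-ins.

```python
# Toy sketch (not shekel's code): a context manager that accumulates
# spend, warns once at a threshold, and raises on a hard cap.
class BudgetExceeded(Exception):
    pass

class SimpleBudget:
    def __init__(self, max_usd=None, warn_at=None):
        self.max_usd, self.warn_at = max_usd, warn_at
        self.spent = 0.0
        self._warned = False

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # never swallow exceptions from the body

    def record(self, cost):
        self.spent += cost
        if self.max_usd is None:
            return  # track-only mode: no cap, no warnings
        if (self.warn_at and not self._warned
                and self.spent >= self.warn_at * self.max_usd):
            self._warned = True
            print(f"warning: ${self.spent:.2f} of ${self.max_usd:.2f} spent")
        if self.spent > self.max_usd:
            raise BudgetExceeded(f"spent ${self.spent:.4f} > cap ${self.max_usd:.2f}")

with SimpleBudget(max_usd=1.00, warn_at=0.8) as b:
    b.record(0.85)  # crosses the 80% warn threshold, under the cap
print(f"spent ${b.spent:.2f}")
```

The real library hooks `record`-style accounting into the provider SDKs automatically; the sketch only shows the cap/warn state machine.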
    

Quick Start

Installation

pip install shekel[openai]       # OpenAI
pip install shekel[anthropic]    # Anthropic
pip install shekel[litellm]      # 100+ providers via LiteLLM
pip install shekel[gemini]       # Google Gemini (google-genai SDK)
pip install shekel[huggingface]  # HuggingFace Inference API
pip install shekel[all]          # all of the above
pip install shekel[all-models]   # pricing data for 400+ models via tokencost

Basic Usage

from shekel import budget, BudgetExceededError

# Enforce a hard cap
try:
    with budget(max_usd=1.00, warn_at=0.8) as b:
        run_my_agent()
    print(f"Spent ${b.spent:.4f}")
except BudgetExceededError as e:
    print(e)

# Track spend without enforcing a limit
with budget() as b:
    run_my_agent()
print(f"Cost: ${b.spent:.4f}")

# Decorator
from shekel import with_budget

@with_budget(max_usd=0.10)
def call_llm():
    ...
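The decorator form is the context manager applied around a single call. As a generic sketch of that pattern (hypothetical `SimpleBudget` and `with_simple_budget` names, not shekel's internals):

```python
import functools

# Illustrative decorator-over-context-manager pattern (not shekel's code).
class SimpleBudget:
    def __init__(self, max_usd):
        self.max_usd = max_usd
        self.spent = 0.0
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False

def with_simple_budget(**budget_kwargs):
    def decorator(fn):
        @functools.wraps(fn)  # preserve the wrapped function's name/docstring
        def wrapper(*args, **kwargs):
            with SimpleBudget(**budget_kwargs):
                return fn(*args, **kwargs)
        return wrapper
    return decorator

@with_simple_budget(max_usd=0.10)
def call_llm():
    return "ok"
```

Each invocation of the decorated function gets its own budget scope, which is why the decorator suits per-call caps while the context manager suits whole-workflow caps.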

See It In Action

import openai
from shekel import budget

client = openai.OpenAI()

with budget(max_usd=0.10) as b:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

print(f"Total cost: ${b.spent:.4f}")
print(f"Remaining: ${b.remaining:.4f}")

Why Shekel?

  • Agent retry loops drain your wallet → hard budget caps stop runaway costs.
  • No visibility into LLM spending → every API call is tracked automatically.
  • Expensive models blow your budget → automatic fallback to cheaper models.
  • Need to enforce spend limits → the context manager raises BudgetExceededError when a cap is hit.
  • Multi-step workflows need per-stage budgets → nested budgets roll spend up the hierarchy.
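The nested-budget rollup described in Features can be sketched generically: a charge recorded against a child budget also counts against every ancestor. This is a toy model, not shekel's implementation.

```python
# Toy sketch of hierarchical budget accumulation (not shekel's code):
# recording spend on a child propagates the charge to its parent chain.
class Node:
    def __init__(self, max_usd, parent=None):
        self.max_usd = max_usd
        self.parent = parent
        self.spent = 0.0

    def record(self, cost):
        self.spent += cost
        if self.parent is not None:
            self.parent.record(cost)  # roll the charge up the hierarchy

workflow = Node(10)
research = Node(2, parent=workflow)
analysis = Node(5, parent=workflow)

research.record(1.5)
analysis.record(3.0)
print(workflow.spent)  # 4.5 -- both stages count against the workflow cap
```

This is why a stage can stay under its own cap while still exhausting the outer workflow budget.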

What's New in v0.2.6

  • Smart Fallback


    fallback={"at_pct": 0.8, "model": "gpt-4o-mini"} — automatically switch to a cheaper model instead of crashing. Fallback shares the same max_usd budget.

  • Early Warning Callbacks


    on_warn callback fires at warn_at threshold before the budget is exhausted.

  • Call-Count Budgets


    max_llm_calls=50 caps by number of LLM API calls, combinable with max_usd.

  • LiteLLM Support


    Native adapter for LiteLLM — hard budget caps and circuit-breaking across 100+ providers (Gemini, Cohere, Ollama, Azure, Bedrock…). One limit, every provider.

    pip install shekel[litellm]
    
  • Google Gemini


    Native adapter for the google-genai SDK — enforce budgets on generate_content and streaming. Pricing bundled for Gemini 2.0 Flash, 2.5 Flash, and 2.5 Pro.

    pip install shekel[gemini]
    
  • HuggingFace Inference API


    Native adapter for huggingface-hub — budget enforcement for any model on the HuggingFace Inference API, sync and streaming.

    pip install shekel[huggingface]
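The Smart Fallback semantics above (switch models at a spend percentage while charging the same budget) reduce to a simple routing rule. A toy illustration, not shekel's code, with made-up per-call costs:

```python
# Toy sketch of fallback-at-threshold routing (not shekel's code):
# once spend crosses at_pct of max_usd, route to the cheaper model,
# still drawing from the same shared budget.
def choose_model(spent, max_usd, primary, fallback_model, at_pct):
    if spent >= at_pct * max_usd:
        return fallback_model
    return primary

spent, max_usd = 0.0, 1.00
costs = [0.30, 0.30, 0.30, 0.05]  # hypothetical per-call costs
used = []
for cost in costs:
    model = choose_model(spent, max_usd, "gpt-4o", "gpt-4o-mini", at_pct=0.8)
    used.append(model)
    spent += cost  # fallback calls draw from the same max_usd budget

print(used)
```

The last call lands on the cheaper model because cumulative spend has crossed 80% of the cap by then; nothing crashes, and the cap still applies.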
    

What's Next?

  • Quick Start Guide


    Get up and running in 5 minutes with step-by-step examples.

  • Usage Guide


    Learn about all the features: enforcement, fallbacks, streaming, and more.

  • API Reference


    Complete documentation of all parameters, properties, and methods.

  • Integrations


    See how to use shekel with LangGraph, CrewAI, and other frameworks.


Supported Models

Built-in pricing for GPT-4o, GPT-4o-mini, o1, Claude 3.5 Sonnet, Claude 3 Haiku, Gemini 1.5, and more.

Install shekel[litellm] to enforce hard spend limits across 100+ providers through LiteLLM's unified interface.

Install shekel[all-models] for 400+ models via tokencost.

See full model list →
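Per-call cost accounting of this kind boils down to token counts multiplied by per-token prices. A toy illustration; the model names and USD-per-1M-token figures below are placeholders, not shekel's bundled pricing tables:

```python
# Toy pricing lookup; values are illustrative placeholders, not real prices.
PRICES = {  # model -> (input, output) USD per 1M tokens
    "example-large": (2.50, 10.00),
    "example-mini": (0.15, 0.60),
}

def call_cost(model, prompt_tokens, completion_tokens):
    inp, out = PRICES[model]
    return (prompt_tokens * inp + completion_tokens * out) / 1_000_000

cost = call_cost("example-mini", prompt_tokens=1_000, completion_tokens=500)
print(f"${cost:.6f}")  # 1000*0.15/1e6 + 500*0.60/1e6 = $0.000450
```

A budget tracker only needs usage counts from each API response plus a table like this, which is why adding a provider is mostly a matter of shipping its pricing data.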


License

MIT License - see LICENSE for details.