Quick Start¶

This guide will get you up and running with shekel in 5 minutes.

Step 1: Install Shekel¶

Choose the installation that matches your LLM provider:

pip install shekel[openai]       # For OpenAI
pip install shekel[anthropic]    # For Anthropic
pip install shekel[litellm]      # For 100+ providers via LiteLLM
pip install shekel[all]          # For OpenAI + Anthropic + LiteLLM

Step 2: Import and Use¶

from shekel import budget, BudgetExceededError

That's it! No API keys, no configuration files, no external services.

Step 3: Track Your First Call¶

OpenAI Example¶

import openai
from shekel import budget

client = openai.OpenAI()  # Your API key from environment

# Track cost without enforcing a limit
with budget() as b:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello! How are you?"}],
    )
    print(response.choices[0].message.content)

print(f"That cost: ${b.spent:.4f}")
# Output: That cost: $0.0002

Anthropic Example¶

import anthropic
from shekel import budget

client = anthropic.Anthropic()  # Your API key from environment

with budget() as b:
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=100,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.content[0].text)

print(f"That cost: ${b.spent:.4f}")

Step 4: Enforce a Budget¶

Add a hard cap to prevent runaway costs:

from shekel import budget, BudgetExceededError

try:
    with budget(max_usd=0.50) as b:
        run_my_agent()  # Your agent code here
    print(f"Success! Spent: ${b.spent:.4f}")
except BudgetExceededError as e:
    print(f"Budget exceeded: {e}")
    print(f"Spent: ${e.spent:.4f} / ${e.limit:.2f}")

Step 5: Add Early Warnings¶

Get warned before you hit the limit:

with budget(max_usd=1.00, warn_at=0.8) as b:
    run_my_agent()
# Prints warning at $0.80 (80% of $1.00)

Common Patterns¶

Pattern 1: Track-Only Mode¶

No enforcement, just track spending:

with budget() as b:
    run_my_agent()
print(f"Total cost: ${b.spent:.4f}")

Pattern 2: Hard Cap with Early Warning¶

Enforce a limit with advance warning:

with budget(max_usd=2.00, warn_at=0.8) as b:
    run_my_agent()
# Warns at $1.60, raises at $2.00

Pattern 3: Fallback to Cheaper Model¶

Switch models instead of crashing:

with budget(max_usd=1.00, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}) as b:
    response = client.chat.completions.create(
        model="gpt-4o",  # Starts with expensive model
        messages=[{"role": "user", "content": "Hello"}],
    )
# Automatically switches to gpt-4o-mini at 80% of $1.00 ($0.80)

if b.model_switched:
    print(f"Switched to fallback at ${b.switched_at_usd:.4f}")

Pattern 4: Nested Budgets¶

Control costs for multi-stage workflows:

with budget(max_usd=10.00, name="workflow") as workflow:
    # Research phase: $2 budget
    with budget(max_usd=2.00, name="research"):
        research_results = search_and_analyze()

    # Processing phase: $5 budget
    with budget(max_usd=5.00, name="processing"):
        processed = process_results(research_results)

    # Finalization (parent budget)
    final = finalize(processed)

print(f"Total cost: ${workflow.spent:.2f}")
print(workflow.tree())
# workflow: $7.20 / $10.00 (direct: $0.20)
#   research: $2.00 / $2.00 (direct: $2.00)
#   processing: $5.00 / $5.00 (direct: $5.00)

Why this matters: - Each stage has its own budget limit - Children auto-capped to parent's remaining budget - Clear cost attribution per stage - Perfect for complex multi-stage agents

Learn more about nested budgets →

Pattern 5: Accumulating Sessions¶

Track spending across multiple runs:

session = budget(max_usd=5.00, name="session")

# First run
with session:
    process_batch_1()
print(f"After batch 1: ${session.spent:.4f}")

# Second run - spend accumulates
with session:
    process_batch_2()
print(f"After batch 2: ${session.spent:.4f}")

# Third run
with session:
    process_batch_3()
print(f"Total session spend: ${session.spent:.4f}")

Note: Budget variables always accumulate across uses.

Pattern 6: Decorator¶

Wrap functions with a budget:

from shekel import with_budget

@with_budget(max_usd=0.10)
def generate_summary(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

# Budget enforced on every call
summary = generate_summary("Long text here...")

Pattern 6: Async Support¶

Full async/await support:

async def process_items():
    async with budget(max_usd=1.00) as b:
        for item in items:
            response = await client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": item}],
            )
            await process(response)
    print(f"Processed {len(items)} items for ${b.spent:.4f}")

Pattern 7: Streaming¶

Budget tracking works with streaming:

with budget(max_usd=0.50) as b:
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Count to 10"}],
        stream=True,
    )
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

print(f"\nStreaming cost: ${b.spent:.4f}")

Working with Frameworks¶

Shekel works automatically with any framework that uses OpenAI, Anthropic, or LiteLLM under the hood.

LiteLLM¶

Track costs across 100+ providers with a single adapter:

import litellm
from shekel import budget

with budget(max_usd=0.50) as b:
    response = litellm.completion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
print(f"Cost: ${b.spent:.4f}")

LangGraph¶

Shekel works with LangGraph out of the box — just wrap with budget():

from shekel import budget

# Your graph definition here
app = graph.compile()

with budget(max_usd=0.50, name="my-graph") as b:
    result = app.invoke({"question": "What is 2+2?"})
print(f"Graph execution cost: ${b.spent:.4f}")

CrewAI¶

from crewai import Agent, Task, Crew
from shekel import budget

# Your agents and tasks here
crew = Crew(agents=[agent], tasks=[task])

with budget(max_usd=2.00) as b:
    result = crew.kickoff()
print(f"Crew execution cost: ${b.spent:.4f}")

Viewing Spend Summary¶

Get a detailed breakdown of your spending:

with budget(max_usd=5.00) as b:
    run_my_agent()

# Print formatted summary
print(b.summary())

Output:

┌─ Shekel Budget Summary ────────────────────────────────────┐
│ Total: $1.2340  Limit: $5.00  Calls: 15  Status: OK
├────────────────────────────────────────────────────────────┤
│  #    Model                        Input  Output      Cost
│  ────────────────────────────────────────────────────────
│  1    gpt-4o-mini                  1,200     300  $0.0003
│  2    gpt-4o-mini                  1,500     450  $0.0004
│  ...
├────────────────────────────────────────────────────────────┤
│  gpt-4o-mini: 15 calls  $1.2340
└────────────────────────────────────────────────────────────┘

CLI Tools¶

Estimate costs before making API calls:

# Estimate cost
shekel estimate --model gpt-4o --input-tokens 1000 --output-tokens 500
# Model:          gpt-4o
# Input tokens:   1,000
# Output tokens:  500
# Estimated cost: $0.007500

# List all supported models
shekel models

# Filter by provider
shekel models --provider openai
shekel models --provider anthropic

Next Steps¶

Now that you've seen the basics, dive deeper:

Nested Budgets - Hierarchical tracking for multi-stage workflows
Budget Enforcement - Learn about hard caps, warnings, and callbacks
Fallback Models - Automatic model switching
Streaming - Budget tracking for streaming responses
API Reference - Complete API documentation
Integrations - Framework-specific guides

Common Questions¶

Does shekel require API keys?¶

No. Shekel uses your existing OpenAI or Anthropic API keys. It doesn't require any additional authentication.

Does shekel send my data anywhere?¶

No. Shekel works entirely locally by monkey-patching the OpenAI and Anthropic SDKs. No data leaves your system.

What happens when I hit the budget limit?¶

By default, shekel raises a BudgetExceededError. You can catch this exception or use the fallback parameter to automatically switch to a cheaper model instead.

Can I use shekel with my custom models?¶

Yes! Use the price_per_1k_tokens parameter to provide custom pricing. See Extending Shekel for details.

Does shekel work with streaming?¶

Yes! Shekel fully supports streaming responses from both OpenAI and Anthropic.