API Reference
Complete reference for all shekel APIs.
budget()
Context manager for tracking and enforcing LLM API budgets.
Signature
def budget(
    max_usd: float | None = None,
    warn_at: float | None = None,
    on_warn: Callable[[float, float], None] | None = None,
    price_per_1k_tokens: dict[str, float] | None = None,
    fallback: dict[str, Any] | None = None,
    on_fallback: Callable[[float, float, str], None] | None = None,
    name: str | None = None,
    max_llm_calls: int | None = None,
) -> Budget
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_usd | float \| None | None | Maximum spend in USD. None = track-only mode (no enforcement). Must be positive if set. |
| warn_at | float \| None | None | Fraction (0.0-1.0) of max_usd at which to warn. |
| on_warn | Callable[[float, float], None] \| None | None | Callback fired at warn_at threshold. Receives (spent, limit). |
| price_per_1k_tokens | dict[str, float] \| None | None | Override pricing: {"input": X, "output": Y} per 1k tokens. |
| fallback | dict[str, Any] \| None | None | Dict specifying when and what model to switch to: {"at_pct": 0.8, "model": "gpt-4o-mini"}. at_pct is the fraction of max_usd at which to switch; model is the fallback model (same provider only). Fallback shares the same max_usd budget; there is no separate ceiling. |
| on_fallback | Callable[[float, float, str], None] \| None | None | Callback on fallback switch. Receives (spent, limit, fallback_model). |
| name | str \| None | None | Budget name for debugging and cost attribution. Required when nesting budgets. |
| max_llm_calls | int \| None | None | Maximum number of LLM API calls. Raises BudgetExceededError when exceeded. Can be combined with max_usd. |
Returns
Budget object that can be used as a context manager.
Examples
Track-Only Mode
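A minimal sketch of track-only usage, based on the `max_usd=None` behaviour described in the parameter table. `run_agent` is a hypothetical placeholder for your own LLM-calling code, and the import is deferred into the function so the snippet can be loaded where shekel is not installed.

```python
from typing import Callable


def track_spend(run_agent: Callable[[], None]) -> float:
    """Run `run_agent` (hypothetical placeholder) and return the USD spent."""
    from shekel import budget  # deferred so this module loads without shekel

    # No max_usd is given, so spend is tracked but never enforced.
    with budget() as b:
        run_agent()
    return b.spent
```

In this mode the `remaining` and `limit` properties are documented to be `None`.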
Budget Enforcement
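A minimal sketch of a hard cap, assuming only the documented behaviour that exceeding the limit raises `BudgetExceededError`. `task` is a hypothetical placeholder for your own code, and `run_capped` is an illustrative helper name, not part of shekel.

```python
from typing import Callable, Optional, Tuple


def run_capped(
    task: Callable[[], None], max_usd: float = 1.00
) -> Tuple[float, Optional[float]]:
    """Run `task` under a hard USD cap and return (spent, remaining)."""
    from shekel import budget  # deferred so this module loads without shekel

    # If the cap is exceeded, BudgetExceededError propagates to the caller.
    with budget(max_usd=max_usd) as b:
        task()
    return b.spent, b.remaining
```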
Early Warning
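A minimal sketch of the warning threshold, assuming that `warn_at` is applied as a plain fraction of `max_usd`, as the parameter table states. `warn_threshold_usd` and `run_with_warning` are hypothetical helper names. What shekel does by default when no `on_warn` callback is supplied is not specified on this page.

```python
from typing import Callable


def warn_threshold_usd(max_usd: float, warn_at: float) -> float:
    """USD amount at which the warning fires, assuming max_usd * warn_at."""
    if not 0.0 <= warn_at <= 1.0:
        raise ValueError("warn_at must be a fraction between 0.0 and 1.0")
    return max_usd * warn_at


def run_with_warning(task: Callable[[], None]) -> None:
    """Run `task` (hypothetical placeholder) with a warning at 80% of $5.00."""
    from shekel import budget  # deferred so this module loads without shekel

    # Under the assumption above, the warning fires once spend reaches $4.00.
    with budget(max_usd=5.00, warn_at=0.8):
        task()
```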
Custom Warning Callback
def my_handler(spent: float, limit: float):
    print(f"Alert: ${spent:.2f} / ${limit:.2f}")

with budget(max_usd=10.00, warn_at=0.8, on_warn=my_handler):
    run_agent()
Model Fallback
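A minimal sketch of fallback with a callback, using the `fallback` dict shape and the `(spent, limit, fallback_model)` callback arguments from the parameter table. `record_switch`, `switch_log`, and `run_with_fallback` are hypothetical names, and `task` stands in for your own code.

```python
from typing import Callable, List, Tuple

# Each entry is (spent, limit, fallback_model), as passed to on_fallback.
switch_log: List[Tuple[float, float, str]] = []


def record_switch(spent: float, limit: float, fallback_model: str) -> None:
    """on_fallback callback: remember when and to which model we switched."""
    switch_log.append((spent, limit, fallback_model))


def run_with_fallback(task: Callable[[], None]) -> bool:
    """Run `task`; return True if the fallback model was activated."""
    from shekel import budget  # deferred so this module loads without shekel

    # at_pct=0.8 of max_usd=1.00 means the switch is requested at $0.80.
    with budget(
        max_usd=1.00,
        fallback={"at_pct": 0.8, "model": "gpt-4o-mini"},
        on_fallback=record_switch,
    ) as b:
        task()
    return b.model_switched
```

The fallback model shares the same `max_usd`, so a run can still end in `BudgetExceededError` after switching.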
Call-Count Budget
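A minimal sketch of a call-count cap, assuming only the documented behaviour that `max_llm_calls` raises `BudgetExceededError` when exceeded. `run_with_call_cap` is a hypothetical helper name and `task` is a placeholder for your own code.

```python
from typing import Callable


def run_with_call_cap(task: Callable[[], None], max_calls: int = 50) -> bool:
    """Run `task`; return False if the LLM call limit was exceeded."""
    from shekel import budget, BudgetExceededError  # deferred import

    try:
        # No max_usd here: only the number of LLM API calls is capped.
        with budget(max_llm_calls=max_calls):
            task()
    except BudgetExceededError:
        return False
    return True
```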
Combined USD and Call-Count Budget
with budget(max_usd=1.00, max_llm_calls=20, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}) as b:
    run_agent()
Accumulating Budget
# Budget variables accumulate across uses
session = budget(max_usd=10.00, name="session")

with session:
    process_batch_1()

with session:
    process_batch_2()  # Accumulates automatically

print(f"Total: ${session.spent:.2f}")
Custom Pricing
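A minimal sketch of custom pricing, using the `{"input": X, "output": Y}` shape from the parameter table. The prices are made-up illustration values, and `estimate_cost_usd` assumes cost is linear in tokens at the per-1k rates; how shekel computes cost internally is not shown on this page.

```python
from typing import Callable, Dict

# Hypothetical prices in USD per 1k tokens, for illustration only.
CUSTOM_PRICES: Dict[str, float] = {"input": 0.002, "output": 0.006}


def estimate_cost_usd(
    input_tokens: int, output_tokens: int, prices: Dict[str, float]
) -> float:
    """Estimated cost, assuming linear per-1k-token pricing."""
    return (input_tokens / 1000) * prices["input"] + (
        output_tokens / 1000
    ) * prices["output"]


def run_with_custom_pricing(task: Callable[[], None]) -> float:
    """Run `task` (hypothetical placeholder) with overridden pricing."""
    from shekel import budget  # deferred so this module loads without shekel

    with budget(max_usd=1.00, price_per_1k_tokens=CUSTOM_PRICES) as b:
        task()
    return b.spent
```

Under that assumption, a call with 1,500 input tokens and 500 output tokens would be estimated at 1.5 × 0.002 + 0.5 × 0.006 = $0.006.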
Nested Budgets
with budget(max_usd=10.00, name="workflow") as workflow:
    # Research stage: $2 budget
    with budget(max_usd=2.00, name="research"):
        run_research()

    # Analysis stage: $5 budget
    with budget(max_usd=5.00, name="analysis"):
        run_analysis()

    # Parent can spend too
    finalize()

print(f"Total: ${workflow.spent:.2f}")
print(workflow.tree())
See full nested budgets guide →
Budget Class
The budget context manager object.
Properties
| Property | Type | Description |
|---|---|---|
| spent | float | Total USD spent in this budget context (includes children in nested budgets). |
| remaining | float \| None | Remaining USD budget (based on effective limit), or None if track-only mode. |
| limit | float \| None | Effective budget limit (auto-capped if nested), or None if track-only. |
| name | str \| None | Budget name. |
| model_switched | bool | True if fallback was activated. |
| switched_at_usd | float \| None | USD spent when fallback occurred, or None. |
| fallback_spent | float | USD spent on the fallback model. |
Nested Budget Properties
| Property | Type | Description |
|---|---|---|
| parent | Budget \| None | Parent budget, or None if root budget. |
| children | list[Budget] | List of child budgets created under this budget. |
| active_child | Budget \| None | Currently active child budget, or None. |
| full_name | str | Hierarchical path name (e.g., "workflow.research.validation"). |
| spent_direct | float | Direct spend by this budget only (excluding children). |
| spent_by_children | float | Sum of spend from all child budgets. |
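The relationship between `spent`, `spent_direct`, and `spent_by_children` can be sketched with a small standard-library model. `SpendNode` is a hypothetical stand-in, not shekel's `Budget` class; it assumes `spent` equals direct spend plus the total spend of all children, which matches the figures in the `tree()` example on this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpendNode:
    """Hypothetical stand-in for a budget in a nested hierarchy."""

    name: str
    spent_direct: float = 0.0
    children: List["SpendNode"] = field(default_factory=list)

    @property
    def spent_by_children(self) -> float:
        # Each child's total already includes its own descendants.
        return sum(child.spent for child in self.children)

    @property
    def spent(self) -> float:
        return self.spent_direct + self.spent_by_children


# Figures taken from the tree() example: workflow spends nothing directly.
workflow = SpendNode(
    "workflow",
    spent_direct=0.0,
    children=[SpendNode("research", 3.20), SpendNode("analysis", 9.30)],
)
```

Here `workflow.spent` comes to 12.50 even though `workflow.spent_direct` is 0.0, because the total is rolled up from the two children.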
Methods
reset()
Reset spend tracking to zero. Only works when budget is not active.
session = budget(max_usd=10.00, name="session")

with session:
    process()

session.reset()  # Back to $0

with session:
    process_again()
Raises: RuntimeError if called inside an active with block.
summary()
Return formatted spend summary as a string.
Returns: Multi-line string with formatted table of calls, costs, and totals.
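A minimal usage sketch, assuming only that `summary()` returns a printable string as described above. `print_summary` is a hypothetical helper name and `task` is a placeholder for your own code.

```python
from typing import Callable


def print_summary(task: Callable[[], None]) -> str:
    """Run `task` under a budget, then print and return the spend summary."""
    from shekel import budget  # deferred so this module loads without shekel

    with budget(max_usd=5.00) as b:
        task()

    report = b.summary()  # multi-line, human-readable table
    print(report)
    return report
```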
summary_data()
Return structured spend data as a dict.
with budget() as b:
    run_agent()

data = b.summary_data()
print(data["total_spent"])
print(data["total_calls"])
print(data["by_model"])
Returns: Dictionary with keys:
- total_spent: Total USD
- limit: Budget limit
- model_switched: Boolean
- switched_at_usd: Switch point
- fallback_model: Fallback model name
- fallback_spent: Cost on fallback
- total_calls: Number of API calls
- calls: List of all call records
- by_model: Aggregated stats per model
tree()
Return visual hierarchy of budget tree with spend breakdown.
with budget(max_usd=20, name="workflow") as w:
    with budget(max_usd=5, name="research"):
        research()
    with budget(max_usd=10, name="analysis"):
        analyze()

print(w.tree())
# workflow: $12.50 / $20.00 (direct: $0.00)
#   research: $3.20 / $5.00 (direct: $3.20)
#   analysis: $9.30 / $10.00 (direct: $9.30)
Returns: Multi-line string with indented tree structure showing:
- Budget name and hierarchy
- Total spend / limit
- Direct spend (excluding children)
- [ACTIVE] marker for currently active children
@with_budget
Decorator that wraps functions with a budget context.
Signature
def with_budget(
    max_usd: float | None = None,
    warn_at: float | None = None,
    on_warn: Callable[[float, float], None] | None = None,
    price_per_1k_tokens: dict[str, float] | None = None,
    fallback: dict[str, Any] | None = None,
    on_fallback: Callable[[float, float, str], None] | None = None,
    max_llm_calls: int | None = None,
)
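Parameters
Same as budget(), except that the signature above does not include name. The decorator creates a fresh budget on every call, so spend does not accumulate across calls.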
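One way a decorator can give each call its own budget context, for both sync and async functions, sketched with the standard library only. `fake_budget`, `with_fake_budget`, and `created` are hypothetical stand-ins for illustration; this is not shekel's implementation.

```python
import asyncio
import functools
import inspect
from contextlib import contextmanager

created = []  # one entry per context opened, for illustration only


@contextmanager
def fake_budget(**kwargs):
    """Hypothetical stand-in for a budget context manager."""
    state = {"kwargs": kwargs, "spent": 0.0}
    created.append(state)
    yield state


def with_fake_budget(**kwargs):
    """Decorator factory: open a fresh fake_budget around every call."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kw):
                with fake_budget(**kwargs):  # new context per awaited call
                    return await func(*args, **kw)

            return async_wrapper

        @functools.wraps(func)
        def wrapper(*args, **kw):
            with fake_budget(**kwargs):  # new context per call
                return func(*args, **kw)

        return wrapper

    return decorator


@with_fake_budget(max_usd=0.50)
def greet(name: str) -> str:
    return f"hello {name}"


@with_fake_budget(max_usd=0.50)
async def greet_async(name: str) -> str:
    return f"hello {name}"
```

Calling `greet` twice appends two separate entries to `created`, which is the "fresh budget per call" behaviour in miniature; the async variant is driven with `asyncio.run(greet_async("c"))`.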
Examples
Basic Decorator
from openai import OpenAI  # assumed provider; the original leaves `client` undefined
from shekel import with_budget

client = OpenAI()  # reads OPENAI_API_KEY from the environment

@with_budget(max_usd=0.50)
def generate_summary(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content
Async Decorator
@with_budget(max_usd=0.50)
async def async_generate(prompt: str) -> str:
    # `client` must be an async client here (for OpenAI, openai.AsyncOpenAI)
    response = await client.chat.completions.create(...)
    return response.choices[0].message.content
With All Parameters
@with_budget(
    max_usd=2.00,
    warn_at=0.8,
    fallback={"at_pct": 0.8, "model": "gpt-4o-mini"},
    on_warn=my_warning_handler,
)
def process_request(data: dict) -> str:
    ...
BudgetExceededError
Exception raised when budget limit is exceeded.
Attributes
| Attribute | Type | Description |
|---|---|---|
| spent | float | Total USD spent when limit was hit. |
| limit | float | The configured max_usd. |
| model | str | Model that triggered the error. |
| tokens | dict[str, int] | Token counts: {"input": N, "output": N}. |
Example
from shekel import budget, BudgetExceededError

try:
    with budget(max_usd=0.50):
        expensive_operation()
except BudgetExceededError as e:
    print(f"Spent: ${e.spent:.4f}")
    print(f"Limit: ${e.limit:.2f}")
    print(f"Model: {e.model}")
    print(f"Tokens: {e.tokens['input']} in, {e.tokens['output']} out")
Type Signatures
For type checking with mypy, pyright, etc.:
from shekel import budget, with_budget, BudgetExceededError
from typing import Callable

# Budget context manager
b: budget = budget(max_usd=1.00)

# Decorator
@with_budget(max_usd=0.50)
def my_func() -> str:
    ...

# Callbacks
def warn_callback(spent: float, limit: float) -> None:
    ...

def fallback_callback(spent: float, limit: float, fallback: str) -> None:
    ...
Next Steps
- Basic Usage - Learn the fundamentals
- Nested Budgets - Hierarchical tracking for multi-stage workflows
- Budget Enforcement - Hard caps and warnings
- Fallback Models - Automatic model switching