Ref-counted install/restore. apply_patches() increments refcount and on first entry calls ADAPTER_REGISTRY.install_all(). remove_patches() decrements and on last exit calls ADAPTER_REGISTRY.remove_all(). Provides shared wrapper logic: get active budget, apply fallback model rewrite, call adapter extract_tokens / wrap_stream, call _pricing.calculate_cost, record on budget, emit observability.
Implement ProviderAdapter for each SDK. Patch the relevant completion/create methods; in the wrapper delegate to _patch helpers for token extraction, cost calculation, and budget recording.
Nested with budget(): blocks only patch once (refcount). All registered adapters are patched together; optional adapters (LiteLLM, Gemini, HuggingFace) register only if their dependencies are installed.
@tool(price=...) decorator and tool() wrapper for callables. On invocation: get_active_budget(); if present, check tool limits, run function, then _record_tool_call. Framework-specific tool interception (LangChain, MCP, CrewAI, OpenAI Agents) is implemented inside the respective provider or integration layers so that agent tool dispatches are counted and capped.
Click entrypoint shekel. Commands: estimate, models, run. run loads config from --budget-file (TOML) and env (AGENT_BUDGET_USD, etc.), builds budget kwargs, then delegates to run logic.
Executes the user script (e.g. runpy.run_path or subprocess) inside a budget context built from CLI/config. Handles exit codes and optional JSON output for spend summary.