AgentGuard is PRYSM’s spend-control layer. It turns “please don’t let the agent burn $50 on a typo” into a declarative guarantee: per-request cost caps, monthly budgets, and hard blocks on models you never want an agent to reach for.
AgentGuard settings are guardrails: in routing precedence they run after model selection and always win over routing preferences. Neither a routing rule nor a hard model lock can override a cost cap or a compliance block.

Per-request cost caps

Set max_cost_per_request in your BRAIN.md (or pass it inline). If the estimated cost of the model PRYSM would pick exceeds the cap, PRYSM downgrades the request to a budget model (deepseek-v3.2) instead of failing.
BRAIN.md
max_cost_per_request: 0.005   # USD
How the estimate works. PRYSM compares a fast pre-flight estimate against your cap: (input_price + output_price) × 0.001, i.e. the blended per-MTok price scaled to a ~1K-token reference. The estimate is deliberately conservative and computed before the call, so you’re protected up front rather than billed first and apologized to afterward.
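The arithmetic behind the pre-flight estimate can be sketched in a few lines of Python. This is an illustration rather than PRYSM's implementation: the function names `preflight_estimate` and `exceeds_cap` are hypothetical, and the per-MTok prices are assumed values chosen only so the result lands on the ~0.018 figure used in the example on this page.

```python
def preflight_estimate(input_price: float, output_price: float) -> float:
    """Return the pre-flight estimate in USD.

    Prices are USD per million tokens (MTok). Multiplying their sum by
    0.001 scales it to the ~1K-token reference described above.
    """
    return (input_price + output_price) * 0.001


def exceeds_cap(estimate: float, max_cost_per_request: float) -> bool:
    """True when the estimate is strictly above the configured cap."""
    return estimate > max_cost_per_request


# Assumed, illustrative prices (not taken from a PRYSM price list).
estimate = preflight_estimate(input_price=3.00, output_price=15.00)
print(f"estimate: ${estimate:.3f}")   # estimate: $0.018
print(exceeds_cap(estimate, 0.005))   # True
```

Because the estimate depends only on the model's prices and not on the prompt's length, two prompts routed to the same model get the same estimate under this sketch.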
Use /route (or client.route(...)) to preview the estimated cost of a prompt before sending it — handy for testing that your cap behaves the way you expect.

Example

BRAIN.md
max_cost_per_request: 0.005
rules:
  - when: "writing"
    model: "claude-sonnet-4.5"   # est. ~0.018 > 0.005
A writing prompt matches the rule, but claude-sonnet-4.5’s estimate (~0.018) exceeds the 0.005 cap, so AgentGuard downgrades the request to deepseek-v3.2. To honor the preference, raise max_cost_per_request to at least the model’s estimate, or remove the cap.
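The downgrade decision in this example can be written out as a small function. It is a sketch under stated assumptions: `apply_cost_cap` is a hypothetical name, the estimate is passed in rather than looked up, and the comparison is strict so that a cap equal to the estimate still honors the preference.

```python
from typing import Optional

BUDGET_MODEL = "deepseek-v3.2"  # the budget model named in this guide


def apply_cost_cap(selected_model: str,
                   estimate_usd: float,
                   cap_usd: Optional[float]) -> str:
    """Return the model to use after the per-request cap is applied."""
    if cap_usd is None:
        # No cap configured: keep whatever routing selected.
        return selected_model
    if estimate_usd > cap_usd:
        # Over the cap: downgrade instead of failing the request.
        return BUDGET_MODEL
    return selected_model


print(apply_cost_cap("claude-sonnet-4.5", 0.018, 0.005))  # deepseek-v3.2
print(apply_cost_cap("claude-sonnet-4.5", 0.018, 0.018))  # claude-sonnet-4.5
print(apply_cost_cap("claude-sonnet-4.5", 0.018, None))   # claude-sonnet-4.5
```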

Monthly budgets

Set a soft monthly ceiling to get alerted as you approach it:
BRAIN.md
monthly_budget: 50.00   # USD — alerts on approach
Track actual spend any time via /usage, which returns total_cost_usd plus breakdowns by model, provider, and mode.
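If you want your own check on top of PRYSM's alerts, a comparison of reported spend against the budget might look like the sketch below. Several things here are assumptions: the `usage` dictionary is a hand-written stand-in that carries only the documented total_cost_usd field, `budget_status` is a hypothetical helper, and the 80% alert threshold is an arbitrary choice, since this page does not state when PRYSM itself starts alerting.

```python
def budget_status(usage: dict, monthly_budget: float,
                  alert_fraction: float = 0.8) -> str:
    """Classify spend as 'ok', 'approaching', or 'over' the budget.

    alert_fraction is an assumed threshold, not a documented PRYSM value.
    """
    spent = float(usage["total_cost_usd"])
    if spent >= monthly_budget:
        return "over"
    if spent >= alert_fraction * monthly_budget:
        return "approaching"
    return "ok"


# Stand-in for a /usage response; real responses also include breakdowns.
usage = {"total_cost_usd": 41.25}
print(budget_status(usage, monthly_budget=50.00))  # approaching
```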

Blocking models

List models an agent must never use — for cost, latency, or compliance reasons. A blocked model is rerouted through your fallback chain:
BRAIN.md
blocked:
  - gpt-5.2-pro      # too expensive
  - grok-4.1-heavy   # too slow

fallback:
  - deepseek-v3.2
  - claude-haiku-4.5
  - gpt-5-nano
If routing (or a rule, or even a model lock) selects a blocked model, PRYSM walks the fallback list and routes to the first allowed, available model. If you don’t define a fallback, PRYSM uses a sensible default order.
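The fallback walk described above amounts to "first entry that is both allowed and available". The sketch below illustrates that logic only. `resolve_model` and the `available` set are hypothetical, and the function returns None when the list is exhausted because PRYSM's default fallback order is not documented here.

```python
from typing import Iterable, Optional, Set


def resolve_model(selected: str,
                  blocked: Set[str],
                  fallback: Iterable[str],
                  available: Set[str]) -> Optional[str]:
    """Return the selected model, or the first usable fallback if it is blocked."""
    if selected not in blocked:
        return selected
    for candidate in fallback:
        # Skip entries that are themselves blocked or currently unavailable.
        if candidate not in blocked and candidate in available:
            return candidate
    return None  # PRYSM's default order would apply here; not modeled


blocked = {"gpt-5.2-pro", "grok-4.1-heavy"}
fallback = ["deepseek-v3.2", "claude-haiku-4.5", "gpt-5-nano"]
available = {"claude-haiku-4.5", "gpt-5-nano"}  # assume deepseek-v3.2 is down

print(resolve_model("gpt-5.2-pro", blocked, fallback, available))  # claude-haiku-4.5
```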

Putting it together

A typical agent config combines all three: pin good defaults with rules, cap runaway spend per request, and block the models you never want touched.
BRAIN.md
# Prefer cheap-but-great models, never exceed a cent per call,
# and keep frontier models off the table entirely.
max_cost_per_request: 0.01
monthly_budget: 100.00

rules:
  - when: "code"
    model: "deepseek-v3.2"
  - when: "writing"
    model: "claude-haiku-4.5"

blocked:
  - gpt-5.2-pro
  - claude-opus-4.6
  - grok-4.1-heavy

fallback:
  - deepseek-v3.2
  - claude-haiku-4.5
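Since the config lives in the repository, a small check can catch self-contradictory settings before they merge. The sketch below is a hypothetical lint, not a PRYSM feature: it assumes the BRAIN.md keys above have already been loaded into a plain dictionary, and it only flags rules or fallback entries that name a blocked model. The "research" rule in the second call is invented for the demonstration.

```python
from typing import Dict, List


def lint_config(config: Dict) -> List[str]:
    """Return human-readable warnings for settings that contradict each other."""
    warnings = []
    blocked = set(config.get("blocked", []))

    for rule in config.get("rules", []):
        if rule["model"] in blocked:
            warnings.append(
                f"rule '{rule['when']}' pins blocked model {rule['model']}"
            )

    for model in config.get("fallback", []):
        if model in blocked:
            warnings.append(f"fallback lists blocked model {model}")

    return warnings


# Hand-transcribed from the BRAIN.md example above.
config = {
    "max_cost_per_request": 0.01,
    "monthly_budget": 100.00,
    "rules": [
        {"when": "code", "model": "deepseek-v3.2"},
        {"when": "writing", "model": "claude-haiku-4.5"},
    ],
    "blocked": ["gpt-5.2-pro", "claude-opus-4.6", "grok-4.1-heavy"],
    "fallback": ["deepseek-v3.2", "claude-haiku-4.5"],
}
print(lint_config(config))  # []

# A hypothetical extra rule that pins a blocked model.
bad = dict(config, rules=config["rules"] + [
    {"when": "research", "model": "claude-opus-4.6"},
])
print(lint_config(bad))
```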

Why it matters for agents

Autonomous agents make many calls without a human in the loop. One mis-routed batch on a frontier model can dwarf a month of normal usage. AgentGuard makes the cost ceiling a property of your repository — reviewed in pull requests, enforced on every request, and impossible for a prompt to talk its way around.

Configure it in BRAIN.md

Caps, budgets, blocks, and fallback chains — all version-controlled.

Audit spend with /usage

Totals and per-model / per-provider / per-mode breakdowns.