AgentGuard

AgentGuard is PRYSM’s spend-control layer. It turns “please don’t let the agent burn $50 on a typo” into a declarative guarantee: per-request cost caps, monthly budgets, and hard blocks on models you never want an agent to reach for.

AgentGuard rules are guardrails: in routing precedence they run after model selection and always win over routing preferences. A cost cap or a compliance block can’t be overridden by a routing rule or even a hard model lock.

Per-request cost caps

Set max_cost_per_request in your BRAIN.md (or pass it inline). If the estimated cost of the model PRYSM would pick exceeds the cap, PRYSM downgrades the request to a budget model (deepseek-v4-flash) instead of failing.

BRAIN.md

max_cost_per_request: 0.005   # USD

How the estimate works. PRYSM compares a fast pre-flight estimate against your cap: (input_price + output_price) × 0.001, where both prices are per million tokens (MTok). Multiplying by 0.001 scales the per-MTok prices to a ~1K-token reference, so the estimate is roughly the cost of 1K input tokens plus 1K output tokens. It’s deliberately conservative and computed before the call, so you’re protected up front rather than billed and apologized to afterward.
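The arithmetic is small enough to sketch directly. The Python below is a minimal illustration of the formula described above, not PRYSM's implementation. The function name `preflight_estimate` and the prices passed in are assumptions, chosen so the result lines up with the ~0.018 estimate quoted for claude-sonnet-4.5 in the example further down.

```python
def preflight_estimate(input_price_per_mtok: float,
                       output_price_per_mtok: float) -> float:
    """Estimated USD cost at the ~1K-token reference point.

    Both arguments are prices per million tokens (MTok). Multiplying by
    0.001 scales them down to roughly 1K input + 1K output tokens.
    """
    return (input_price_per_mtok + output_price_per_mtok) * 0.001


# Illustrative prices only (assumed, not read from PRYSM's price table):
# $3 per MTok input, $15 per MTok output.
estimate = preflight_estimate(3.00, 15.00)
print(round(estimate, 6))  # prints 0.018
```

Because the estimate depends only on list prices, it is the same for every prompt sent to a given model; prompt length does not enter the formula as described here.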

Use /route (or client.route(...)) to preview the estimated cost of a prompt before sending it — handy for testing that your cap behaves the way you expect.

Example

BRAIN.md

max_cost_per_request: 0.005
rules:
  - when: "writing"
    model: "claude-sonnet-4.5"   # est. ~0.018 > 0.005

A writing prompt matches the rule, but claude-sonnet-4.5’s estimate (~0.018) exceeds the 0.005 cap, so AgentGuard downgrades the request to deepseek-v4-flash. To honor the preference, either raise the cap to at least the model’s estimate or remove the cap.
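The decision in this example can be sketched as a single comparison. This is a hedged illustration of the behavior described on this page, not PRYSM's code: `apply_cost_cap` is a hypothetical helper, and the only facts taken from the page are the cap value, the ~0.018 estimate, and deepseek-v4-flash as the budget model.

```python
from typing import Optional

BUDGET_MODEL = "deepseek-v4-flash"  # budget model named on this page


def apply_cost_cap(selected_model: str,
                   estimate_usd: float,
                   cap_usd: Optional[float]) -> str:
    """Return the model to use after the per-request cap is applied.

    Hypothetical helper: with no cap, or an estimate at or below the cap,
    the selected model is kept; otherwise the request is downgraded.
    """
    if cap_usd is None:
        return selected_model
    if estimate_usd > cap_usd:
        return BUDGET_MODEL
    return selected_model


# The rule picks claude-sonnet-4.5 (est. ~0.018), but the cap is 0.005.
print(apply_cost_cap("claude-sonnet-4.5", 0.018, 0.005))  # deepseek-v4-flash
# Raising the cap to the model's estimate honors the preference.
print(apply_cost_cap("claude-sonnet-4.5", 0.018, 0.018))  # claude-sonnet-4.5
```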

Monthly budgets

Set a soft monthly ceiling to get alerted as you approach it:

BRAIN.md

monthly_budget: 50.00   # USD — alerts on approach

Track actual spend any time via /usage, which returns total_cost_usd plus breakdowns by model, provider, and mode.
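If you want your own check alongside PRYSM's alerts, the comparison is easy to script. The sketch below assumes the /usage response has already been fetched and parsed into a dict containing total_cost_usd, and that the total covers the current month. The helper name `budget_status` and the 80% alert threshold are assumptions for illustration; this page does not say at what point PRYSM itself alerts.

```python
def budget_status(usage: dict, monthly_budget: float,
                  alert_ratio: float = 0.8) -> dict:
    """Compare reported spend against a monthly budget.

    `usage` is a parsed /usage response; only `total_cost_usd` is read.
    `alert_ratio` is a hypothetical threshold, not a PRYSM setting.
    """
    spent = float(usage["total_cost_usd"])
    fraction = spent / monthly_budget
    return {
        "spent_usd": spent,
        "remaining_usd": monthly_budget - spent,
        "fraction_used": fraction,
        "alert": fraction >= alert_ratio,
    }


# Made-up payload for illustration; real responses also carry breakdowns.
sample_usage = {"total_cost_usd": 42.50}
status = budget_status(sample_usage, monthly_budget=50.00)
print(status["alert"], status["remaining_usd"])  # True 7.5
```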

Blocking models

List models an agent must never use, whether for cost, latency, or compliance reasons. A request that would go to a blocked model is rerouted through your fallback chain:

BRAIN.md

blocked:
  - gpt-5.2-pro      # too expensive
  - grok-4.1-heavy   # too slow

fallback:
  - deepseek-v4-flash
  - claude-haiku-4.5
  - gpt-5-nano

If routing (or a rule, or even a model lock) selects a blocked model, PRYSM walks the fallback list and routes to the first allowed, available model. If you don’t define a fallback, PRYSM uses a sensible default order.
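The fallback walk can be sketched as a first-match search. This is an illustration under assumptions, not PRYSM's routing code: `resolve_blocked` is a hypothetical helper, availability is modeled as a plain set of model names, and returning None when nothing qualifies is a choice made for this sketch (the page does not describe that case).

```python
from typing import Optional, Sequence, Set


def resolve_blocked(selected: str,
                    blocked: Set[str],
                    fallback: Sequence[str],
                    available: Set[str]) -> Optional[str]:
    """Return the selected model, or the first usable fallback if blocked."""
    if selected not in blocked:
        return selected
    # Walk the fallback list in order; skip blocked or unavailable models.
    for candidate in fallback:
        if candidate not in blocked and candidate in available:
            return candidate
    return None  # sketch-only behavior when no fallback qualifies


blocked = {"gpt-5.2-pro", "grok-4.1-heavy"}
fallback = ["deepseek-v4-flash", "claude-haiku-4.5", "gpt-5-nano"]
# Suppose deepseek-v4-flash is temporarily unavailable.
available = {"claude-haiku-4.5", "gpt-5-nano"}

print(resolve_blocked("gpt-5.2-pro", blocked, fallback, available))
# claude-haiku-4.5
```

Order matters: put the model you would most like to land on first in the fallback list, since the walk stops at the first match.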

Putting it together

A typical agent config combines all three: pin good defaults with rules, cap runaway spend per request, and block the models you never want touched.

BRAIN.md

# Prefer cheap-but-great models, never exceed a cent per call,
# and keep frontier models off the table entirely.
max_cost_per_request: 0.01
monthly_budget: 100.00

rules:
  - when: "code"
    model: "deepseek-v4-flash"
  - when: "writing"
    model: "claude-haiku-4.5"

blocked:
  - gpt-5.2-pro
  - claude-opus-4.6
  - grok-4.1-heavy

fallback:
  - deepseek-v4-flash
  - claude-haiku-4.5

Why it matters for agents

Autonomous agents make many calls without a human in the loop. One mis-routed batch on a frontier model can dwarf a month of normal usage. AgentGuard makes the cost ceiling a property of your repository — reviewed in pull requests, enforced on every request, and impossible for a prompt to talk its way around.

Configure it in BRAIN.md

Caps, budgets, blocks, and fallback chains — all version-controlled.

Audit spend with /usage

Totals and per-model / per-provider / per-mode breakdowns.
