AgentGuard rules are guardrails: in routing precedence they run after model selection
and always win over routing preferences. A cost cap or a compliance block can’t be
overridden by a routing rule or even a hard model lock.
Per-request cost caps
Set max_cost_per_request in your BRAIN.md (or pass it inline).
If the model PRYSM would pick costs more than the cap, PRYSM downgrades the request
to a budget model (deepseek-v3.2) instead of failing.
AgentGuard compares a per-request estimate, (input_price + output_price) × 0.001 (the
blended per-MTok price at a ~1K-token reference), against your cap. The estimate is
deliberately conservative and computed before the call, so you’re protected up front
rather than billed and apologized to afterward.
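The pre-call estimate can be sketched in a few lines of Python. This is only an illustration of the formula quoted above, not PRYSM's implementation: the function names and the prices are hypothetical, and prices are assumed to be in USD per million tokens.

```python
def estimate_cost_usd(input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Blended per-MTok price scaled to a ~1K-token reference (1K tokens = 0.001 MTok)."""
    return (input_price_per_mtok + output_price_per_mtok) * 0.001


def exceeds_cap(estimate_usd: float, max_cost_per_request: float) -> bool:
    """True when the estimate costs more than the cap (strictly greater)."""
    return estimate_usd > max_cost_per_request


# Hypothetical prices, for illustration only.
estimate = estimate_cost_usd(input_price_per_mtok=2.0, output_price_per_mtok=8.0)
print(round(estimate, 6))                                  # 0.01
print(exceeds_cap(estimate, max_cost_per_request=0.005))   # True
```

Because the estimate depends only on list prices, it can be evaluated before any tokens are sent.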
Example
If a rule prefers claude-sonnet-4.5 but that model’s estimate exceeds the cap,
AgentGuard downgrades the request to deepseek-v3.2. To honor the preference, raise the
cap to at least the model’s estimate or remove the cap.
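The downgrade decision in this example can be sketched as follows. The price table and the helper `apply_cost_cap` are hypothetical and exist only to make the behavior concrete; the figures are not real pricing.

```python
from typing import Optional

BUDGET_MODEL = "deepseek-v3.2"  # downgrade target named in the text

# Hypothetical (input, output) prices in USD per million tokens.
PRICES = {
    "claude-sonnet-4.5": (4.0, 16.0),
    "deepseek-v3.2": (0.25, 0.75),
}


def estimate_cost_usd(model: str) -> float:
    """Blended per-MTok price at a ~1K-token reference."""
    input_price, output_price = PRICES[model]
    return (input_price + output_price) * 0.001


def apply_cost_cap(preferred: str, max_cost_per_request: Optional[float]) -> str:
    """Return the preferred model, or the budget model if its estimate exceeds the cap."""
    if max_cost_per_request is not None and estimate_cost_usd(preferred) > max_cost_per_request:
        return BUDGET_MODEL
    return preferred


print(apply_cost_cap("claude-sonnet-4.5", 0.01))  # deepseek-v3.2 (estimate is above the cap)
print(apply_cost_cap("claude-sonnet-4.5", 0.05))  # claude-sonnet-4.5 (cap covers the estimate)
print(apply_cost_cap("claude-sonnet-4.5", None))  # claude-sonnet-4.5 (no cap set)
```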
Monthly budgets
Set a soft monthly ceiling in BRAIN.md to get alerted as you approach it. Track spend
against it with /usage, which returns total_cost_usd plus breakdowns by model,
provider, and mode.
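A client-side check against a soft ceiling might look like the sketch below. Only `total_cost_usd` comes from the text; the `by_model` key, the 80% alert threshold, and the sample figures are assumptions made for illustration, so check the real /usage response shape before relying on them.

```python
def budget_status(usage: dict, monthly_budget_usd: float, alert_ratio: float = 0.8) -> str:
    """Classify spend as 'ok', 'approaching', or 'over' relative to a soft monthly budget."""
    spent = usage["total_cost_usd"]
    if spent >= monthly_budget_usd:
        return "over"
    if spent >= alert_ratio * monthly_budget_usd:
        return "approaching"
    return "ok"


# Hypothetical payload; real key names for the breakdowns may differ.
sample_usage = {
    "total_cost_usd": 42.5,
    "by_model": {"claude-sonnet-4.5": 30.0, "deepseek-v3.2": 12.5},
}

print(budget_status(sample_usage, monthly_budget_usd=50.0))  # approaching
top_model = max(sample_usage["by_model"], key=sample_usage["by_model"].get)
print(top_model)  # claude-sonnet-4.5
```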
Blocking models
List models an agent must never use, whether for cost, latency, or compliance reasons. A blocked model is rerouted through your fallback chain.
If a rule (or even a model lock) selects a blocked model, PRYSM walks the
fallback list and routes to the first allowed, available model. If you don’t define a
fallback, PRYSM uses a sensible default order.
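The fallback walk described above can be sketched like this. The function `resolve_model`, the model name `example-model-b`, and returning `None` when nothing qualifies are all assumptions for illustration; the default order PRYSM uses when no fallback is defined is not specified here, so the caller passes the list explicitly.

```python
from typing import Optional, Sequence, Set


def resolve_model(
    selected: str,
    blocked: Set[str],
    fallback: Sequence[str],
    available: Set[str],
) -> Optional[str]:
    """Keep the selected model unless it is blocked; otherwise take the first
    fallback entry that is both allowed and available."""
    if selected not in blocked:
        return selected
    for candidate in fallback:
        if candidate not in blocked and candidate in available:
            return candidate
    return None  # assumption: nothing in the chain qualifies


blocked = {"claude-sonnet-4.5"}
fallback = ["claude-sonnet-4.5", "example-model-b", "deepseek-v3.2"]
available = {"deepseek-v3.2"}

# The selected model is blocked, example-model-b is unavailable,
# so the walk lands on deepseek-v3.2.
print(resolve_model("claude-sonnet-4.5", blocked, fallback, available))  # deepseek-v3.2
```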
Putting it together
A typical agent config combines all three: pin good defaults with rules, cap runaway spend per request, and block the models you never want touched.
Why it matters for agents
Autonomous agents make many calls without a human in the loop. One mis-routed batch on a frontier model can dwarf a month of normal usage. AgentGuard makes the cost ceiling a property of your repository: reviewed in pull requests, enforced on every request, and impossible for a prompt to talk its way around.
Configure it in BRAIN.md
Caps, budgets, blocks, and fallback chains, all version-controlled.
Audit spend with /usage
Totals and per-model / per-provider / per-mode breakdowns.