model: "auto", PRYSM inspects the prompt, classifies
its intent, picks a routing mode, and selects the model that delivers the best
result for the lowest cost. The decision is returned in the response’s
prysm.routing block, so it’s never
a black box.
The three modes
PRYSM routes in one of three modes. It selects a mode automatically from the prompt, and you can force one per request withrouting_mode.
Quality
Complex, high-stakes work. Routes to premium/frontier models for maximum accuracy
and nuance.
Balanced
The default. The best single model for the task at a sensible price.
Agility
Short, simple prompts. Routes to the fastest, cheapest capable model.
How the mode is chosen
| Condition | Mode |
|---|---|
An analysis or reasoning signal and more than 20 words | Quality |
A simple signal or fewer than 8 words | Agility |
| Everything else | Balanced |
Intent signals
PRYSM detects intent from keywords and language. A prompt can match several signals at once; routing weighs them together with the word count and mode.| Signal | Fires on prompts about… |
|---|---|
code | functions, debugging, languages, APIs, SQL, regex, deploys |
write | drafting, essays, blogs, emails, copy, editing, tone |
analysis | analyze, research, compare, evaluate, strategy, reports |
math | calculations, equations, statistics, proofs, calculus |
translate | translation between languages |
realtime | today, latest, current, news, prices, live data |
simple | quick lookups, definitions, “what is”, conversions |
multimodal | images, photos, diagrams, charts, video |
reasoning | step-by-step logic, philosophy, ethics, debate, proofs |
The signals PRYSM actually detected for a request are echoed back in
prysm.routing.signals_detected, so you can always see what drove the decision.Worked examples
These show the default (no BRAIN.md) behavior:"capital of France?" → Agility
"capital of France?" → Agility
Three words and a
simple signal trigger Agility. PRYSM routes to a fast budget
model (e.g. deepseek-v3.2 at $0.28/MTok) — paying frontier prices for a fact
lookup would be waste."write a Python function to parse CSV" → Balanced
"write a Python function to parse CSV" → Balanced
A
code signal with a moderate length lands in Balanced. Short code prompts route
to deepseek-v3.2 (“95% cheaper than GPT-5.2”); longer or higher-stakes code
routes to claude-sonnet-4.5 for instruction-following and quality."analyze the trade-offs between microservices and a monolith for a 40-person team, considering …" → Quality
"analyze the trade-offs between microservices and a monolith for a 40-person team, considering …" → Quality
An
analysis signal with more than 20 words triggers Quality. PRYSM routes to a
premium reasoning model (e.g. gpt-5.2) — accuracy matters more than saving a
fraction of a cent here.Direct model selection
You don’t have to useauto. Pass any catalog model ID to pin a single
request to that model — routing is skipped and the mode is reported as direct:
Shaping routing with BRAIN.md
Auto-routing is a strong default, but you’re in control. ABRAIN.md file lets you pin models to specific signals, lock a
single model, cap per-request cost, and block models entirely — all version-controlled
alongside your code. Guardrails like cost caps and blocks always win over routing
preferences; see AgentGuard.