Introduction

PRYSM is a universal AI routing engine. Point your existing OpenAI code at PRYSM and every prompt is automatically routed to the best-value model for the job — across 27 models from 12 providers (OpenAI, Anthropic, Google, DeepSeek, xAI, Mistral, Moonshot, Alibaba, Perplexity, Meta, Zhipu, SambaNova) — behind one API, one key, and one bill.

Drop-in for OpenAI

Change one line — the base URL. Every OpenAI SDK call works unchanged.

Smart routing

Intent classification picks the right model for each prompt under one of three routing modes: quality, balanced, or agility.

BRAIN.md config

A declarative, version-controlled routing file — the .cursorrules of model routing.

Cost guardrails

AgentGuard caps per-request spend and blocks models you don’t want agents touching.

PrysmProof receipts

Every response carries a tamper-evident SHA-256 receipt of what ran and why.

MCP server

Give Claude Desktop, Cursor, or Windsurf cost-aware routing as a native tool.

Why PRYSM

The model you reach for by default is rarely the best one for the task in front of you. A frontier model on a one-line classification is wasted money; a budget model on a nuanced contract is wasted quality. PRYSM makes that decision per request — so you get the right model every time without hand-tuning, while spend stays predictable.

Better outputs

Each prompt goes to the model that’s actually best at it — code, writing, math, translation, reasoning, and more.

Lower cost

Cheap models handle the easy 80%; premium models are reserved for the hard 20%. Savings are measured against an all-premium baseline and grow with the share of your traffic that is easy.

Zero lock-in

OpenAI-compatible. No new SDK to learn, no rewrite, no juggling a separate key for each of the 12 providers.
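
To put a rough number on the "Lower cost" point above, here is a back-of-the-envelope sketch. It assumes every request has the same token volume and that the cheap model costs 5% of the premium one, a ratio borrowed from the "95% cheaper" reason string in the sample response later on this page. The function name and both parameters are illustrative, not part of PRYSM, and real savings depend on your traffic.

```python
def blended_savings(easy_share: float, cheap_to_premium_ratio: float) -> float:
    """Fraction saved versus sending every request to the premium model.

    easy_share: fraction of requests routed to the cheap model (0..1).
    cheap_to_premium_ratio: cheap model cost as a fraction of premium cost.
    Assumes equal token volume per request (an illustrative simplification).
    """
    if not 0.0 <= easy_share <= 1.0:
        raise ValueError("easy_share must be between 0 and 1")
    if cheap_to_premium_ratio < 0.0:
        raise ValueError("cheap_to_premium_ratio must be non-negative")
    # Cost of the mixed traffic, with the all-premium baseline normalized to 1.0.
    blended_cost = easy_share * cheap_to_premium_ratio + (1.0 - easy_share)
    return 1.0 - blended_cost


# 80% easy traffic on a model that costs 5% of the premium one.
savings = blended_savings(0.80, 0.05)
print(f"{savings:.0%}")
```

Under these assumptions the blended bill is 0.8 × 0.05 + 0.2 = 0.24 of the all-premium baseline, so the saving is about 76%.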

The drop-in pattern

PRYSM speaks the OpenAI API. Set model: "auto" and let PRYSM choose:

from openai import OpenAI

client = OpenAI(
    api_key="prysm_sk_your_key",
    base_url="https://api.prysm1.com/v1",   # the only change
)

resp = client.chat.completions.create(
    model="auto",                          # let PRYSM pick the best model
    messages=[{"role": "user", "content": "Write a Python quicksort"}],
)
print(resp.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "prysm_sk_your_key",
  baseURL: "https://api.prysm1.com/v1",      // the only change
});

const resp = await client.chat.completions.create({
  model: "auto",                           // let PRYSM pick the best model
  messages: [{ role: "user", content: "Write a TypeScript quicksort" }],
});
console.log(resp.choices[0].message.content);

curl https://api.prysm1.com/v1/chat/completions \
  -H "Authorization: Bearer prysm_sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{ "role": "user", "content": "Write a Python quicksort" }]
  }'
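
If installing the OpenAI SDK is not an option, the curl call above can be reproduced with the Python standard library alone. The sketch below only builds the request, so it runs without a network connection or a real key. The helper name `build_chat_request` and the constant `PRYSM_CHAT_URL` are made up for this example; the URL, headers, and body mirror the curl command.

```python
import json
import urllib.request

# Endpoint taken from the curl example; the constant name is illustrative.
PRYSM_CHAT_URL = "https://api.prysm1.com/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    body = {
        "model": "auto",  # let PRYSM pick the model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PRYSM_CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request("Write a Python quicksort", "prysm_sk_your_key")
print(request.get_method(), request.full_url)
```

To send it, pass the request to `urllib.request.urlopen(request)` and decode the response body with `json.load`.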

Prefer a native client? The first-party SDKs (prysm1 for Python and @prysmai/sdk for Node) subclass the OpenAI client and add helpers for routing previews, usage, savings, and BRAIN.md auto-discovery.

What’s in every response

PRYSM returns a standard OpenAI payload plus a top-level prysm block describing the decision — which model ran, why, what it cost, latency, and a verifiable proof:

{
  "id": "prysm-a1b2c3d4",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [{ "message": { "role": "assistant", "content": "..." } }],
  "usage": { "prompt_tokens": 18, "completion_tokens": 240, "total_tokens": 258 },
  "prysm": {
    "routing": {
      "mode": "balanced",
      "model_display": "DeepSeek V4 Flash",
      "provider": "deepseek",
      "reason": "Code: 95% cheaper than GPT-5.2",
      "tier": "budget",
      "signals_detected": { "code": true }
    },
    "cost": { "input_usd": 0.0000025, "output_usd": 0.0000672, "total_usd": 0.0000697 },
    "latency_ms": 740,
    "proof": { "proof_hash": "sha256:a1b2c3d4e5f6a7b8", "verifiable": true }
  }
}

OpenAI-compatible clients ignore the extra prysm field. The PRYSM SDKs expose it cleanly through extension().
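
If you work with the raw JSON instead of an SDK, the prysm block can be read like any other dictionary. The sketch below parses a copy of the sample payload shown above and pulls out the routing decision and cost. The helper `summarize_routing` is a made-up name for this example, and it falls back gracefully when the prysm block is absent, as it would be for a response from a plain OpenAI endpoint.

```python
import json

# Copy of the sample response shown above (message content elided as in the source).
SAMPLE_RESPONSE = """
{
  "id": "prysm-a1b2c3d4",
  "object": "chat.completion",
  "model": "deepseek-v4-flash",
  "choices": [{ "message": { "role": "assistant", "content": "..." } }],
  "usage": { "prompt_tokens": 18, "completion_tokens": 240, "total_tokens": 258 },
  "prysm": {
    "routing": {
      "mode": "balanced",
      "model_display": "DeepSeek V4 Flash",
      "provider": "deepseek",
      "reason": "Code: 95% cheaper than GPT-5.2",
      "tier": "budget",
      "signals_detected": { "code": true }
    },
    "cost": { "input_usd": 0.0000025, "output_usd": 0.0000672, "total_usd": 0.0000697 },
    "latency_ms": 740,
    "proof": { "proof_hash": "sha256:a1b2c3d4e5f6a7b8", "verifiable": true }
  }
}
"""


def summarize_routing(payload: dict) -> dict:
    """Extract the routing decision and cost from a parsed response (illustrative helper)."""
    prysm = payload.get("prysm")
    if prysm is None:
        # Not a PRYSM response, or the field was stripped along the way.
        return {"routed_by_prysm": False, "model": payload.get("model")}
    routing = prysm.get("routing", {})
    cost = prysm.get("cost", {})
    return {
        "routed_by_prysm": True,
        "model": payload.get("model"),
        "mode": routing.get("mode"),
        "reason": routing.get("reason"),
        "total_usd": cost.get("total_usd"),
        "latency_ms": prysm.get("latency_ms"),
    }


summary = summarize_routing(json.loads(SAMPLE_RESPONSE))
print(summary)
```

With the OpenAI Python SDK, `resp.model_dump()` should yield a comparable dictionary, though whether unknown top-level fields are preserved can depend on the SDK version, so treat that as an assumption to verify.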

Next steps

Quickstart

Make your first routed request in under five minutes.

Authentication

API keys, the base URL, and environment variables.

How routing works

Modes, signals, and how PRYSM picks a model.

API reference

Every endpoint, parameter, and response field.

Building with AI agents

Hard spend ceilings, tamper-evident trajectories, and BRAIN.md for autonomous agents.
