POST https://api.prysm1.com/v2/orchestrate · Requires authentication
Where /v1/chat/completions routes a prompt to the single best model, orchestrate plans and executes it across several models, then returns one synthesized answer plus a PrysmProof v2. The proof attests to how robustly the answer was produced: which models ran and how strongly they agreed. You set the objective with policy; PRYSM picks the strategy unless you force one with strategy. See How orchestration works for the full model.
This endpoint lives under /v2, not /v1. The SDKs target it automatically with client.orchestrate(...).

Authorization

Authorization
string
required
Your secret key as a bearer token: Bearer prysm_sk_...
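As a minimal sketch, the header can be built like this. The helper name auth_headers, the prefix check, and the JSON Content-Type are assumptions made for illustration; only the Bearer prysm_sk_... format comes from this page.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers carrying the secret key as a bearer token."""
    # Assumption: secret keys start with "prysm_sk_", as in the format shown above.
    if not api_key.startswith("prysm_sk_"):
        raise ValueError("expected a secret key starting with 'prysm_sk_'")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: the endpoint accepts a JSON request body.
        "Content-Type": "application/json",
    }


# Placeholder key for illustration only; never hard-code a real key.
headers = auth_headers("prysm_sk_example")
```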

Body

messages
array
required
The conversation, OpenAI-style: a list of { "role": "user" | "assistant" | "system", "content": "..." }.
policy
string
default:"balanced"
The objective dial: efficiency (cheapest path that clears a confidence bar), depth (cross several models in parallel for robustness), or balanced.
strategy
string
Force an execution shape instead of auto-planning: single, cascade, ensemble_moa, rank_fuse, decompose_and_route, self_consistency, or debate. Omit to let PRYSM choose from the policy and prompt.
k
integer
Ensemble or sample width: how many models or samples to cross for ensemble_moa, rank_fuse, and self_consistency. Defaults to a policy-appropriate value.
max_tokens
integer
default:"1024"
Maximum tokens per underlying model call.
temperature
number
default:"0.7"
Sampling temperature passed to the underlying models.
max_cost_usd
number
A soft budget hint, in USD. Cascades stop escalating to pricier models once the estimated spend approaches this cap.
judge_model
string
Preferred aggregator/fuser model for strategies that synthesize a final answer (ensemble_moa, rank_fuse, debate). Ignored if it isn’t a known catalog model.
compliance
object
A Policy-as-Code spec that confines the run to approved providers/models. Non-compliant models are filtered out before scoring, so the engine cannot select one. Same fields as /v2/compliance/preview (provider_allowlist, jurisdiction, frameworks, certifications, data_residency, block_data_classes, require_zero_retention). When set, the response’s prysm.compliance carries the decision and prysm.proof.compliance carries the attestation.
brain_config
object
A BRAIN.md config whose compliance: block applies if compliance is omitted.
include_trace
boolean
default:"true"
Include the per-stage execution trace in prysm.stages. Set false for a leaner response.
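
The body fields above can be assembled and checked client-side before sending. The sketch below is illustrative: build_orchestrate_body is a hypothetical helper, not an SDK function, and the validation it performs is an assumption about what is useful to catch early rather than a description of server behavior. The allowed policy and strategy values are the ones listed on this page.

```python
from typing import Any, Optional

POLICIES = {"efficiency", "balanced", "depth"}
STRATEGIES = {
    "single", "cascade", "ensemble_moa", "rank_fuse",
    "decompose_and_route", "self_consistency", "debate",
}


def build_orchestrate_body(
    messages: list[dict[str, str]],
    policy: str = "balanced",
    strategy: Optional[str] = None,
    k: Optional[int] = None,
    max_cost_usd: Optional[float] = None,
    include_trace: bool = True,
) -> dict[str, Any]:
    """Assemble a request body, omitting optional fields that were not set."""
    if not messages:
        # An empty list is documented to produce 400 no_messages.
        raise ValueError("messages must not be empty")
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    body: dict[str, Any] = {
        "messages": messages,
        "policy": policy,
        "include_trace": include_trace,
    }
    if strategy is not None:
        if strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy!r}")
        body["strategy"] = strategy  # omitted -> PRYSM auto-plans
    if k is not None:
        if k < 1:
            raise ValueError("k must be a positive integer")
        body["k"] = k
    if max_cost_usd is not None:
        body["max_cost_usd"] = max_cost_usd  # soft budget hint, in USD
    return body


body = build_orchestrate_body(
    [{"role": "user", "content": "Compare three database designs"}],
    policy="depth",
    strategy="ensemble_moa",
    k=3,
)
```

Leaving strategy, k, and max_cost_usd out of the body when unset keeps the documented defaults and auto-planning in effect.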

Response

id
string
Unique orchestration id, e.g. prysm-a1b2c3d4.
object
string
Always orchestration.
created
integer
Unix timestamp (seconds).
policy
string
The policy that ran: efficiency, balanced, or depth.
strategy
string
The strategy that ran (auto-planned or forced).
reason
string
Plain-English explanation of why this policy/strategy was chosen.
choices
array
OpenAI-compatible choices. The synthesized answer is choices[0].message.content.
usage
object
Aggregate token usage across every model call.
prysm
object
The orchestration extension block.
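
A small sketch of reading these fields from a parsed response. The summarize helper is hypothetical, and treating prysm.stages as possibly absent when include_trace is false is an assumption; the field paths themselves follow the response description on this page.

```python
from typing import Any


def summarize(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the commonly needed fields out of an orchestrate response."""
    prysm = response.get("prysm", {})
    return {
        # The synthesized answer lives in the OpenAI-compatible choices array.
        "answer": response["choices"][0]["message"]["content"],
        "policy": response["policy"],
        "strategy": response["strategy"],
        "reason": response["reason"],
        "total_tokens": response["usage"]["total_tokens"],
        # Assumption: stages may be missing or empty when include_trace is false.
        "stage_names": [stage["name"] for stage in prysm.get("stages", [])],
    }


# Trimmed-down response used only to exercise the helper.
sample = {
    "policy": "depth",
    "strategy": "ensemble_moa",
    "reason": "depth policy on an analysis-heavy prompt",
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "..."}}],
    "usage": {"prompt_tokens": 412, "completion_tokens": 933, "total_tokens": 1345},
    "prysm": {"stages": [{"name": "propose"}, {"name": "aggregate"}]},
}
summary = summarize(sample)
```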

Errors

Status | error | Meaning
400 | no_messages | messages[] was empty.
401 | | Missing or invalid API key.
502 | all_models_failed | Keys are configured but every model call failed.
502 | orchestration_error | The orchestrator raised while planning or executing.
503 | no_provider_available | No provider API keys are configured on the server.
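
One way a client might act on these errors is sketched below. The classify_error helper and its action labels are hypothetical, and the choice to treat both 502 errors as retryable is an assumption, not documented behavior.

```python
from typing import Optional

# Assumption: 502 failures may be transient, so a retry with backoff is reasonable.
RETRYABLE = {(502, "all_models_failed"), (502, "orchestration_error")}


def classify_error(status: int, error: Optional[str]) -> str:
    """Map a (status, error) pair from the table above to a suggested action."""
    if status == 400:
        return "fix_request"       # e.g. no_messages: send a non-empty messages list
    if status == 401:
        return "fix_credentials"   # missing or invalid API key
    if (status, error) in RETRYABLE:
        return "retry"
    if status == 503:
        return "server_config"     # no provider keys configured on the server
    return "unknown"
```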

Example request (Python SDK)

from prysm import Prysm

client = Prysm()
r = client.orchestrate(
    "Compare three database designs for a 40-person team",
    policy="depth",
)
print(r["choices"][0]["message"]["content"])
print(r["prysm"]["orchestration"]["models_used"])
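
For callers not using the SDK, a roughly equivalent raw request could be built with the standard library as sketched below. It assumes the SDK wraps a string prompt in a single user message and that the body is sent as JSON; neither is confirmed on this page. build_request is a hypothetical helper, and the key is a placeholder.

```python
import json
import urllib.request

URL = "https://api.prysm1.com/v2/orchestrate"


def build_request(prompt: str, policy: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the orchestrate endpoint."""
    body = {
        # Assumption: a bare prompt maps to one user message.
        "messages": [{"role": "user", "content": prompt}],
        "policy": policy,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "Compare three database designs for a 40-person team",
    policy="depth",
    api_key="prysm_sk_example",  # placeholder, not a real key
)
# To send it: with urllib.request.urlopen(req) as resp: data = json.load(resp)
```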

Example response

{
  "id": "prysm-a1b2c3d4",
  "object": "orchestration",
  "created": 1767312000,
  "policy": "depth",
  "strategy": "ensemble_moa",
  "reason": "depth policy on an analysis-heavy prompt: proposed across diverse models, then aggregated.",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Across the three designs, the trade-offs cluster around consistency vs. operational cost..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 412,
    "completion_tokens": 933,
    "total_tokens": 1345
  },
  "prysm": {
    "orchestration": {
      "policy": "depth",
      "strategy": "ensemble_moa",
      "reason": "depth policy on an analysis-heavy prompt: proposed across diverse models, then aggregated.",
      "models_used": ["claude-sonnet-4.5", "gpt-5.2", "gemini-3.1-pro"],
      "confidence": 0.88,
      "agreement": 0.81,
      "escalated": false,
      "latency_ms": 4120
    },
    "cost": { "total_usd": 0.004812, "estimated": true },
    "proof": {
      "request_id": "a1b2c3d4-...-...",
      "timestamp": "2026-06-03T00:00:00+00:00",
      "proof_hash": "sha256:a1b2c3d4e5f60718",
      "policy": "depth",
      "strategy": "ensemble_moa",
      "models_used": ["claude-sonnet-4.5", "gpt-5.2", "gemini-3.1-pro"],
      "confidence": 0.88,
      "agreement": 0.81,
      "verifiable": true
    },
    "stages": [
      {
        "name": "propose",
        "results": [
          { "model": "claude-sonnet-4.5", "ok": true, "in_tokens": 96, "out_tokens": 280, "latency_ms": 2010, "cost_usd": 0.00128, "confidence": 0.86, "role": "proposer" },
          { "model": "gpt-5.2", "ok": true, "in_tokens": 96, "out_tokens": 305, "latency_ms": 2240, "cost_usd": 0.00161, "confidence": 0.84, "role": "proposer" },
          { "model": "gemini-3.1-pro", "ok": true, "in_tokens": 96, "out_tokens": 268, "latency_ms": 1980, "cost_usd": 0.00098, "confidence": 0.82, "role": "proposer" }
        ],
        "detail": { "k": 3 }
      },
      {
        "name": "aggregate",
        "results": [
          { "model": "claude-sonnet-4.5", "ok": true, "in_tokens": 124, "out_tokens": 80, "latency_ms": 1890, "cost_usd": 0.00094, "confidence": 0.88, "role": "aggregator" }
        ],
        "detail": { "aggregator": "claude-sonnet-4.5" }
      }
    ]
  }
}
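
The per-call figures in prysm.stages can be summed to cross-check the aggregate usage block. The sketch below uses the numbers from the example response; stage_totals is a hypothetical helper, and whether the totals always reconcile this way is an assumption based on this one example.

```python
from typing import Any


def stage_totals(stages: list[dict[str, Any]]) -> dict[str, Any]:
    """Sum tokens and cost over every model call in every stage."""
    in_tokens = out_tokens = 0
    cost = 0.0
    for stage in stages:
        for result in stage["results"]:
            in_tokens += result["in_tokens"]
            out_tokens += result["out_tokens"]
            cost += result["cost_usd"]
    return {
        "prompt_tokens": in_tokens,
        "completion_tokens": out_tokens,
        "cost_usd": round(cost, 6),
    }


# Per-call figures copied from the example response.
stages = [
    {"name": "propose", "results": [
        {"in_tokens": 96, "out_tokens": 280, "cost_usd": 0.00128},
        {"in_tokens": 96, "out_tokens": 305, "cost_usd": 0.00161},
        {"in_tokens": 96, "out_tokens": 268, "cost_usd": 0.00098},
    ]},
    {"name": "aggregate", "results": [
        {"in_tokens": 124, "out_tokens": 80, "cost_usd": 0.00094},
    ]},
]
totals = stage_totals(stages)
```

In the example, the summed tokens match usage exactly (412 prompt, 933 completion). The summed cost is 0.00481, slightly below the reported total_usd of 0.004812, which the response itself flags as estimated.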