Orchestrate

from prysm import Prysm

client = Prysm()
r = client.orchestrate(
    "Compare three database designs for a 40-person team",
    policy="depth",
)
print(r["choices"][0]["message"]["content"])
print(r["prysm"]["orchestration"]["models_used"])

import { Prysm } from "@prysmai/sdk";

const client = new Prysm();
const r = await client.orchestrate(
  "Compare three database designs for a 40-person team",
  { policy: "depth" },
);
console.log(r.choices[0].message.content);
console.log(r.prysm.orchestration.models_used);

curl https://api.prysm1.com/v2/orchestrate \
  -H "Authorization: Bearer $PRYSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Compare three database designs for a 40-person team" }
    ],
    "policy": "depth"
  }'

{
  "id": "prysm-a1b2c3d4",
  "object": "orchestration",
  "created": 1767312000,
  "policy": "depth",
  "strategy": "ensemble_moa",
  "reason": "depth policy on an analysis-heavy prompt: proposed across diverse models, then aggregated.",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Across the three designs, the trade-offs cluster around consistency vs. operational cost..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 412,
    "completion_tokens": 933,
    "total_tokens": 1345
  },
  "prysm": {
    "orchestration": {
      "policy": "depth",
      "strategy": "ensemble_moa",
      "reason": "depth policy on an analysis-heavy prompt: proposed across diverse models, then aggregated.",
      "models_used": ["claude-sonnet-4.5", "gpt-5.2", "gemini-3.1-pro"],
      "confidence": 0.88,
      "agreement": 0.81,
      "escalated": false,
      "latency_ms": 4120
    },
    "cost": { "total_usd": 0.004812, "estimated": true },
    "proof": {
      "request_id": "a1b2c3d4-...-...",
      "timestamp": "2026-06-03T00:00:00+00:00",
      "proof_hash": "sha256:a1b2c3d4e5f60718",
      "policy": "depth",
      "strategy": "ensemble_moa",
      "models_used": ["claude-sonnet-4.5", "gpt-5.2", "gemini-3.1-pro"],
      "confidence": 0.88,
      "agreement": 0.81,
      "verifiable": true
    },
    "stages": [
      {
        "name": "propose",
        "results": [
          { "model": "claude-sonnet-4.5", "ok": true, "in_tokens": 96, "out_tokens": 280, "latency_ms": 2010, "cost_usd": 0.00128, "confidence": 0.86, "role": "proposer" },
          { "model": "gpt-5.2", "ok": true, "in_tokens": 96, "out_tokens": 305, "latency_ms": 2240, "cost_usd": 0.00161, "confidence": 0.84, "role": "proposer" },
          { "model": "gemini-3.1-pro", "ok": true, "in_tokens": 96, "out_tokens": 268, "latency_ms": 1980, "cost_usd": 0.00098, "confidence": 0.82, "role": "proposer" }
        ],
        "detail": { "k": 3 }
      },
      {
        "name": "aggregate",
        "results": [
          { "model": "claude-sonnet-4.5", "ok": true, "in_tokens": 124, "out_tokens": 80, "latency_ms": 1890, "cost_usd": 0.00094, "confidence": 0.88, "role": "aggregator" }
        ],
        "detail": { "aggregator": "claude-sonnet-4.5" }
      }
    ]
  }
}

POST https://api.prysm1.com/v2/orchestrate · Requires authentication

Where /v1/chat/completions routes a prompt to the single best model, orchestrate plans and executes it across several models, then returns one synthesized answer plus a PrysmProof v2 attesting to how robustly it was produced — which models ran and how strongly they agreed. You pick the objective with a policy; PRYSM picks the strategy (or you force one). See How orchestration works for the full model.

This endpoint lives under /v2, not /v1. The SDKs target it automatically with client.orchestrate(...).

Authorization

string

required

Your secret key as a bearer token: Bearer prysm_sk_...
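For callers not using an SDK, the request can be sketched with the Python standard library alone. This is a minimal sketch, not the SDK's implementation: `build_orchestrate_request` is a hypothetical helper name, and it only builds the request object so the headers and body can be inspected before anything is sent.

```python
import json
import urllib.request

ORCHESTRATE_URL = "https://api.prysm1.com/v2/orchestrate"


def build_orchestrate_request(api_key: str, prompt: str, policy: str = "balanced") -> urllib.request.Request:
    """Build (but do not send) a POST to /v2/orchestrate. Hypothetical helper."""
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "policy": policy,
    }
    return urllib.request.Request(
        ORCHESTRATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The secret key travels as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the reply; reading the key from the `PRYSM_API_KEY` environment variable mirrors the curl example.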

Body

messages · array · required

The conversation, OpenAI-style: a list of { "role": "user" | "assistant" | "system", "content": "..." }.

policy · string · default:"balanced"

The objective dial: efficiency (cheapest path that clears a confidence bar), depth (cross several models in parallel for robustness), or balanced.

string

Force an execution shape instead of auto-planning: single, cascade, ensemble_moa, rank_fuse, decompose_and_route, self_consistency, or debate. Omit to let PRYSM choose from the policy and prompt.

integer

Ensemble / sample width — how many models or samples to cross for ensemble_moa, rank_fuse, and self_consistency. Defaults to a policy-appropriate value.

integer

default:"1024"

Maximum tokens per underlying model call.

number

default:"0.7"

Sampling temperature passed to the underlying models.

number

A soft budget hint, in USD. Cascades stop escalating to pricier models once the estimated spend approaches this cap.

string

Preferred aggregator/fuser model for strategies that synthesize a final answer (ensemble_moa, rank_fuse, debate). Ignored if it isn’t a known catalog model.

compliance · object

A Policy-as-Code spec that confines the run to approved providers/models. Non-compliant models are filtered out before scoring, so the engine cannot select one. Same fields as /v2/compliance/preview (provider_allowlist, jurisdiction, frameworks, certifications, data_residency, block_data_classes, require_zero_retention). When set, the response’s prysm.compliance carries the decision and prysm.proof.compliance carries the attestation.

object

A BRAIN.md config whose compliance: block applies if compliance is omitted.

include_trace · boolean · default:"true"

Include the per-stage execution trace in prysm.stages. Set false for a leaner response.
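A request body can be assembled and checked client-side before it is sent. The sketch below uses only the field names that appear on this page (`messages`, `policy`, `compliance`, `include_trace`); `build_body` is a hypothetical helper, and the value types inside the compliance spec (a list of provider names, a boolean) are assumptions rather than documented shapes.

```python
VALID_POLICIES = {"efficiency", "balanced", "depth"}


def build_body(messages, policy="balanced", compliance=None, include_trace=True):
    """Assemble a JSON-serializable body for /v2/orchestrate. Hypothetical helper."""
    if not messages:
        # The API rejects an empty list with 400 no_messages; fail early instead.
        raise ValueError("messages must contain at least one entry")
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {sorted(VALID_POLICIES)}")
    body = {
        "messages": list(messages),
        "policy": policy,
        "include_trace": include_trace,
    }
    if compliance is not None:
        body["compliance"] = compliance
    return body


# Example values are illustrative only; the provider names are made up for the sketch.
example = build_body(
    [{"role": "user", "content": "Compare three database designs for a 40-person team"}],
    policy="depth",
    compliance={"provider_allowlist": ["anthropic", "openai"], "require_zero_retention": True},
    include_trace=False,
)
```

Validating locally keeps obviously malformed requests from costing a round trip, but the server remains the authority on what it accepts.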

Response

id · string

Unique orchestration id, e.g. prysm-a1b2c3d4.

object · string

Always orchestration.

created · integer

Unix timestamp (seconds).

policy · string

The policy that ran: efficiency, balanced, or depth.

strategy · string

The strategy that ran (auto-planned or forced).

reason · string

Plain-English explanation of why this policy/strategy was chosen.

choices · array

OpenAI-compatible choices. The synthesized answer is choices[0].message.content.

choices[].index · integer

Choice index (always 0).

choices[].message · object

{ "role": "assistant", "content": "..." }.

choices[].finish_reason · string

Always stop.

usage · object

Aggregate token usage across every model call.

usage.prompt_tokens · integer

Total input tokens.

usage.completion_tokens · integer

Total output tokens.

usage.total_tokens · integer

Sum of prompt_tokens and completion_tokens.

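prysm · object

The orchestration extension block.

prysm.orchestration · object

prysm.orchestration.policy · string

The policy that ran.

prysm.orchestration.strategy · string

The strategy that ran.

prysm.orchestration.reason · string

Why this plan was chosen.

prysm.orchestration.models_used · string[]

Every model that contributed.

prysm.orchestration.confidence · number

Confidence in the final answer, 0–1.

prysm.orchestration.agreement · number

How strongly the models agreed, 0–1.

prysm.orchestration.escalated · boolean

Whether a cascade escalated to a stronger model.

prysm.orchestration.latency_ms · integer

Wall-clock latency in milliseconds.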
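The `confidence` and `agreement` scores in the orchestration block lend themselves to a simple gate on the caller's side. The sketch below flags an answer for human review when either score falls under a threshold; `needs_review` and both default thresholds are hypothetical choices for illustration, not values recommended by the API.

```python
def needs_review(response: dict, min_confidence: float = 0.8, min_agreement: float = 0.7) -> bool:
    """Return True when the synthesized answer looks too weak to use unreviewed.

    Thresholds are illustrative assumptions; tune them against your own traffic.
    """
    block = response["prysm"]["orchestration"]
    return block["confidence"] < min_confidence or block["agreement"] < min_agreement
```

With the sample response on this page (confidence 0.88, agreement 0.81) the default thresholds let the answer through; raising `min_agreement` above 0.81 would flag it.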

prysm.cost · object

prysm.cost.total_usd · number

Total cost across all model calls, USD.

prysm.cost.estimated · boolean

Always true: v2 cost is estimated from text length (~4 chars/token).
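The ~4 chars/token heuristic can be illustrated with a few lines of arithmetic. This is only a sketch of the idea: the rounding rule (ceiling) and the per-million-token prices passed in are assumptions, and the server's exact formula is not documented here.

```python
import math

CHARS_PER_TOKEN = 4  # the approximate ratio stated for v2 cost estimates


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (rounding up is an assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(prompt: str, completion: str,
                      usd_per_million_in: float, usd_per_million_out: float) -> float:
    """Estimate one model call's cost; the prices are caller-supplied, hypothetical inputs."""
    tokens_in = estimate_tokens(prompt)
    tokens_out = estimate_tokens(completion)
    return (tokens_in * usd_per_million_in + tokens_out * usd_per_million_out) / 1_000_000
```

Because the estimate depends only on text length, it can drift from a provider's billed token counts, which is why the response marks the figure as estimated.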

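prysm.proof · object

A verifiable PrysmProof v2. Verify it later via GET /v1/proof/{request_id}.

prysm.proof.request_id · string

The id to verify against.

prysm.proof.timestamp · string

ISO-8601 UTC timestamp.

prysm.proof.proof_hash · string

SHA-256 over the execution stages, e.g. sha256:a1b2c3d4e5f6....

prysm.proof.policy · string

The policy that ran.

prysm.proof.strategy · string

The strategy that ran.

prysm.proof.models_used · string[]

Every model that contributed.

prysm.proof.confidence · number

Confidence in the final answer.

prysm.proof.agreement · number

How strongly the models agreed.

prysm.proof.verifiable · boolean

Always true.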
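A later verification call against GET /v1/proof/{request_id} might be prepared as follows. Several things here are assumptions: that the proof endpoint lives on the same host as orchestrate, that it takes the same bearer token, and the helper names themselves. The second function only checks that the proof repeats the orchestration block's values, as it does in the sample response; it does not verify the hash.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.prysm1.com"  # assumption: same host as /v2/orchestrate

MIRRORED_FIELDS = ("policy", "strategy", "models_used", "confidence", "agreement")


def build_proof_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build (but do not send) the verification GET. Hypothetical helper."""
    url = f"{API_BASE}/v1/proof/{urllib.parse.quote(request_id, safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},  # assumption: auth is required
        method="GET",
    )


def proof_mirrors_orchestration(response: dict) -> bool:
    """Local sanity check: the proof should repeat the orchestration summary."""
    block = response["prysm"]
    return all(block["proof"][f] == block["orchestration"][f] for f in MIRRORED_FIELDS)
```

Storing `prysm.proof.request_id` alongside the answer at write time is what makes the later lookup possible.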

prysm.stages · array

Per-stage execution trace (present when include_trace is true). Each stage has a name (e.g. propose → aggregate, or subtasks → synthesize), a results array with one entry per model call (model, ok, in_tokens, out_tokens, latency_ms, cost_usd, confidence, role; the raw text is omitted), and a strategy-specific detail object.
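The trace is easiest to read once it is rolled up per stage. The sketch below walks `prysm.stages` using the per-call fields shown in the sample response (`ok`, `cost_usd`, `latency_ms`); `summarize_stages` is a hypothetical helper, and taking the slowest call as a stage's latency assumes the calls within a stage run in parallel.

```python
def summarize_stages(response: dict) -> list:
    """Roll each stage's model calls up into counts, cost, and slowest latency."""
    summary = []
    # stages is absent when include_trace is false, so default to an empty list.
    for stage in response.get("prysm", {}).get("stages", []):
        results = stage["results"]
        summary.append({
            "name": stage["name"],
            "calls": len(results),
            "failed": sum(1 for r in results if not r["ok"]),
            "cost_usd": round(sum(r["cost_usd"] for r in results), 6),
            # Assumption: calls within a stage run concurrently.
            "slowest_ms": max((r["latency_ms"] for r in results), default=0),
        })
    return summary
```

Applied to the sample response, this yields one row for propose (three calls) and one for aggregate (one call), which makes it quick to see where cost and latency concentrate.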

Errors

Status	`error`	Meaning
`400`	`no_messages`	`messages[]` was empty.
`401`	—	Missing or invalid API key.
`502`	`all_models_failed`	Keys are configured but every model call failed.
`502`	`orchestration_error`	The orchestrator raised while planning or executing.
`503`	`no_provider_available`	No provider API keys are configured on the server.
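The table above can be turned into a small decision function on the client. This is a sketch under assumptions: the action labels and `classify_error` are hypothetical, and which failures are worth retrying is a judgment call rather than documented behavior. It takes the HTTP status and the `error` code as plain arguments so that it does not depend on the exact shape of the error body.

```python
from typing import Optional


def classify_error(status: int, error_code: Optional[str] = None) -> str:
    """Map a failed /v2/orchestrate call to a suggested client action (illustrative)."""
    if status == 400 and error_code == "no_messages":
        return "fix_request"        # messages[] was empty
    if status == 401:
        return "fix_credentials"    # missing or invalid API key
    if status == 502 and error_code == "all_models_failed":
        return "retry"              # assumption: upstream failures may be transient
    if status == 502 and error_code == "orchestration_error":
        return "retry"              # assumption: planning errors may not repeat
    if status == 503 and error_code == "no_provider_available":
        return "contact_operator"   # server has no provider keys configured
    return "unknown"
```

If retries are used, a bounded count with backoff avoids piling load onto a deployment whose providers are already failing.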

