Create chat completion

from prysm import Prysm

client = Prysm()
resp = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a Python quicksort"}],
)
print(resp.choices[0].message.content)

import { Prysm } from "@prysmai/sdk";

const client = new Prysm();
const resp = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a TypeScript quicksort" }],
});
console.log(resp.choices[0].message.content);

curl https://api.prysm1.com/v1/chat/completions \
  -H "Authorization: Bearer $PRYSM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{ "role": "user", "content": "Write a Python quicksort" }]
  }'

{
  "id": "prysm-a1b2c3d4",
  "object": "chat.completion",
  "created": 1780424641,
  "model": "deepseek-v4-flash",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "def quicksort(a): ..." },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 18, "completion_tokens": 240, "total_tokens": 258 },
  "prysm": {
    "routing": {
      "mode": "balanced",
      "model_display": "DeepSeek V4 Flash",
      "provider": "deepseek",
      "reason": "Code: 95% cheaper than GPT-5.2",
      "tier": "budget",
      "signals_detected": { "code": true }
    },
    "cost": { "input_usd": 0.00000504, "output_usd": 0.0001008, "total_usd": 0.00010584 },
    "latency_ms": 740,
    "proof": {
      "request_id": "b1e7c0d2-3f4a-5b6c-7d8e-9f0a1b2c3d4e",
      "timestamp": "2026-06-02T18:24:01.123456+00:00",
      "proof_hash": "sha256:a1b2c3d4e5f6a7b8",
      "model": "DeepSeek V4 Flash",
      "provider": "deepseek",
      "mode": "balanced",
      "reason": "Code: 95% cheaper than GPT-5.2",
      "verifiable": true
    },
    "fallback": false,
    "fallback_from": null,
    "fallback_to": null
  }
}

POST https://api.prysm1.com/v1/chat/completions · Requires authentication

The main endpoint. Send OpenAI-shaped messages with model: "auto" and PRYSM routes the request to the best-value model, returning a standard completion plus a prysm block.

Authorization

Authorization · string · required

Your secret key as a bearer token: Bearer prysm_sk_...
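For callers not using the SDK, the bearer header can be attached by hand. The following is a minimal sketch using only the Python standard library; the helper name build_request is hypothetical, and the request is prepared but not sent. The URL, the header names, and the PRYSM_API_KEY variable come from the cURL example.

```python
import json
import os
import urllib.request

API_URL = "https://api.prysm1.com/v1/chat/completions"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated POST request.

    build_request is a hypothetical helper for illustration,
    not part of the PRYSM SDK.
    """
    if not api_key:
        raise ValueError("missing API key")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The secret key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Read the key from the environment, as the cURL example does.
    key = os.environ.get("PRYSM_API_KEY", "")
    if key:
        req = build_request(
            key,
            {
                "model": "auto",
                "messages": [{"role": "user", "content": "Write a Python quicksort"}],
            },
        )
        print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.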

Body

messages · object[] · required

The conversation so far. Each message has a role (system, user, or assistant) and content (string).

model · string · default: "auto"

"auto" to let PRYSM choose, or any catalog model ID to pin the request to a specific model. Unknown IDs fall back to a safe budget default.

integer · default: 1000

Maximum number of tokens to generate in the completion.

number · default: 0.7

Sampling temperature between 0 and 2. Lower values are more deterministic.

boolean · default: false

Reserved for streaming responses.

string

Force a routing mode: quality, balanced, or agility. Omit to let PRYSM choose. PRYSM-specific.

object

An inline BRAIN.md config (normalized object) to apply to this request: rules, max_cost, blocked, fallback, and more. PRYSM-specific.
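The constraints described above (allowed roles, the 0 to 2 temperature range, the three routing modes) can be checked on the client before a request is sent. This is a rough sketch, not the SDK's own validation: build_body is a hypothetical helper, and the keys "temperature" and "mode" are assumptions borrowed from the OpenAI request shape, since only "model" and "messages" appear in the examples on this page.

```python
VALID_ROLES = {"system", "user", "assistant"}
VALID_MODES = {"quality", "balanced", "agility"}


def build_body(messages, model="auto", temperature=None, mode=None):
    """Build a request body, rejecting values the reference rules out.

    Hypothetical helper. "temperature" and "mode" are ASSUMED key names;
    only "model" and "messages" are confirmed by the examples.
    """
    # Local sanity check: messages is a required field.
    if not messages:
        raise ValueError("messages is required")
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"invalid role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError("content must be a string")

    body = {"model": model, "messages": list(messages)}

    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        body["temperature"] = temperature  # assumed key name

    if mode is not None:
        if mode not in VALID_MODES:
            raise ValueError(f"invalid mode: {mode!r}")
        body["mode"] = mode  # assumed key name

    return body
```

Optional fields are omitted from the body when not set, which leaves the documented defaults and PRYSM's own routing choice in effect.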

Response

Returns an OpenAI-compatible chat completion object with an added prysm block.

id · string

Unique completion ID, prefixed prysm-.

object · string

Always chat.completion.

created · integer

Unix timestamp (seconds) when the completion was created.

model · string

The catalog ID of the model that actually ran.

choices · object[]

The generated choices.

choices[].index · integer

Position in the list.

choices[].message · object

The assistant message: { "role": "assistant", "content": "..." }.

choices[].finish_reason · string

Why generation stopped, e.g. stop.

usage · object

Token accounting: prompt_tokens, completion_tokens, total_tokens.

prysm · object

The PRYSM extension block: routing decision, cost, latency, and proof.

prysm.routing · object

prysm.routing.mode · string

quality, balanced, agility, or direct.

prysm.routing.model_display · string

Human-readable model name.

prysm.routing.provider · string

Upstream provider that served the call.

prysm.routing.reason · string

Why this model was chosen.

prysm.routing.tier · string

budget, mid, premium, or frontier.

prysm.routing.signals_detected · object

Intent signals detected in the prompt.

prysm.cost · object

prysm.cost.input_usd · number

Cost of input tokens, USD.

prysm.cost.output_usd · number

Cost of output tokens, USD.

prysm.cost.total_usd · number

Total request cost, USD.

prysm.latency_ms · integer

End-to-end latency in milliseconds.

prysm.proof · object

The PrysmProof receipt: request_id, timestamp, proof_hash, model, provider, mode, reason, verifiable.

prysm.fallback · boolean

true if the intended model was unavailable and a fallback served the call.

prysm.fallback_from · string | null

Intended model, if a fallback occurred.

prysm.fallback_to · string | null

Model that actually served the call, if a fallback occurred.
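The response fields above can be read as plain JSON. The sketch below works on a decoded response dict and shows two things: reporting which model served the call (taking the fallback fields into account) and checking that the token and cost totals add up. Both helper names are hypothetical. That total_usd equals input_usd plus output_usd is an assumption that holds for the sample response but is not stated as a guarantee.

```python
import math


def served_model(resp: dict) -> str:
    """Return the model that served the call, noting any fallback."""
    prysm = resp["prysm"]
    if prysm["fallback"]:
        return f'{prysm["fallback_to"]} (fallback from {prysm["fallback_from"]})'
    return resp["model"]


def accounting_is_consistent(resp: dict) -> bool:
    """Check that token counts and costs sum to their reported totals.

    ASSUMPTION: total_usd == input_usd + output_usd, as in the sample.
    """
    usage = resp["usage"]
    cost = resp["prysm"]["cost"]
    tokens_ok = (
        usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"]
    )
    # Compare floats with a tolerance rather than exact equality.
    cost_ok = math.isclose(
        cost["input_usd"] + cost["output_usd"],
        cost["total_usd"],
        rel_tol=1e-9,
    )
    return tokens_ok and cost_ok


# Trimmed copy of the sample response from this page.
sample = {
    "model": "deepseek-v4-flash",
    "usage": {"prompt_tokens": 18, "completion_tokens": 240, "total_tokens": 258},
    "prysm": {
        "cost": {
            "input_usd": 0.00000504,
            "output_usd": 0.0001008,
            "total_usd": 0.00010584,
        },
        "fallback": False,
        "fallback_from": None,
        "fallback_to": None,
    },
}

if __name__ == "__main__":
    print(served_model(sample))
    print(accounting_is_consistent(sample))
```

Because the extension lives entirely under the prysm key, code written against the OpenAI response shape can ignore it and keep reading choices and usage as before.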

