OpenAI-Compatible

WebWright API

A drop-in replacement for the OpenAI chat completions API. Change one line of code and start shipping with Wren R1 — 2x faster and ~88% cheaper than Claude Sonnet 4.6.

Version: 2026-01-01
Base URL: https://webwright.ai/v1

Authentication

All API requests require a valid API key sent via the Authorization header using the Bearer scheme. API keys use the format www_v1_[hash]. Generate your key from the WebWright dashboard.

Request header
Authorization: Bearer www_v1_YOUR_API_KEY
cURL example
curl https://webwright.ai/v1/models \
  -H "Authorization: Bearer www_v1_YOUR_API_KEY"
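
Python example (standard library)

The same authenticated request can be sketched in Python without any third-party packages. This is a minimal illustration, not an official client: the helper names (auth_headers, build_models_request, list_models) are hypothetical, and the prefix check only reflects the www_v1_ key format described above.

```python
import urllib.request

BASE_URL = "https://webwright.ai/v1"


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header using the Bearer scheme."""
    # Assumption: every valid key starts with the documented "www_v1_" prefix.
    if not api_key.startswith("www_v1_"):
        raise ValueError("expected an API key starting with 'www_v1_'")
    return {"Authorization": f"Bearer {api_key}"}


def build_models_request(api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET /v1/models request."""
    return urllib.request.Request(f"{BASE_URL}/models", headers=auth_headers(api_key))


def list_models(api_key: str) -> str:
    """Send the request and return the raw JSON response body."""
    with urllib.request.urlopen(build_models_request(api_key)) as resp:
        return resp.read().decode("utf-8")
```

Keeping the header construction in one function makes it easy to load the key from a secret store or environment variable rather than hard-coding it.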

Endpoints

The WebWright API mirrors the OpenAI API surface. The following endpoints are available.

GET /v1/models

List the available models. Returns a single model entry for the currently deployed inference model.

Response schema
{
  "object": "list",
  "data": [
    {
      "id": "wren-r1",
      "object": "model",
      "created": 1719014400,
      "owned_by": "webwright"
    }
  ]
}
cURL example
curl https://webwright.ai/v1/models \
  -H "Authorization: Bearer www_v1_YOUR_API_KEY"

POST /v1/chat/completions

Create a chat completion. Accepts the standard OpenAI chat payload. Supports both buffered (JSON) and streaming (SSE) responses.

Request body
{
  "model": "wren-r1",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "stream": false
}
Response (non-streaming)
{
  "id": "chatcmpl-9a8b7c6d5e4f3a2b1c0d9e8f7a6b5c4d",
  "object": "chat.completion",
  "created": 1719014400,
  "model": "wren-r1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "42 — but don't forget to ask the right question first."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 14,
    "total_tokens": 39
  }
}
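
Response (streaming), parsing sketch

The streaming wire format is not shown on this page. The sketch below assumes it follows the OpenAI SSE convention: each event is a line of the form data: {json} carrying a chunk with choices[0].delta.content, and the stream ends with data: [DONE]. The function name iter_sse_content and the sample lines are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta


# Hand-written sample events, not captured from the live API.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "4"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "2"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))  # prints 42
```

In practice the openai SDK handles this parsing when stream is set to true, so a hand-rolled parser is mainly useful for clients that cannot use the SDK.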

Models

WebWright currently serves a single production model. Additional models will be added over time.

Model ID | Description                                               | Context | Input / 1M tokens | Output / 1M tokens
wren-r1  | Primary inference model, 2x faster than Claude Sonnet 4.6 | 128K    | $0.40             | $1.60
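
Cost estimate example

As a worked example of the pricing above, the following sketch estimates the cost of a single request from the usage object in a response. It assumes billing is strictly linear per token with no minimum charge or rounding, which this page does not confirm; estimate_cost_usd is a hypothetical helper.

```python
from decimal import Decimal

# Prices from the table above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("0.40")
OUTPUT_USD_PER_MTOK = Decimal("1.60")


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate request cost, assuming linear per-token pricing."""
    million = Decimal(1_000_000)
    total = prompt_tokens * INPUT_USD_PER_MTOK + completion_tokens * OUTPUT_USD_PER_MTOK
    return total / million


# Usage block taken from the non-streaming response example.
usage = {"prompt_tokens": 25, "completion_tokens": 14, "total_tokens": 39}
cost = estimate_cost_usd(usage["prompt_tokens"], usage["completion_tokens"])
print(cost)
```

For the sample response this works out to (25 x 0.40 + 14 x 1.60) / 1,000,000 = 0.0000324 USD. Decimal is used instead of float so that small per-request amounts add up without binary rounding drift.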

Code Examples

The WebWright API is fully compatible with the OpenAI SDK. Change the base URL and API key — everything else works identically.

Python (openai SDK)

from openai import OpenAI

client = OpenAI(
  base_url="https://webwright.ai/v1",
  api_key="www_v1_YOUR_API_KEY",
)

completion = client.chat.completions.create(
  model="wren-r1",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about APIs."},
  ],
  max_tokens=256,
  temperature=0.7,
)

print(completion.choices[0].message.content)

TypeScript (openai SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://webwright.ai/v1",
  apiKey: "www_v1_YOUR_API_KEY",
});

const completion = await client.chat.completions.create({
  model: "wren-r1",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Write a haiku about APIs." },
  ],
  max_tokens: 256,
  temperature: 0.7,
});

console.log(completion.choices[0].message.content);

Python — Streaming

from openai import OpenAI

client = OpenAI(
  base_url="https://webwright.ai/v1",
  api_key="www_v1_YOUR_API_KEY",
)

stream = client.chat.completions.create(
  model="wren-r1",
  messages=[
    {"role": "user", "content": "Count from 1 to 5."},
  ],
  stream=True,
)

for chunk in stream:
  delta = chunk.choices[0].delta.content or ""
  print(delta, end="", flush=True)

Error Codes

Errors follow the OpenAI error schema and include a machine-readable type field.

Status | Error Type            | Description
400    | invalid_request_error | Malformed request body or invalid parameters.
401    | invalid_request_error | Missing or invalid API key.
402    | insufficient_quota    | Account has exceeded its token quota or has an unpaid overage.
429    | rate_limit_exceeded   | Too many requests. Respect the Retry-After header.
500    | api_error             | Internal server error. Retry with exponential backoff.

Error response shape
{
  "error": {
    "code": null,
    "message": "Insufficient quota. Please upgrade your plan.",
    "param": null,
    "type": "insufficient_quota"
  }
}
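
Error handling example

One way a client might interpret the error shape above is sketched below. The retry classification is an interpretation of the table (429 and 500 are described as retryable; 400, 401, and 402 are treated as requiring a fix on the caller's side), and parse_error is a hypothetical helper rather than part of any SDK.

```python
import json

# Assumption: only the statuses whose descriptions mention retrying are retryable.
RETRYABLE_STATUSES = {429, 500}


def parse_error(status: int, body: str) -> dict:
    """Extract the error type and message, tolerating malformed bodies."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}
    return {
        "status": status,
        "type": err.get("type"),
        "message": err.get("message"),
        "retryable": status in RETRYABLE_STATUSES,
    }


body = (
    '{"error": {"code": null, "message": "Insufficient quota. Please upgrade your plan.",'
    ' "param": null, "type": "insufficient_quota"}}'
)
info = parse_error(402, body)
print(info["type"], info["retryable"])  # prints: insufficient_quota False
```

Branching on the type field rather than on the message text keeps the client stable if the wording of messages changes.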

Rate Limiting

Rate limits are enforced per API key and are based on your plan tier. When a rate limit is exceeded, the API returns a 429 Too Many Requests response with a Retry-After header indicating the number of seconds to wait before retrying.

Plan       | Rate Limit | Burst
Hobby      | 10 RPM     | 2 concurrent
Pro        | 500 RPM    | 20 concurrent
Enterprise | Custom     | Custom
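
Retry example

The retry behavior described above can be sketched as follows. This is an illustrative pattern, not official client code: send is a hypothetical callable returning (status, headers, body), the header lookup assumes the exact key "Retry-After", and the exponential backoff parameters for 500 responses are arbitrary choices.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry applies."""
    if status == 429:
        try:
            # Retry-After is documented as a number of seconds.
            return max(0.0, float(headers.get("Retry-After")))
        except (TypeError, ValueError):
            return base * 2 ** attempt  # header missing or unparsable
    if status == 500:
        return base * 2 ** attempt  # exponential backoff: 1s, 2s, 4s, ...
    return None


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        status, headers, _ = response
        delay = retry_delay(status, headers, attempt)
        if delay is None or attempt == max_attempts - 1:
            return response
        sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter makes the loop testable without real waiting. On the Hobby plan's 10 RPM limit, a steady client would also want to space requests roughly 6 seconds apart rather than rely on retries alone.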

Questions? Visit webwright.ai or see the FAQ.