OpenAI-Compatible

WebWright API

A drop-in replacement for the OpenAI chat completions API. Point your client at the WebWright base URL, swap in a WebWright API key, and start shipping with Wren R1, which is 2x faster and ~88% cheaper than Claude Sonnet 4.6.

Version: 2026-01-01
Base URL: https://webwright.ai/v1

Authentication

All API requests require a valid API key sent via the Authorization header using the Bearer scheme. API keys use the format www_v1_[hash]. Generate your key from the WebWright dashboard.

Request header

Authorization: Bearer www_v1_YOUR_API_KEY

cURL example

curl https://webwright.ai/v1/models \
  -H "Authorization: Bearer www_v1_YOUR_API_KEY"

Note: API keys are tied to your account and plan. Requests without a valid key receive a 401 Unauthorized response.
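
To make the header format concrete, here is a minimal Python sketch that builds the Authorization header and runs a cheap local check against the documented `www_v1_` prefix. The helper name `build_auth_headers` is hypothetical and not part of any WebWright SDK. The prefix check only catches obvious mistakes such as pasting a key from another provider; it cannot tell whether the server will accept the key.

```python
KEY_PREFIX = "www_v1_"  # documented key format: www_v1_[hash]


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return headers using the Bearer scheme (hypothetical helper)."""
    key = api_key.strip()
    # Reject keys that lack the prefix or have nothing after it.
    if not key.startswith(KEY_PREFIX) or len(key) == len(KEY_PREFIX):
        raise ValueError("API key should look like www_v1_[hash]")
    return {"Authorization": f"Bearer {key}"}


if __name__ == "__main__":
    print(build_auth_headers("www_v1_YOUR_API_KEY"))
```

In real code the key would normally come from an environment variable or a secrets manager rather than a string literal.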

Endpoints

The WebWright API mirrors the OpenAI API surface. The following endpoints are available.

GET /v1/models

List the available models. Returns a single model entry for the currently deployed inference model.

Response schema

{
  "object": "list",
  "data": [
    {
      "id": "wren-r1",
      "object": "model",
      "created": 1719014400,
      "owned_by": "webwright"
    }
  ]
}

cURL example

curl https://webwright.ai/v1/models \
  -H "Authorization: Bearer www_v1_YOUR_API_KEY"
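
As an illustration of consuming the response schema above, the following sketch extracts the model IDs from a `/v1/models` response body. It works on a JSON string so it can run without network access. The function name `model_ids` and the `SAMPLE_BODY` constant are illustrative only.

```python
import json

# Sample body copied from the documented response schema.
SAMPLE_BODY = """
{
  "object": "list",
  "data": [
    {"id": "wren-r1", "object": "model",
     "created": 1719014400, "owned_by": "webwright"}
  ]
}
"""


def model_ids(body: str) -> list[str]:
    """Return the `id` of every entry in a /v1/models response body."""
    payload = json.loads(body)
    if payload.get("object") != "list":
        raise ValueError("unexpected response shape: expected object == 'list'")
    return [entry["id"] for entry in payload.get("data", [])]


if __name__ == "__main__":
    print(model_ids(SAMPLE_BODY))
```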

POST /v1/chat/completions

Create a chat completion. Accepts the standard OpenAI chat payload. Supports both buffered (JSON) and streaming (SSE) responses.

Request body

{
  "model": "wren-r1",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "stream": false
}

Response (non-streaming)

{
  "id": "chatcmpl-9a8b7c6d5e4f3a2b1c0d9e8f7a6b5c4d",
  "object": "chat.completion",
  "created": 1719014400,
  "model": "wren-r1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "42 — but don't forget to ask the right question first."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 14,
    "total_tokens": 39
  }
}
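
A short sketch of reading the non-streaming response shown above: it pulls out the assistant text, the finish reason, and the token usage. The `Reply` dataclass and `parse_completion` function are hypothetical names chosen for this example, and the sample payload is a trimmed copy of the documented response.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    finish_reason: str
    total_tokens: int


def parse_completion(payload: dict) -> Reply:
    """Extract the first choice and token usage from a chat.completion object."""
    choice = payload["choices"][0]
    return Reply(
        text=choice["message"]["content"],
        finish_reason=choice["finish_reason"],
        total_tokens=payload["usage"]["total_tokens"],
    )


# Trimmed copy of the documented response, used as test input.
SAMPLE_RESPONSE = {
    "object": "chat.completion",
    "model": "wren-r1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "42"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 14, "total_tokens": 39},
}

if __name__ == "__main__":
    print(parse_completion(SAMPLE_RESPONSE))
```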

Models

WebWright currently serves a single production model. Additional models will be added over time.

Model ID	Description	Context	Input / 1M tokens	Output / 1M tokens
`wren-r1`	Primary inference model — 2x faster than Claude Sonnet 4.6	128K	$0.40	$1.60

Note: The model field in chat completion requests is currently advisory — the single deployed model is always used. This will change as additional models are added.
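
The per-token prices in the table can be turned into a rough cost estimate. The sketch below assumes billing is strictly linear in the `usage` token counts with no rounding, minimum charge, or plan discount, none of which this page specifies. The function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Prices from the models table, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.40")
OUTPUT_USD_PER_MILLION = Decimal("1.60")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one request, assuming purely linear pricing."""
    total = (
        prompt_tokens * INPUT_USD_PER_MILLION
        + completion_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / MILLION


if __name__ == "__main__":
    # Token counts taken from the example response's usage block.
    print(estimate_cost_usd(25, 14))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.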

Code Examples

The WebWright API works with the official OpenAI SDKs for the endpoints documented above. Change the base URL and API key; the rest of your chat completions code stays the same.

Python (openai SDK)

from openai import OpenAI

client = OpenAI(
  base_url="https://webwright.ai/v1",
  api_key="www_v1_YOUR_API_KEY",
)

completion = client.chat.completions.create(
  model="wren-r1",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a haiku about APIs."},
  ],
  max_tokens=256,
  temperature=0.7,
)

print(completion.choices[0].message.content)

TypeScript (openai SDK)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://webwright.ai/v1",
  apiKey: "www_v1_YOUR_API_KEY",
});

const completion = await client.chat.completions.create({
  model: "wren-r1",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Write a haiku about APIs." },
  ],
  max_tokens: 256,
  temperature: 0.7,
});

console.log(completion.choices[0].message.content);

Python — Streaming

from openai import OpenAI

client = OpenAI(
  base_url="https://webwright.ai/v1",
  api_key="www_v1_YOUR_API_KEY",
)

stream = client.chat.completions.create(
  model="wren-r1",
  messages=[
    {"role": "user", "content": "Count from 1 to 5."},
  ],
  stream=True,
)

for chunk in stream:
  delta = chunk.choices[0].delta.content or ""
  print(delta, end="", flush=True)
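
For clients that do not use the SDK, the stream can be read directly. The sketch below assumes WebWright uses the same SSE framing as OpenAI: each event is a `data:` line carrying a JSON chunk with `choices[0].delta`, and the stream ends with `data: [DONE]`. This page implies that framing by mirroring the OpenAI API but does not spell it out, so treat it as an assumption. `iter_content` and `SAMPLE_STREAM` are illustrative names.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (assumed framing)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written lines standing in for a real response body.
SAMPLE_STREAM = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "1, 2"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": ", 3"}}]}',
    "",
    "data: [DONE]",
]

if __name__ == "__main__":
    print("".join(iter_content(SAMPLE_STREAM)))
```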

Error Codes

Errors follow the OpenAI error schema and include a machine-readable type field.

Status	Error Type	Description
400	`invalid_request_error`	Malformed request body or invalid parameters.
401	`invalid_request_error`	Missing or invalid API key.
402	`insufficient_quota`	Account has exceeded its token quota or has an unpaid overage.
429	`rate_limit_exceeded`	Too many requests. Respect the `Retry-After` header.
500	`api_error`	Internal server error. Retry with exponential backoff.

Error response shape

{
  "error": {
    "code": null,
    "message": "Insufficient quota. Please upgrade your plan.",
    "param": null,
    "type": "insufficient_quota"
  }
}
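
One way to handle the error shape above is to parse it into a small object and classify it by HTTP status. In this sketch, treating 429 and 500 as retryable follows the descriptions in the error table; the `ApiError` class, the `parse_error` function, and the fallback value `"unknown"` are hypothetical choices for the example.

```python
import json
from dataclasses import dataclass

# Statuses the error table says to retry (429 after Retry-After, 500 with backoff).
RETRYABLE_STATUSES = {429, 500}


@dataclass
class ApiError:
    status: int
    type: str
    message: str

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and an error response body."""
    error = json.loads(body).get("error", {})
    return ApiError(
        status=status,
        type=error.get("type", "unknown"),
        message=error.get("message", ""),
    )


SAMPLE_ERROR = (
    '{"error": {"code": null, "message": "Insufficient quota. '
    'Please upgrade your plan.", "param": null, "type": "insufficient_quota"}}'
)

if __name__ == "__main__":
    err = parse_error(402, SAMPLE_ERROR)
    print(err.type, err.retryable)
```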

Rate Limiting

Rate limits are enforced per API key and are based on your plan tier. When a rate limit is exceeded, the API returns a 429 Too Many Requests response with a Retry-After header indicating the number of seconds to wait before retrying.

Plan	Rate Limit (RPM = requests per minute)	Burst (concurrent requests)
Hobby	10 RPM	2 concurrent
Pro	500 RPM	20 concurrent
Enterprise	Custom	Custom
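
A minimal sketch of choosing how long to wait before a retry: it honors `Retry-After` when the header is present and numeric, and otherwise falls back to exponential backoff with jitter. The function name `retry_delay`, the 1-second base, the 30-second cap, and the jitter range are assumptions for illustration, not values published by WebWright.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # The header is documented as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number: fall back to backoff below
    # Exponential backoff capped at `cap`, scaled by jitter in [0.5, 1.0].
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)


if __name__ == "__main__":
    print(retry_delay(0, retry_after="7"))
    print(retry_delay(3))
```

The jitter spreads out retries from many clients so they do not all hit the API again at the same instant.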