Errors
All gateway errors are returned as JSON, in one of the envelope shapes described below. Understanding these shapes and error codes helps with debugging and building resilient client integrations.
Error Response Format
The gateway emits four distinct error envelope shapes depending on which middleware or route handler produced the error. The shape you observe depends on whether the failure is a validation, authentication, or rate-limit issue (caught by the global handler), a runtime gateway error (caught by the route’s try/catch, which has separate OpenAI-format and Anthropic-format shapes), or a capability-scope rejection (emitted inline by the capability guard).
Canonical Envelope (Global Error Handler)
Used by validation failures, authentication failures, rate-limit rejections, budget rejections, guard rejections, and any AppError propagated via next(err). This is the shape clients should expect for the majority of 4xx responses across all endpoints, including /v1/messages:
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Detailed error description",
    "details": { "additional": "context" }
  }
}

The details object varies by error code (e.g. rate-limit errors include level, retryAfter, limit, windowMs; validation errors include issues[]). In production, sensitive fields (userId, permissions, action, resource, stack) are stripped from details before the response is sent.
Inline Route Envelope — OpenAI-format Endpoints (Runtime Errors)
/v1/chat/completions and /v1/completions catch runtime gateway errors inside the route handler and emit a minimal OpenAI-style envelope without a code field:
{
  "error": {
    "message": "Detailed error message",
    "type": "server_error"
  }
}

The legacy /v1/completions endpoint omits the code entirely. Other OpenAI-shape endpoints include an AppError code when one is available, but the shape is still distinct from the global handler: type is set and details is absent.
Inline Route Envelope — Anthropic-format Endpoint (Runtime Errors)
/v1/messages catches runtime gateway errors and emits the Anthropic-shape envelope:
{
  "type": "error",
  "error": {
    "type": "server_error",
    "message": "Detailed error description"
  }
}

Known inconsistency: validation failures on /v1/messages (HTTP 400 from Zod) flow through the global error handler and therefore use the canonical {error: {code, message, details}} envelope, not the Anthropic shape above. Anthropic-format clients that depend strictly on {type: "error", error: {...}} need to handle both shapes for this endpoint.
Capability-Guard Envelope
The capability guard middleware emits its own shape, also without a code field, when an API key lacks the required capability:
{
  "error": {
    "message": "API key missing 'embeddings' capability",
    "type": "insufficient_permissions"
  }
}

HTTP status is always 403 for this envelope.
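Because all four envelopes keep the human-readable text under error.message, a client can fold them into one structure before branching on the result. The following is a minimal sketch of that idea, not part of the gateway itself; the GatewayError class and the normalize_error function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GatewayError:
    """Common client-side view over the four documented envelopes (hypothetical)."""
    message: str
    code: Optional[str] = None   # present in the canonical envelope, sometimes inline
    type: Optional[str] = None   # present in the inline and capability-guard envelopes
    details: dict = field(default_factory=dict)  # canonical envelope only


def normalize_error(body: dict) -> GatewayError:
    """Map a parsed JSON error body onto GatewayError."""
    err = body.get("error")
    if not isinstance(err, dict):
        # Unrecognised shape: keep the raw body as the message so nothing is lost.
        return GatewayError(message=str(body))
    return GatewayError(
        message=err.get("message", ""),
        code=err.get("code"),
        type=err.get("type"),
        details=err.get("details") or {},
    )
```

Callers can then check code first and fall back to type, which covers the /v1/messages case where either the canonical or the Anthropic shape may arrive.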
HTTP Status Codes
| Status | Meaning | When It Occurs |
|---|---|---|
| 400 | Bad Request | Invalid request body, missing required fields, or validation failure. |
| 401 | Unauthorized | Missing, invalid, expired, or revoked authentication credentials. |
| 403 | Forbidden | Valid credentials but insufficient permissions, or a suspended tenant or account. |
| 404 | Not Found | Requested model or resource does not exist. |
| 409 | Conflict | Resource already exists (e.g. duplicate tenant slug). |
| 410 | Gone | A referenced file or other expiring resource has expired and is no longer available. |
| 413 | Payload Too Large | The request body exceeds the endpoint’s size limit. |
| 415 | Unsupported Media Type | The request Content-Type is not accepted by the endpoint. |
| 422 | Unprocessable Entity | Request blocked by a guard rule (token limit, PII, injection, toxicity). |
| 423 | Locked | Account is locked. |
| 429 | Too Many Requests | Rate limit or budget limit exceeded. |
| 500 | Internal Server Error | Unexpected server error. |
| 502 | Bad Gateway | Upstream provider returned an error or all providers failed. |
| 503 | Service Unavailable | No provider available to handle the request. |
| 504 | Gateway Timeout | Upstream provider did not respond within the timeout. |
Error Code Reference
Authentication Errors (1xxx)
| Code | HTTP Status | Description |
|---|---|---|
| AUTH_REQUIRED | 401 | No authentication credentials were provided. Include an Authorization: Bearer <token> header. |
| AUTH_INVALID_TOKEN | 401 | The JWT token is malformed, has an invalid signature, or is missing the jti claim. |
| AUTH_TOKEN_EXPIRED | 401 | The JWT token has expired. Obtain a new access token. |
| AUTH_INVALID_API_KEY | 401 | The API key was not found. Verify the key is correct. |
| AUTH_API_KEY_EXPIRED | 401 | The API key has passed its expiresAt date. Create a new key. |
| AUTH_API_KEY_REVOKED | 401 | The API key has been revoked by an administrator. |
| AUTH_FORBIDDEN | 403 | The authenticated user or API key lacks the required capability or permission. |
| AUTH_ACCOUNT_LOCKED | 423 | The user account is locked (e.g. too many failed login attempts). |
| AUTH_ACCOUNT_SUSPENDED | 403 | The user account has been suspended. |
| AUTH_MFA_REQUIRED | 403 | The user account requires multi-factor authentication; complete the MFA challenge before retrying. |
| AUTH_INVALID_CREDENTIALS | 401 | The supplied email or password is incorrect. |
| AUTH_PASSWORD_POLICY | 400 | The submitted password does not meet the configured complexity or length requirements. |
| AUTH_PASSWORD_REUSED | 400 | The submitted password matches one of the user’s recent passwords and cannot be reused. |
Tenant Errors (2xxx)
| Code | HTTP Status | Description |
|---|---|---|
| TENANT_NOT_FOUND | 404 | The tenant associated with the API key or user could not be found. |
| TENANT_SUSPENDED | 403 | The tenant has been suspended by an administrator. |
| TENANT_SLUG_EXISTS | 409 | A tenant with this slug already exists. |
Gateway Errors (3xxx)
| Code | HTTP Status | Description |
|---|---|---|
| GATEWAY_INVALID_FORMAT | 400 | The request body does not match the expected schema for the endpoint. |
| GATEWAY_NO_PROVIDER | 503 | No provider is configured or available to handle the requested model or capability. |
| GATEWAY_ALL_PROVIDERS_FAILED | 502 | The gateway tried all available providers and they all returned errors. The error message typically includes details from the last provider attempt. |
| GATEWAY_PROVIDER_ERROR | 502 | The upstream provider returned an error. The message includes the provider’s error details. |
| GATEWAY_TIMEOUT | 504 | The upstream provider did not respond within the configured timeout period. |
| GATEWAY_MODEL_NOT_FOUND | 404 | The requested model is not configured in the gateway. Check /v1/models for available models. |
| GATEWAY_CAPABILITY_NOT_SUPPORTED | 400 | The requested capability (e.g. images) is not supported by the resolved model or provider. |
| GATEWAY_ROUTING_CONFIG_NOT_FOUND | 404 | A routing:<slug> model alias referenced a routing config that does not exist for this tenant. |
| GATEWAY_ROUTING_CONFIG_MISMATCH | 400 | A routing:<slug> model alias referenced a routing config whose capability does not match the endpoint being called. |
Guard Errors (4xxx)
These errors are returned when a content safety guard blocks the request.
| Code | HTTP Status | Description |
|---|---|---|
| GUARD_TOKEN_LIMIT | 422 | The request exceeds the configured token limit guard. |
| GUARD_COST_LIMIT | 422 | The estimated cost exceeds the configured cost limit guard. |
| GUARD_INJECTION_DETECTED | 422 | A prompt injection attempt was detected in the input. |
| GUARD_PII_DETECTED | 422 | Personally identifiable information was detected in the request. |
| GUARD_CONTENT_FILTERED | 422 | The content was blocked by a content filter rule. |
| GUARD_TOXICITY_DETECTED | 422 | Toxic or harmful content was detected. |
| GUARD_CUSTOM_RULE | 422 | A custom guard rule was triggered. |
Budget Errors (5xxx)
| Code | HTTP Status | Description |
|---|---|---|
| BUDGET_EXCEEDED | 429 | The tenant or API key has exceeded its configured budget. Contact your administrator. |
Rate Limit Errors (6xxx)
| Code | HTTP Status | Description |
|---|---|---|
| RATE_LIMIT_EXCEEDED | 429 | Too many requests. The rate limit is enforced per API key or per tenant using a sliding window. Retry after the period indicated. |
Validation Errors (7xxx)
| Code | HTTP Status | Description |
|---|---|---|
| VALIDATION_ERROR | 400 | Request body failed schema validation. The message includes details about which fields are invalid. |
| NOT_FOUND | 404 | The requested resource was not found. |
| CONFLICT | 409 | A conflicting resource already exists. |
| PAYLOAD_TOO_LARGE | 413 | The request body exceeds the endpoint’s size limit (e.g. file uploads over 512 MiB, or oversized JSON bodies). |
| UNSUPPORTED_MEDIA_TYPE | 415 | The request Content-Type is not accepted by the endpoint. |
| INVALID_JSON | 400 | The request body could not be parsed as JSON. Emitted by the global error handler when Express’s body parser throws a SyntaxError. |
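For VALIDATION_ERROR, the canonical envelope carries an issues[] array in details. This document does not spell out the shape of each entry; the sketch below assumes each one looks like a Zod issue, with a path array and a message string, and turns them into one readable line per field. The format_issues helper is a hypothetical name, and the field names should be checked against real responses.

```python
def format_issues(error: dict) -> list:
    """Render details.issues from a canonical error object as 'path: message' lines.

    Assumption: each issue has Zod-style 'path' (list of keys/indexes) and 'message'.
    """
    issues = (error.get("details") or {}).get("issues") or []
    lines = []
    for issue in issues:
        # Join path segments such as ["messages", 0, "role"] into "messages.0.role".
        path = ".".join(str(part) for part in issue.get("path", [])) or "(root)"
        lines.append(f"{path}: {issue.get('message', 'invalid')}")
    return lines
```

The argument is the inner error object, that is, the value under the top-level error key.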
File, Batch, Vector Store, Response & Realtime Errors (8xxx)
These errors are returned by the Phase E gateway surfaces (/v1/files, /v1/batches, /v1/vector_stores, /v1/responses, /v1/realtime).
| Code | HTTP Status | Description |
|---|---|---|
| FILE_NOT_FOUND | 404 | The referenced gateway file ID does not exist for this tenant. |
| FILE_PROVIDER_MISMATCH | 400 | The file (or a previous response) belongs to a different provider than the request targets. Also returned when attempting to continue a Responses thread across providers. |
| FILE_EXPIRED | 410 | The file has expired and is no longer available. |
| PROVIDER_NOT_CONFIGURED | 400 | The provider required to service the request is not configured for this tenant. |
| BATCH_NOT_FOUND | 404 | The referenced batch job ID does not exist for this tenant. |
| VECTOR_STORE_NOT_FOUND | 404 | The referenced vector store ID does not exist for this tenant. |
| RESPONSE_NOT_FOUND | 404 | The referenced response session ID does not exist for this tenant. |
| REALTIME_SESSION_NOT_FOUND | 404 | The referenced realtime session ID does not exist for this tenant. |
| REALTIME_TICKET_INVALID | 401 | The realtime WebSocket ticket is missing, already used, or expired. The WebSocket is closed with code 4401. |
System Errors (9xxx)
| Code | HTTP Status | Description |
|---|---|---|
| INTERNAL_ERROR | 500 | An unexpected internal error occurred. In production, error details are hidden. |
| SERVICE_UNAVAILABLE | 503 | The gateway service is temporarily unavailable (e.g. during startup or maintenance). |
Handling Errors
Retry Strategy
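For transient errors, implement exponential backoff with jitter. Note that a 429 with code BUDGET_EXCEEDED is a likely exception: its description directs clients to contact an administrator, so retrying alone is unlikely to help. The transient statuses are: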
- 429 (Rate Limit) — respect the rate limit window and retry after a delay.
- 502 (Bad Gateway) — the provider may be temporarily unavailable; retry with backoff.
- 503 (Service Unavailable) — no provider is available; retry after a short delay.
- 504 (Gateway Timeout) — the provider was too slow; retry, possibly with a simpler request.
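The backoff recommended above can be sketched as a small delay calculator. This is one common variant ("full jitter": a uniform random delay between zero and a capped exponential ceiling), not a scheme prescribed by the gateway. The backoff_delay name and the base and cap defaults are illustrative assumptions. The retry_after parameter is meant for a server hint such as details.retryAfter on rate-limit errors; its unit is not stated in this document, so the caller is assumed to convert it to seconds first.

```python
import random
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # A server-provided hint (assumed to be in seconds) wins over the computed delay.
        return max(0.0, float(retry_after))
    # Exponential growth, capped, then full jitter: a uniform draw in [0, ceiling).
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

A retry loop would sleep for backoff_delay(attempt) seconds between attempts and give up after a fixed number of tries. The rng parameter exists only so the jitter can be made deterministic in tests.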
Non-Retryable Errors
Do not retry these errors without changing the request:
- 400 — fix the request body.
- 401 — fix or refresh authentication credentials.
- 403 — check permissions and capabilities.
- 404 — verify the model name or resource path.
- 422 — the request was blocked by a guard; modify the content.
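The two lists above can be folded into a single predicate that a client consults before retrying. This is a sketch under stated assumptions: should_retry is a hypothetical helper, statuses that appear in neither list (for example 409, 410, or 500) default to "do not retry", and BUDGET_EXCEEDED is treated as non-retryable on the assumption that an exhausted budget does not clear on its own.

```python
from typing import Optional

# Statuses listed under Retry Strategy.
RETRYABLE_STATUSES = frozenset({429, 502, 503, 504})


def should_retry(status: int, code: Optional[str] = None) -> bool:
    """Return True if repeating the same request unchanged might succeed."""
    if code == "BUDGET_EXCEEDED":
        # Assumption: a budget rejection shares status 429 but is not transient.
        return False
    # Everything else, including statuses in neither list, is not retried.
    return status in RETRYABLE_STATUSES
```

Passing the error code is optional because the inline and capability-guard envelopes may not include one.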
Streaming Errors
When an error occurs during a streaming response (stream: true), the error is sent as a final SSE event:
OpenAI format:
data: {"error":{"message":"An internal error occurred","type":"server_error"}}

Anthropic format:
event: error
data: {"type":"error","error":{"type":"server_error","message":"An internal error occurred"}}

In production mode, internal error details are redacted and replaced with a generic message.
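A streaming client therefore has to inspect each SSE event for an error payload instead of relying on the HTTP status alone. The sketch below is a minimal line-based SSE reader plus a check that recognises both formats shown above. The function names are hypothetical, and the reader handles only the event and data fields, which is all these examples use.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) pairs from raw SSE lines; a blank line ends an event."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    if data:  # the stream ended without a trailing blank line
        yield event, "\n".join(data)


def stream_error(event: Optional[str], data: str) -> Optional[dict]:
    """Return the inner error object if this SSE event carries one, else None."""
    try:
        payload = json.loads(data)
    except json.JSONDecodeError:
        return None  # non-JSON payload, e.g. an OpenAI-style [DONE] sentinel
    if not isinstance(payload, dict):
        return None
    err = payload.get("error")
    # Both documented formats nest the error under an "error" key.
    return err if isinstance(err, dict) else None
```

The event name is accepted but not required, since the OpenAI-format error arrives as a plain data line while the Anthropic-format error is tagged with event: error.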