Errors

Gateway errors are returned as JSON, but the envelope shape depends on which layer produced the error. Knowing the shapes and error codes helps with debugging and building resilient client integrations.

Error Response Format

The gateway emits four distinct error envelope shapes, depending on which middleware or route handler produced the error: the canonical envelope for validation, authentication, rate-limit, budget, and guard failures (caught by the global handler); two inline envelopes for runtime gateway errors (caught by the route’s try/catch), one for OpenAI-format endpoints and one for the Anthropic-format endpoint; and a capability-guard envelope for capability-scope rejections (emitted inline by the capability guard).

Canonical Envelope (Global Error Handler)

Used by validation failures, authentication failures, rate-limit rejections, budget rejections, guard rejections, and any AppError propagated via next(err). This is the shape clients should expect for the majority of 4xx responses across all endpoints, including /v1/messages:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Detailed error description",
    "details": { "additional": "context" }
  }
}

The details object varies by error code (e.g. rate-limit errors include level, retryAfter, limit, windowMs; validation errors include issues[]). In production, sensitive fields (userId, permissions, action, resource, stack) are stripped from details before the response is sent.

Inline Route Envelope — OpenAI-format Endpoints (Runtime Errors)

/v1/chat/completions and /v1/completions catch runtime gateway errors inside the route handler and emit a minimal OpenAI-style envelope. The minimal form has no code field:

{
  "error": {
    "message": "Detailed error message",
    "type": "server_error"
  }
}

The legacy /v1/completions endpoint always omits code. Other OpenAI-shape endpoints include an AppError code when one is available, but the shape still differs from the global handler’s: type is set and details is absent.

Inline Route Envelope — Anthropic-format Endpoint (Runtime Errors)

/v1/messages catches runtime gateway errors and emits the Anthropic-shape envelope:

{
  "type": "error",
  "error": {
    "type": "server_error",
    "message": "Detailed error description"
  }
}

Known inconsistency: validation failures on /v1/messages (HTTP 400 from Zod) flow through the global error handler and therefore use the canonical {error: {code, message, details}} envelope, not the Anthropic shape above. Anthropic-format clients that depend strictly on {type: "error", error: {...}} need to handle both shapes for this endpoint.

Capability-Guard Envelope

The capability guard middleware emits a fourth shape, with no code, when an API key lacks the required capability:

{
  "error": {
    "message": "API key missing 'embeddings' capability",
    "type": "insufficient_permissions"
  }
}

HTTP status is always 403 for this envelope.
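
Because a client may see any of the four envelopes above, it can help to fold them into one internal representation before branching on the error. The following is a minimal sketch, not part of the gateway or any SDK: the `GatewayError` class and `normalize_error` function are hypothetical names, and the code relies only on the fields shown in the samples above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class GatewayError:
    """Hypothetical client-side view over the four documented envelopes."""
    message: str
    code: Optional[str] = None   # canonical envelope; sometimes OpenAI-shape
    type: Optional[str] = None   # inline route and capability-guard envelopes
    details: dict = field(default_factory=dict)  # canonical envelope only
    anthropic_shape: bool = False  # body had top-level "type": "error"


def normalize_error(body: Any) -> Optional[GatewayError]:
    """Return a GatewayError for a documented envelope, or None otherwise."""
    if not isinstance(body, dict):
        return None
    err = body.get("error")
    if not isinstance(err, dict):
        return None
    details = err.get("details")
    return GatewayError(
        message=str(err.get("message", "")),
        code=err.get("code"),
        type=err.get("type"),
        details=details if isinstance(details, dict) else {},
        anthropic_shape=body.get("type") == "error",
    )
```

With this in place, a /v1/messages client can call `normalize_error` on every non-2xx body and handle both the canonical and the Anthropic shape through the same code path, checking `code` first and falling back to `type` when `code` is absent.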

HTTP Status Codes

Status	Meaning	When It Occurs
`400`	Bad Request	Invalid request body, missing required fields, or validation failure.
`401`	Unauthorized	Missing, invalid, expired, or revoked authentication credentials.
`403`	Forbidden	Valid credentials but insufficient permissions, or a suspended tenant or account.
`404`	Not Found	Requested model or resource does not exist.
`409`	Conflict	Resource already exists (e.g. duplicate tenant slug).
`410`	Gone	A referenced file or other expiring resource has expired and is no longer available.
`413`	Payload Too Large	The request body exceeds the endpoint’s size limit.
`415`	Unsupported Media Type	The request `Content-Type` is not accepted by the endpoint.
`422`	Unprocessable Entity	Request blocked by a guard rule (token limit, PII, injection, toxicity).
`423`	Locked	Account is locked.
`429`	Too Many Requests	Rate limit or budget limit exceeded.
`500`	Internal Server Error	Unexpected server error.
`502`	Bad Gateway	Upstream provider returned an error or all providers failed.
`503`	Service Unavailable	No provider available to handle the request.
`504`	Gateway Timeout	Upstream provider did not respond within the timeout.

Error Code Reference

Authentication Errors (1xxx)

Code	HTTP Status	Description
`AUTH_REQUIRED`	401	No authentication credentials were provided. Include an `Authorization: Bearer <token>` header.
`AUTH_INVALID_TOKEN`	401	The JWT token is malformed, has an invalid signature, or is missing the `jti` claim.
`AUTH_TOKEN_EXPIRED`	401	The JWT token has expired. Obtain a new access token.
`AUTH_INVALID_API_KEY`	401	The API key was not found. Verify the key is correct.
`AUTH_API_KEY_EXPIRED`	401	The API key has passed its `expiresAt` date. Create a new key.
`AUTH_API_KEY_REVOKED`	401	The API key has been revoked by an administrator.
`AUTH_FORBIDDEN`	403	The authenticated user or API key lacks the required capability or permission.
`AUTH_ACCOUNT_LOCKED`	423	The user account is locked (e.g. too many failed login attempts).
`AUTH_ACCOUNT_SUSPENDED`	403	The user account has been suspended.
`AUTH_MFA_REQUIRED`	403	The user account requires multi-factor authentication; complete the MFA challenge before retrying.
`AUTH_INVALID_CREDENTIALS`	401	The supplied email or password is incorrect.
`AUTH_PASSWORD_POLICY`	400	The submitted password does not meet the configured complexity or length requirements.
`AUTH_PASSWORD_REUSED`	400	The submitted password matches one of the user’s recent passwords and cannot be reused.

Tenant Errors (2xxx)

Code	HTTP Status	Description
`TENANT_NOT_FOUND`	404	The tenant associated with the API key or user could not be found.
`TENANT_SUSPENDED`	403	The tenant has been suspended by an administrator.
`TENANT_SLUG_EXISTS`	409	A tenant with this slug already exists.

Gateway Errors (3xxx)

Code	HTTP Status	Description
`GATEWAY_INVALID_FORMAT`	400	The request body does not match the expected schema for the endpoint.
`GATEWAY_NO_PROVIDER`	503	No provider is configured or available to handle the requested model or capability.
`GATEWAY_ALL_PROVIDERS_FAILED`	502	The gateway tried all available providers and they all returned errors. The error message typically includes details from the last provider attempt.
`GATEWAY_PROVIDER_ERROR`	502	The upstream provider returned an error. The message includes the provider’s error details.
`GATEWAY_TIMEOUT`	504	The upstream provider did not respond within the configured timeout period.
`GATEWAY_MODEL_NOT_FOUND`	404	The requested model is not configured in the gateway. Check `/v1/models` for available models.
`GATEWAY_CAPABILITY_NOT_SUPPORTED`	400	The requested capability (e.g. `images`) is not supported by the resolved model or provider.
`GATEWAY_ROUTING_CONFIG_NOT_FOUND`	404	A `routing:<slug>` model alias referenced a routing config that does not exist for this tenant.
`GATEWAY_ROUTING_CONFIG_MISMATCH`	400	A `routing:<slug>` model alias referenced a routing config whose capability does not match the endpoint being called.

Guard Errors (4xxx)

These errors are returned when a content safety guard blocks the request.

Code	HTTP Status	Description
`GUARD_TOKEN_LIMIT`	422	The request exceeds the configured token limit guard.
`GUARD_COST_LIMIT`	422	The estimated cost exceeds the configured cost limit guard.
`GUARD_INJECTION_DETECTED`	422	A prompt injection attempt was detected in the input.
`GUARD_PII_DETECTED`	422	Personally identifiable information was detected in the request.
`GUARD_CONTENT_FILTERED`	422	The content was blocked by a content filter rule.
`GUARD_TOXICITY_DETECTED`	422	Toxic or harmful content was detected.
`GUARD_CUSTOM_RULE`	422	A custom guard rule was triggered.

Budget Errors (5xxx)

Code	HTTP Status	Description
`BUDGET_EXCEEDED`	429	The tenant or API key has exceeded its configured budget. Contact your administrator.

Rate Limit Errors (6xxx)

Code	HTTP Status	Description
`RATE_LIMIT_EXCEEDED`	429	Too many requests. The rate limit is enforced per API key or per tenant using a sliding window. Retry after the period given by `retryAfter` in `details`.

Validation Errors (7xxx)

Code	HTTP Status	Description
`VALIDATION_ERROR`	400	Request body failed schema validation. The message describes which fields are invalid, and `details.issues[]` lists the individual failures.
`NOT_FOUND`	404	The requested resource was not found.
`CONFLICT`	409	A conflicting resource already exists.
`PAYLOAD_TOO_LARGE`	413	The request body exceeds the endpoint’s size limit (e.g. file uploads over 512 MiB, or oversized JSON bodies).
`UNSUPPORTED_MEDIA_TYPE`	415	The request `Content-Type` is not accepted by the endpoint.
`INVALID_JSON`	400	The request body could not be parsed as JSON. Emitted by the global error handler when Express’s body parser throws a `SyntaxError`.

File, Batch, Vector Store, Response & Realtime Errors (8xxx)

These errors are returned by the Phase E gateway surfaces (/v1/files, /v1/batches, /v1/vector_stores, /v1/responses, /v1/realtime).

Code	HTTP Status	Description
`FILE_NOT_FOUND`	404	The referenced gateway file ID does not exist for this tenant.
`FILE_PROVIDER_MISMATCH`	400	The file (or a previous response) belongs to a different provider than the request targets. Also returned when attempting to continue a Responses thread across providers.
`FILE_EXPIRED`	410	The file has expired and is no longer available.
`PROVIDER_NOT_CONFIGURED`	400	The provider required to service the request is not configured for this tenant.
`BATCH_NOT_FOUND`	404	The referenced batch job ID does not exist for this tenant.
`VECTOR_STORE_NOT_FOUND`	404	The referenced vector store ID does not exist for this tenant.
`RESPONSE_NOT_FOUND`	404	The referenced response session ID does not exist for this tenant.
`REALTIME_SESSION_NOT_FOUND`	404	The referenced realtime session ID does not exist for this tenant.
`REALTIME_TICKET_INVALID`	401	The realtime WebSocket ticket is missing, already used, or expired. The WebSocket is closed with code `4401`.

System Errors (9xxx)

Code	HTTP Status	Description
`INTERNAL_ERROR`	500	An unexpected internal error occurred. In production, error details are hidden.
`SERVICE_UNAVAILABLE`	503	The gateway service is temporarily unavailable (e.g. during startup or maintenance).

Handling Errors

Retry Strategy

For transient errors, implement exponential backoff with jitter:

429 (Rate Limit): wait for the period given by retryAfter in the error details, then retry. A 429 with code BUDGET_EXCEEDED is not transient; retrying will not help, so contact your administrator instead.
502 (Bad Gateway): the provider may be temporarily unavailable; retry with backoff.
503 (Service Unavailable): no provider is available; retry after a short delay.
504 (Gateway Timeout): the provider was too slow; retry, possibly with a simpler request.
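
One way to compute the wait between attempts is "full jitter" exponential backoff, sketched below. The function name `backoff_delay` and the `base` and `cap` defaults are illustrative assumptions, not gateway settings. The sketch also assumes `retryAfter` is expressed in seconds, which this page does not state; adjust if your deployment reports a different unit.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to sleep before retry number `attempt` (0-based)."""
    rng = rng or random.Random()
    # Exponential growth (base, 2*base, 4*base, ...) capped at `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] so clients do not retry in lockstep.
    delay = rng.uniform(0.0, ceiling)
    # Never retry sooner than the server asked (assumed to be in seconds).
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

A caller would typically loop over a small, bounded number of attempts, sleeping for `backoff_delay(attempt, retry_after)` between them and giving up once the attempt budget is spent.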

Non-Retryable Errors

Do not retry these errors without changing the request:

400 — fix the request body.
401 — fix or refresh authentication credentials.
403 — check permissions and capabilities.
404 — verify the model name or resource path.
422 — the request was blocked by a guard; modify the content.
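
The two lists above can be encoded as a small predicate. This is a sketch under stated assumptions: `should_retry` is a hypothetical helper, statuses that appear in neither list (such as 500) are treated as non-retryable here, and a 429 carrying `BUDGET_EXCEEDED` is treated as non-retryable because its description says to contact an administrator.

```python
from typing import Optional

# Statuses listed under "Retry Strategy".
RETRYABLE_STATUSES = frozenset({429, 502, 503, 504})


def should_retry(status: int, code: Optional[str] = None) -> bool:
    """Return True when the same request may succeed if sent again later."""
    # Assumption: an exhausted budget will not recover on its own.
    if code == "BUDGET_EXCEEDED":
        return False
    return status in RETRYABLE_STATUSES
```

Passing the `code` is optional because the inline route and capability-guard envelopes may not carry one; in that case the decision falls back to the HTTP status alone.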

Streaming Errors

When an error occurs during a streaming response (stream: true), the error is sent as a final SSE event:

OpenAI format:

data: {"error":{"message":"An internal error occurred","type":"server_error"}}

Anthropic format:

event: error
data: {"type":"error","error":{"type":"server_error","message":"An internal error occurred"}}

In production mode, internal error details are redacted and replaced with a generic message.
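
Both streaming formats carry the error as a JSON object with an `error` member on a `data:` line, so a client can detect either with the same check. The sketch below assumes the stream has already been split into text lines; `find_stream_error` is a hypothetical helper name, and non-JSON payloads are simply skipped.

```python
import json
from typing import Iterable, Optional


def find_stream_error(lines: Iterable[str]) -> Optional[dict]:
    """Return the inner error object of the first SSE error event, else None."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        try:
            obj = json.loads(payload)
        except json.JSONDecodeError:
            continue  # not JSON, so it cannot be one of the error events above
        # OpenAI format: {"error": {...}}
        # Anthropic format: {"type": "error", "error": {...}}
        if isinstance(obj, dict) and isinstance(obj.get("error"), dict):
            return obj["error"]
    return None
```

Because the error arrives after the HTTP status line has already been sent, the status code cannot signal it; checking each `data:` payload as it arrives is the only way for a streaming client to notice the failure.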