Messages (Anthropic Format)

Create a message using the Anthropic Messages API format. The gateway normalizes and routes the request to any configured provider.

POST /v1/messages

Required capability: chat

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | `string` | Yes | Model identifier (e.g. `claude-3-opus-20240229`, `gpt-4`). |
| `messages` | `array` | Yes | Array of message objects (minimum 1). See Message Format. |
| `max_tokens` | `integer` | Yes | Maximum number of tokens to generate. Required in the Anthropic format. |
| `system` | `string \| array` | No | System prompt. Can be a string or an array of `{"type": "text", "text": "..."}` objects. |
| `temperature` | `number` | No | Sampling temperature between 0 and 1. |
| `top_p` | `number` | No | Nucleus sampling between 0 and 1. |
| `stop_sequences` | `string[]` | No | Custom stop sequences. |
| `stream` | `boolean` | No | If `true`, responses are streamed as server-sent events. Defaults to `false`. |
| `tools` | `array` | No | Tool definitions. Each has `name`, optional `description`, and `input_schema`. |
| `tool_choice` | `object` | No | Tool selection: `{"type": "auto"}`, `{"type": "any"}`, or `{"type": "tool", "name": "..."}`. |
| `metadata` | `object` | No | Optional metadata. Supports `user_id` for end-user tracking. |
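
The parameters above can be assembled and sanity-checked on the client before a request is sent. The following is a minimal sketch, not part of the gateway: `build_messages_request` is a hypothetical helper, the `get_weather` tool is invented for illustration, and the check that `max_tokens` is positive is an assumption (the table only says the field is required).

```python
from typing import Any


def build_messages_request(
    model: str,
    messages: list[dict[str, Any]],
    max_tokens: int,
    **optional: Any,
) -> dict[str, Any]:
    """Assemble a /v1/messages body and check the documented constraints."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    if max_tokens <= 0:  # assumption: the gateway expects a positive value
        raise ValueError("max_tokens must be a positive integer")
    for name in ("temperature", "top_p"):
        value = optional.get(name)
        if value is not None and not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    body: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }
    # Leave unset optional fields out so the gateway applies its defaults.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


request_body = build_messages_request(
    "claude-3-opus-20240229",
    [{"role": "user", "content": "What is the weather in Paris?"}],
    max_tokens=1024,
    temperature=0.2,
    top_p=None,
    tools=[{
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Look up current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    tool_choice={"type": "auto"},
)
```

Client-side checks like these only mirror the table; the gateway remains the authority on what it accepts.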

Message Format

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `role` | `string` | Yes | Either `user` or `assistant`. |
| `content` | `string \| array` | Yes | A string or an array of content blocks. |

Content blocks use a discriminated union on the type field:

| Type | Fields | Description |
| --- | --- | --- |
| `text` | `text` | Plain text content. |
| `image` | `source: {type: "base64", media_type, data}` | Base64-encoded image. |
| `tool_use` | `id`, `name`, `input` | A tool call from the assistant. |
| `tool_result` | `tool_use_id`, `content` | The result of a tool call. |
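
As an illustration of the block types above, the sketch below builds a three-turn conversation: a user message with text and an image, an assistant `tool_use` call, and the matching `tool_result`. The helper names, the tool name `get_figure`, the ID `toolu_01`, and the image bytes are all hypothetical. Placing the `tool_result` in a `user` message follows the upstream Anthropic convention, which the gateway is assumed to mirror.

```python
import base64
from typing import Any


def text_block(text: str) -> dict[str, Any]:
    return {"type": "text", "text": text}


def image_block(raw: bytes, media_type: str) -> dict[str, Any]:
    # The image payload is sent as base64 text, not raw bytes.
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(raw).decode("ascii"),
        },
    }


def tool_use_block(block_id: str, name: str, tool_input: dict[str, Any]) -> dict[str, Any]:
    return {"type": "tool_use", "id": block_id, "name": name, "input": tool_input}


def tool_result_block(tool_use_id: str, content: str) -> dict[str, Any]:
    # tool_use_id must match the id of the tool_use block being answered.
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}


FAKE_IMAGE = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real image

messages = [
    {"role": "user", "content": [
        text_block("Describe this chart and fetch today's figure."),
        image_block(FAKE_IMAGE, "image/png"),
    ]},
    {"role": "assistant", "content": [
        tool_use_block("toolu_01", "get_figure", {"date": "today"}),
    ]},
    {"role": "user", "content": [
        tool_result_block("toolu_01", "42"),
    ]},
]
```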

Response

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "model": "claude-3-opus-20240229",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 9
  }
}
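
Because `content` is an array of blocks, clients usually need to join the `text` blocks to get the reply. A minimal sketch, assuming the response shape shown above; `response_text` and `total_tokens` are hypothetical helper names.

```python
from typing import Any


def response_text(message: dict[str, Any]) -> str:
    """Concatenate every text block in a Messages response."""
    return "".join(
        block.get("text", "")
        for block in message.get("content", [])
        if block.get("type") == "text"
    )


def total_tokens(message: dict[str, Any]) -> int:
    """Sum input and output tokens from the usage object."""
    usage = message.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


sample_response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "model": "claude-3-opus-20240229",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 9},
}

reply = response_text(sample_response)
tokens = total_tokens(sample_response)
```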

Streaming

Set "stream": true to receive Anthropic-format server-sent events. The response uses Content-Type: text/event-stream.

Events are emitted in this order:

message_start — contains the message metadata.
content_block_start — signals the beginning of a content block.
content_block_delta — incremental text deltas.
content_block_stop — signals the end of the content block.
message_delta — contains stop_reason and output usage.
message_stop — signals the end of the message.

event: message_start
data: {"type":"message_start","message":{"id":"msg_abc123","type":"message","role":"assistant","content":[],"model":"claude-3-opus-20240229","usage":{"input_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":9}}

event: message_stop
data: {"type":"message_stop"}
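
The event sequence above can be consumed with a small server-sent events parser. This is a sketch that works on an iterable of already-decoded lines (for example, the lines of an HTTP response body); the function names are hypothetical, and it handles only text deltas, which is all the streaming path is documented to emit.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (event_name, payload) pairs from server-sent event lines."""
    event_name = None
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_name or "message", json.loads("\n".join(data_lines))
            event_name, data_lines = None, []
    if data_lines:  # stream ended without a trailing blank line
        yield event_name or "message", json.loads("\n".join(data_lines))


def collect_stream(lines: Iterable[str]) -> dict[str, Any]:
    """Accumulate text deltas and the final message_delta fields."""
    parts: list[str] = []
    result: dict[str, Any] = {"text": "", "stop_reason": None, "output_tokens": None}
    for name, payload in iter_sse_events(lines):
        if name == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif name == "message_delta":
            result["stop_reason"] = payload.get("delta", {}).get("stop_reason")
            result["output_tokens"] = payload.get("usage", {}).get("output_tokens")
        elif name == "message_stop":
            break
    result["text"] = "".join(parts)
    return result


sample_stream = [
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    '',
    'event: content_block_delta',
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" world"}}',
    '',
    'event: message_delta',
    'data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":9}}',
    '',
    'event: message_stop',
    'data: {"type":"message_stop"}',
    '',
]

streamed = collect_stream(sample_stream)
```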

Streaming Limitations (Known)

The current streaming implementation has several known limitations. Clients should not rely on the affected values being meaningful:

usage.input_tokens on message_start is always 0. The value is a hardcoded placeholder, because the real input token count is not known until the upstream provider returns final usage data near the end of the stream. The message_delta event reports the final usage.output_tokens, and the accurate usage is recorded server-side for billing.
stop_reason is always end_turn in message_delta. The route emits stop_reason: 'end_turn' regardless of the actual finish reason reported by the upstream provider. If you need the real finish reason, use the non-streaming endpoint.
Tool-call streaming is not supported. The streaming path does not emit content_block_start of type tool_use and does not emit input_json_delta events. Tool calls are silently dropped from the streamed output. If your use case depends on streamed tool calls, use the non-streaming endpoint or the OpenAI-format /v1/chat/completions endpoint.
Validation failures use the OpenAI error envelope, not the Anthropic envelope. HTTP 400 responses for malformed /v1/messages requests are emitted by the global error handler as {error: {code, message, details}} (OpenAI shape), not as {type: "error", error: {type, message}} (Anthropic shape). Runtime errors (timeouts, upstream failures, internal errors) do use the Anthropic shape — the inconsistency is between validation and runtime error paths. This is a known issue.
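
Given these limitations, a client can decide per request whether streaming is safe. The sketch below is one possible policy, not gateway behavior: `can_stream` and `with_stream_flag` are hypothetical helpers, and the `need_stop_reason` flag is an assumption about what the caller cares about.

```python
from typing import Any


def can_stream(body: dict[str, Any], need_stop_reason: bool = False) -> bool:
    """Return False when the streaming path would lose information."""
    if body.get("tools"):
        # Tool calls are silently dropped from streamed output.
        return False
    if need_stop_reason:
        # Streamed stop_reason is always "end_turn".
        return False
    return True


def with_stream_flag(
    body: dict[str, Any],
    prefer_stream: bool = True,
    need_stop_reason: bool = False,
) -> dict[str, Any]:
    """Copy the request body and set "stream" according to the policy above."""
    out = dict(body)
    out["stream"] = prefer_stream and can_stream(body, need_stop_reason)
    return out


plain = {"model": "claude-3-opus-20240229", "max_tokens": 256,
         "messages": [{"role": "user", "content": "Tell me a joke."}]}
with_tools = dict(plain, tools=[{"name": "lookup", "input_schema": {"type": "object"}}])

plain_request = with_stream_flag(plain)
tools_request = with_stream_flag(with_tools)
```

Here `plain_request` keeps streaming enabled, while `tools_request` falls back to the non-streaming endpoint behavior.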

Headers

The response includes:

| Header | Description |
| --- | --- |
| `X-Request-ID` | Unique identifier for the request, useful for debugging and audit trails. |
| `X-Provider` | The provider the request was routed to. |
| `X-Model` | The resolved model that served the request. |
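
These headers are convenient to log alongside each call. A small sketch, assuming the response headers are available as a plain mapping; `routing_info` is a hypothetical helper, and the lookup is case-insensitive because HTTP header names are.

```python
from typing import Mapping, Optional


def routing_info(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Extract the gateway routing headers, ignoring header-name case."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request_id": lowered.get("x-request-id"),
        "provider": lowered.get("x-provider"),
        "model": lowered.get("x-model"),
    }


# Example values are made up for illustration.
info = routing_info({
    "Content-Type": "application/json",
    "x-request-id": "req_123",
    "X-Provider": "anthropic",
    "X-Model": "claude-3-opus-20240229",
})
```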

Example

curl https://your-gateway.example.com/v1/messages \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
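
The same request can be built with Python's standard library. This sketch only constructs the request object; the URL and API key are the placeholders from the curl example, and `make_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any

GATEWAY_URL = "https://your-gateway.example.com/v1/messages"  # placeholder host


def make_request(api_key: str, body: dict[str, Any]) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request("aigw_sk_your_api_key", {
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
})

# To send it against a real gateway:
# with urllib.request.urlopen(req) as resp:
#     message = json.load(resp)
```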

Streaming Example

curl https://your-gateway.example.com/v1/messages \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "stream": true
  }'

Error Format

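Runtime errors from the Messages endpoint (timeouts, upstream failures, internal errors) use the Anthropic error format. Validation failures (HTTP 400) are a known exception and currently use the OpenAI envelope {error: {code, message, details}}. The Anthropic shape is: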

{
  "type": "error",
  "error": {
    "type": "server_error",
    "message": "Detailed error message"
  }
}
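
Since validation failures currently arrive in the OpenAI envelope while runtime errors use the Anthropic one, a client may want to normalize both. A minimal sketch based on the two shapes described in this document; `normalize_error` and its output keys are hypothetical.

```python
from typing import Any


def normalize_error(body: dict[str, Any]) -> dict[str, Any]:
    """Map either error envelope onto one flat structure."""
    err = body.get("error") or {}
    if body.get("type") == "error":
        # Anthropic shape: {"type": "error", "error": {"type", "message"}}
        return {
            "envelope": "anthropic",
            "kind": err.get("type"),
            "message": err.get("message"),
            "details": None,
        }
    # OpenAI shape: {"error": {"code", "message", "details"}}
    return {
        "envelope": "openai",
        "kind": err.get("code"),
        "message": err.get("message"),
        "details": err.get("details"),
    }


runtime_error = normalize_error({
    "type": "error",
    "error": {"type": "server_error", "message": "Detailed error message"},
})
validation_error = normalize_error({
    # Example values are made up; only the envelope shape is documented.
    "error": {"code": "invalid_request", "message": "max_tokens is required", "details": []},
})
```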

Gateway Features

This endpoint passes through the same gateway middleware pipeline as /v1/chat/completions, including request normalization, RAG injection, prompt guards, budget checks, semantic cache, usage tracking, and audit logging. The gateway automatically translates between the Anthropic message format and its internal unified format, enabling routing to any provider regardless of the request format used.