Completions (Legacy)

Generate a single text completion and return it in the legacy OpenAI text-completion response shape. Internally the gateway runs the same chat pipeline as /v1/chat/completions, and the request is validated against the OpenAI chat schema (openAIChatRequestSchema). Only the response is reshaped into the older text_completion object, so legacy clients that read choices[].text keep working.

POST /v1/completions

Required capability: completions

Request Body

The request body is the OpenAI chat-completion schema. See Chat Completions — Request Body for the full field reference; everything documented there is accepted here.

In practice you will set messages to a single user message wrapping the text you want completed:

{
  "model": "gpt-3.5-turbo-instruct",
  "messages": [
    { "role": "user", "content": "Once upon a time" }
  ],
  "max_tokens": 64,
  "temperature": 0.7
}
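A client that still thinks in terms of a single prompt string can build this body by wrapping the string in one user message. The following is a minimal sketch, not gateway code: the helper name `prompt_to_request` and its defaults are hypothetical and simply mirror the sample above.

```python
import json


def prompt_to_request(prompt, model, max_tokens=64, temperature=0.7):
    """Wrap a prompt string in the chat-style body this endpoint validates.

    Hypothetical helper for illustration; it is not part of the gateway.
    """
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    return {
        "model": model,
        # The text to complete travels as a single user message.
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


body = prompt_to_request("Once upon a time", "gpt-3.5-turbo-instruct")
print(json.dumps(body, indent=2))
```

Any other field from the chat-completion schema can be added to the returned dictionary before it is serialized.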

Response

The response uses the legacy text_completion object shape — choices[].text instead of choices[].message.content:

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "text": " there was a small village...",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 64,
    "total_tokens": 71
  }
}

Response Fields

| Field | Type | Description |
|---|---|---|
| `id` | `string` | Request identifier, prefixed with `cmpl-`. |
| `object` | `string` | Always `"text_completion"`. |
| `created` | `integer` | Unix timestamp of response generation. |
| `model` | `string` | The model that produced the response (after routing). |
| `choices` | `array` | Generated completions. The current implementation returns a single choice. |
| `choices[].text` | `string` | The generated text. |
| `choices[].index` | `integer` | Position of this choice (currently always `0`). |
| `choices[].finish_reason` | `string` | Why generation stopped (e.g. `stop`, `length`). |
| `usage` | `object` | Token accounting in the legacy field names (`prompt_tokens` / `completion_tokens` / `total_tokens`). |
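To show how a client might read these fields, here is a small sketch that parses the sample response from this section. The function name `extract_completion` is hypothetical. Treating `length` as a sign of truncation follows the OpenAI convention and is assumed, not confirmed, to carry over to the gateway.

```python
import json

# Sample payload copied from the response example in this section.
raw = """
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"text": " there was a small village...", "index": 0, "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 7, "completion_tokens": 64, "total_tokens": 71}
}
"""


def extract_completion(payload):
    """Return (text, finish_reason, usage) from a text_completion payload."""
    if payload.get("object") != "text_completion":
        raise ValueError("unexpected object type: %r" % payload.get("object"))
    # The current implementation returns a single choice, so index 0 is enough.
    choice = payload["choices"][0]
    return choice["text"], choice["finish_reason"], payload["usage"]


text, finish_reason, usage = extract_completion(json.loads(raw))
# Assumption: "length" means generation stopped at the token limit.
truncated = finish_reason == "length"
print(repr(text), finish_reason, usage["total_tokens"], truncated)
```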

Headers

| Header | Description |
|---|---|
| `X-Request-ID` | Unique identifier for the request. |
| `X-Provider` | The provider slug that handled the request after routing. |
| `X-Model` | The actual model name used (may differ from the requested model if a routing rule remapped it). |
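These headers make it possible to detect on the client side whether a routing rule changed the model. The sketch below is illustrative only: `routing_info` is a hypothetical helper, and the header values passed to it are made-up samples, not real gateway output.

```python
def routing_info(headers, requested_model):
    """Summarize routing from the gateway response headers.

    `headers` is any mapping of header name to value.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    served_model = lowered.get("x-model")
    return {
        "request_id": lowered.get("x-request-id"),
        "provider": lowered.get("x-provider"),
        "model": served_model,
        # True only when the gateway reported a model other than the one requested.
        "remapped": served_model is not None and served_model != requested_model,
    }


# Sample values invented for illustration.
info = routing_info(
    {"X-Request-ID": "req-123", "X-Provider": "openai", "X-Model": "gpt-4o-mini"},
    requested_model="gpt-3.5-turbo-instruct",
)
print(info)
```

Logging `request_id` alongside application errors gives a single value to quote when tracing a request.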

Example

curl https://your-gateway.example.com/v1/completions \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "messages": [{ "role": "user", "content": "Translate to French: Hello, world." }],
    "max_tokens": 32,
    "temperature": 0
  }'
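The same call can be made from Python with only the standard library. This is a sketch under the same placeholders as the curl command (host and API key). The function names `build_request` and `complete` are hypothetical, and `complete` needs a reachable gateway, so it is defined here but not called.

```python
import json
import urllib.request

GATEWAY_URL = "https://your-gateway.example.com/v1/completions"  # placeholder host
API_KEY = "aigw_sk_your_api_key"  # placeholder key


def build_request(prompt, max_tokens=32, temperature=0):
    """Build the POST request equivalent to the curl example."""
    body = {
        "model": "gpt-3.5-turbo-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def complete(prompt):
    """Send the request and return the generated text from choices[0].text."""
    with urllib.request.urlopen(build_request(prompt), timeout=30) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["text"]


req = build_request("Translate to French: Hello, world.")
print(req.get_method(), req.full_url)
```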

Gateway Features

The legacy completions endpoint passes through the same gateway middleware pipeline as /v1/chat/completions: authentication, capability scoping, rate limiting, request normalization, RAG injection, prompt guards, budget checks, semantic cache, usage tracking, and audit logging.

Error Format

Runtime errors from this endpoint use a minimal OpenAI-style envelope without a code field:

{
  "error": {
    "message": "Detailed error message",
    "type": "server_error"
  }
}

Validation failures (HTTP 400) are emitted by the global error handler and use the canonical envelope {error: {code, message, details}}. See Errors for the full envelope reference.
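Because two envelopes can come back from this endpoint, a client may want to normalize them before handling errors. The sketch below assumes only the fields named above. `parse_gateway_error` is a hypothetical helper, and the HTTP status 500, the code value, and the validation message in the samples are invented for illustration.

```python
def parse_gateway_error(status, payload):
    """Flatten either error envelope into one dictionary with stable keys."""
    err = payload.get("error") or {}
    return {
        "status": status,
        "message": err.get("message", ""),
        # Documented for the canonical envelope; None for runtime errors.
        "code": err.get("code"),
        "details": err.get("details"),
        # Documented for the minimal runtime envelope.
        "type": err.get("type"),
    }


# Runtime error in the minimal envelope (status chosen for illustration).
runtime = parse_gateway_error(
    500, {"error": {"message": "Detailed error message", "type": "server_error"}}
)

# Validation failure in the canonical envelope (code and message are made up).
validation = parse_gateway_error(
    400,
    {"error": {"code": "validation_error", "message": "messages is required", "details": []}},
)

print(runtime)
print(validation)
```

Using `.get` for every optional key means the same handler works whichever envelope arrives.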