Completions (Legacy)
Generate a single text completion using the legacy OpenAI text-completion response shape. Internally the gateway runs the same chat pipeline as /v1/chat/completions — the request validator is the OpenAI chat schema (openAIChatRequestSchema) — but the response is reshaped into the older text_completion object so legacy clients keep working.
POST /v1/completions
Required capability: completions
Request Body
The request body is the OpenAI chat-completion schema. See Chat Completions — Request Body for the full field reference; everything documented there is accepted here.
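Because the body follows the chat schema, a client that still holds a legacy prompt string needs to wrap it before sending. The following is a minimal sketch of such a wrapper; the function name `legacy_prompt_to_chat_body` and its default values are illustrative assumptions, not part of the gateway.

```python
import json


def legacy_prompt_to_chat_body(prompt, model, max_tokens=64, temperature=0.7):
    """Wrap a legacy prompt string in a chat-schema request body.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    return {
        "model": model,
        # The chat schema carries text in "messages", so the prompt
        # becomes the content of a single user message.
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


body = legacy_prompt_to_chat_body("Once upon a time", "gpt-3.5-turbo-instruct")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and posted to /v1/completions with any HTTP client.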
In practice you will set messages to a single user message wrapping the text you want completed:
{
  "model": "gpt-3.5-turbo-instruct",
  "messages": [
    { "role": "user", "content": "Once upon a time" }
  ],
  "max_tokens": 64,
  "temperature": 0.7
}
Response
The response uses the legacy text_completion object shape — choices[].text instead of choices[].message.content:
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "text": " there was a small village...",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 64,
    "total_tokens": 71
  }
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Request identifier, prefixed with cmpl-. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp of response generation. |
| model | string | The model that produced the response (after routing). |
| choices | array | Generated completions. The current implementation returns a single choice. |
| choices[].text | string | The generated text. |
| choices[].index | integer | Position of this choice (currently always 0). |
| choices[].finish_reason | string | Why generation stopped (e.g. stop, length). |
| usage | object | Token accounting in the legacy field names (prompt_tokens / completion_tokens / total_tokens). |
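As a rough illustration of how a client might consume these fields, the sketch below pulls the generated text and token count out of a parsed response. The helper name `extract_completion` is hypothetical, and treating a finish_reason of "length" as truncation follows the usual OpenAI convention rather than anything this page states explicitly.

```python
def extract_completion(response):
    """Return the text, finish reason, and token total from a text_completion body.

    Hypothetical helper; field names are taken from the table above.
    """
    if response.get("object") != "text_completion":
        raise ValueError("not a text_completion response")
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    first = choices[0]  # the endpoint currently returns a single choice
    usage = response.get("usage") or {}
    return {
        "text": first["text"],
        "finish_reason": first.get("finish_reason"),
        # Assumption: "length" means the output hit the token limit.
        "truncated": first.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }


sample = {
    "id": "cmpl-abc123",
    "object": "text_completion",
    "created": 1709000000,
    "model": "gpt-4o-mini",
    "choices": [
        {"text": " there was a small village...", "index": 0, "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 7, "completion_tokens": 64, "total_tokens": 71},
}

result = extract_completion(sample)
print(result["text"])
```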
Headers
| Header | Description |
|---|---|
| X-Request-ID | Unique identifier for the request. |
| X-Provider | The provider slug that handled the request after routing. |
| X-Model | The actual model name used (may differ from the requested model if a routing rule remapped it). |
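A client that wants to log routing decisions could read these headers roughly as follows. This is a sketch only: `routing_info` is a hypothetical helper, and the header values in the usage lines are made-up placeholders rather than real gateway output.

```python
def routing_info(headers, requested_model):
    """Summarize the routing headers from a response.

    Hypothetical helper. HTTP header names are case-insensitive, so keys
    are lowercased before lookup.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    served_model = lowered.get("x-model")
    return {
        "request_id": lowered.get("x-request-id"),
        "provider": lowered.get("x-provider"),
        "model": served_model,
        # True when a routing rule served a different model than requested.
        "remapped": served_model is not None and served_model != requested_model,
    }


# Placeholder header values, for illustration only.
example_headers = {
    "X-Request-ID": "req-example-1",
    "X-Provider": "example-provider",
    "X-Model": "gpt-4o-mini",
}
info = routing_info(example_headers, "gpt-3.5-turbo-instruct")
print(info)
```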
Example
curl https://your-gateway.example.com/v1/completions \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "messages": [{ "role": "user", "content": "Translate to French: Hello, world." }],
    "max_tokens": 32,
    "temperature": 0
  }'
Gateway Features
The legacy completions endpoint passes through the same gateway middleware pipeline as /v1/chat/completions: authentication, capability scoping, rate limiting, request normalization, RAG injection, prompt guards, budget checks, semantic cache, usage tracking, and audit logging.
Error Format
Runtime errors from this endpoint use a minimal OpenAI-style envelope without a code field:
{
  "error": {
    "message": "Detailed error message",
    "type": "server_error"
  }
}
Validation failures (HTTP 400) are emitted by the global error handler and use the canonical envelope {error: {code, message, details}}. See Errors for the full envelope reference.
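Since two envelope shapes can come back from this endpoint, client code may want to fold them into one structure. The sketch below is one possible approach, assuming the presence of a code field is enough to tell the canonical envelope from the minimal one; `normalize_error` and the sample validation payload are illustrative, not part of the gateway.

```python
def normalize_error(status, body):
    """Fold both error envelopes into one dictionary.

    Hypothetical helper. Assumption: the canonical (validation) envelope
    always carries "code", and the minimal runtime envelope never does.
    """
    err = body.get("error") if isinstance(body, dict) else None
    if not isinstance(err, dict):
        # Unrecognized shape: keep the raw body for debugging.
        return {"status": status, "kind": "unknown", "code": None,
                "type": None, "message": str(body), "details": None}
    kind = "canonical" if "code" in err else "minimal"
    return {
        "status": status,
        "kind": kind,
        "code": err.get("code"),
        "type": err.get("type"),
        "message": err.get("message"),
        "details": err.get("details"),
    }


runtime = normalize_error(
    500, {"error": {"message": "Detailed error message", "type": "server_error"}}
)
# The code and details values below are invented placeholders.
validation = normalize_error(
    400, {"error": {"code": "example_code", "message": "Invalid body", "details": []}}
)
print(runtime["kind"], validation["kind"])
```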