Completions (Legacy)

Generates a single text completion in the legacy OpenAI text-completion response shape. Internally, the gateway runs the same chat pipeline as /v1/chat/completions and validates the request against the OpenAI chat schema (openAIChatRequestSchema). It then reshapes the response into the older text_completion object so that legacy clients keep working.

POST /v1/completions

Required capability: completions

Request Body

The request body is the OpenAI chat-completion schema. See Chat Completions — Request Body for the full field reference; everything documented there is accepted here.

In practice you will set messages to a single user message wrapping the text you want completed:

{
  "model": "gpt-3.5-turbo-instruct",
  "messages": [
    { "role": "user", "content": "Once upon a time" }
  ],
  "max_tokens": 64,
  "temperature": 0.7
}
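
Legacy clients usually hold a bare prompt string rather than a messages array. The wrapping step described above might look like the following minimal Python sketch. The helper name `build_completion_body` and its default values are hypothetical and only mirror the sample body; they are not part of the gateway.

```python
import json


def build_completion_body(prompt, model="gpt-3.5-turbo-instruct",
                          max_tokens=64, temperature=0.7):
    """Wrap a bare prompt in the chat-style body this endpoint validates.

    Hypothetical helper for illustration; not part of the gateway.
    """
    return {
        "model": model,
        # The text to complete travels as a single user message.
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


body = build_completion_body("Once upon a time")
print(json.dumps(body, indent=2))
```

Keeping the wrapping in one place means a legacy call site only has to pass its prompt string through this function before posting.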

Response

The response uses the legacy text_completion object shape — choices[].text instead of choices[].message.content:

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "text": " there was a small village...",
      "index": 0,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 64,
    "total_tokens": 71
  }
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Request identifier, prefixed with cmpl-. |
| object | string | Always "text_completion". |
| created | integer | Unix timestamp of response generation. |
| model | string | The model that produced the response (after routing). |
| choices | array | Generated completions. The current implementation returns a single choice. |
| choices[].text | string | The generated text. |
| choices[].index | integer | Position of this choice (currently always 0). |
| choices[].finish_reason | string | Why generation stopped (e.g. stop, length). |
| usage | object | Token accounting in the legacy field names (prompt_tokens / completion_tokens / total_tokens). |
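
To show how a client might read these fields, here is a small Python sketch that pulls the generated text out of a text_completion object. The function name `completion_text` is hypothetical, and the sample payload is copied from the response example above.

```python
import json

# Sample payload copied from the response example in this section.
SAMPLE = json.loads("""
{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1709000000,
  "model": "gpt-4o-mini",
  "choices": [
    {"text": " there was a small village...", "index": 0, "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 7, "completion_tokens": 64, "total_tokens": 71}
}
""")


def completion_text(response):
    """Return (text, finish_reason) from a legacy text_completion object.

    Hypothetical helper; assumes the documented single-choice shape.
    """
    if response.get("object") != "text_completion":
        raise ValueError("not a text_completion object")
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    first = choices[0]
    # Legacy shape: the text lives in choices[].text,
    # not in choices[].message.content.
    return first["text"], first.get("finish_reason")


text, finish_reason = completion_text(SAMPLE)
usage = SAMPLE["usage"]
print(repr(text), finish_reason, usage["total_tokens"])
```

A caller that needs to detect truncated output could compare the returned finish reason against "length".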

Headers

| Header | Description |
| --- | --- |
| X-Request-ID | Unique identifier for the request. |
| X-Provider | The provider slug that handled the request after routing. |
| X-Model | The actual model name used (may differ from the requested model if a routing rule remapped it). |
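
Because X-Model can differ from the requested model, a client may want to log when a routing rule remapped it. The following Python sketch is one way to do that. The function name `routing_info` and the header values in the usage lines are made up for illustration.

```python
def routing_info(headers, requested_model):
    """Summarize the gateway routing headers from a response.

    Hypothetical helper; `headers` is any mapping of header name to value.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    served_model = lowered.get("x-model")
    return {
        "request_id": lowered.get("x-request-id"),
        "provider": lowered.get("x-provider"),
        "model": served_model,
        # True only when the gateway reports a model other than the one requested.
        "remapped": served_model is not None and served_model != requested_model,
    }


# Illustrative header values, not real gateway output.
info = routing_info(
    {
        "X-Request-ID": "req-example",
        "X-Provider": "example-provider",
        "X-Model": "gpt-4o-mini",
    },
    requested_model="gpt-3.5-turbo-instruct",
)
print(info)
```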

Example

curl https://your-gateway.example.com/v1/completions \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "messages": [{ "role": "user", "content": "Translate to French: Hello, world." }],
    "max_tokens": 32,
    "temperature": 0
  }'
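
The same request can be built with Python's standard library. This is a sketch rather than an official client: the URL and API key are the placeholders from the curl example, and `build_request` is a hypothetical helper name. The network call is left commented out so the snippet runs offline.

```python
import json
import urllib.request

# Placeholders taken from the curl example; replace with real values.
GATEWAY_URL = "https://your-gateway.example.com/v1/completions"
API_KEY = "aigw_sk_your_api_key"


def build_request(prompt, model="gpt-3.5-turbo-instruct",
                  max_tokens=32, temperature=0):
    """Build (but do not send) a POST request for /v1/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request("Translate to French: Hello, world.")

# Against a real gateway, the request could be sent like this:
# with urllib.request.urlopen(request) as resp:
#     payload = json.load(resp)
#     print(payload["choices"][0]["text"])
```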

Gateway Features

The legacy completions endpoint passes through the same gateway middleware pipeline as /v1/chat/completions: authentication, capability scoping, rate limiting, request normalization, RAG injection, prompt guards, budget checks, semantic cache, usage tracking, and audit logging.

Error Format

Runtime errors from this endpoint use a minimal OpenAI-style envelope without a code field:

{
  "error": {
    "message": "Detailed error message",
    "type": "server_error"
  }
}

Validation failures (HTTP 400) are emitted by the global error handler and use the canonical envelope {error: {code, message, details}}. See Errors for the full envelope reference.
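
Since the two envelopes carry different fields, client code may want to normalize them before handling. A minimal Python sketch, assuming only the two shapes described above; the function name `normalize_error` and the sample validation payload (including its code value) are hypothetical.

```python
def normalize_error(body):
    """Flatten either error envelope into one dict with stable keys.

    Hypothetical helper; fields missing from an envelope come back as None.
    """
    err = body.get("error") or {}
    return {
        "message": err.get("message", ""),
        "code": err.get("code"),        # absent from the minimal runtime envelope
        "type": err.get("type"),        # shown in the minimal runtime envelope
        "details": err.get("details"),  # part of the canonical envelope
    }


# Runtime error, copied from the example above.
runtime = normalize_error(
    {"error": {"message": "Detailed error message", "type": "server_error"}}
)

# Validation failure; the values here are illustrative only.
validation = normalize_error(
    {"error": {"code": "validation_error", "message": "Invalid request", "details": []}}
)

print(runtime)
print(validation)
```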