Responses

The Responses API mirrors OpenAI’s /v1/responses surface and threads continuations through gateway-issued response IDs. Before dispatching a request, the gateway translates previous_response_id from its own ID space to the underlying provider’s ID. It then persists the full response output blocks, so later continuations work even after the provider’s own retention window has expired.
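The ID translation described above can be pictured as a lookup from gateway ID to provider ID. The following is a minimal sketch, not the gateway's actual implementation: `StoredResponse`, `translate_previous_id`, and the in-memory `store` dictionary are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoredResponse:
    """Hypothetical record kept by the gateway for each stored response."""
    gateway_id: str             # ID handed to the client
    provider: str               # provider that produced the response
    provider_response_id: str   # the provider's own ID for the same response
    output: List[dict] = field(default_factory=list)  # persisted output blocks


def translate_previous_id(
    store: Dict[str, StoredResponse],
    previous_response_id: Optional[str],
) -> Optional[str]:
    """Map a gateway response ID to the provider's ID before dispatching."""
    if previous_response_id is None:
        return None  # not a continuation, nothing to translate
    record = store.get(previous_response_id)
    if record is None:
        raise KeyError(f"unknown gateway response ID: {previous_response_id}")
    return record.provider_response_id


# Illustrative data only; a real gateway would use a database, not a dict.
store = {"resp-demo-1": StoredResponse("resp-demo-1", "openai", "resp_OAI_xyz")}
print(translate_previous_id(store, "resp-demo-1"))  # resp_OAI_xyz
```

Because the record also holds the output blocks, a store like this could in principle serve a continuation without relying on the provider still having the earlier response.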

Required capability: responses

Endpoints

Method	Path	Description
`POST`	`/v1/responses`	Create a new response or continuation.
`GET`	`/v1/responses`	List responses.
`GET`	`/v1/responses/:responseId`	Fetch a stored response.
`POST`	`/v1/responses/:responseId/cancel`	Cancel an in-progress response.
`DELETE`	`/v1/responses/:responseId`	Remove a stored response.
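As a rough client-side illustration of the endpoint table, the sketch below builds (but does not send) one request per route using Python's standard library. It assumes every endpoint accepts the same Bearer header shown in the Create example; the host, the key, and the helper names (`build_request`, `response_path`, and so on) are placeholders rather than part of the API.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://your-gateway.example.com"  # placeholder host
API_KEY = "aigw_sk_your_api_key"               # placeholder key


def build_request(method: str, path: str,
                  body: Optional[bytes] = None) -> urllib.request.Request:
    """Build an authenticated request object without sending it."""
    headers = {"Authorization": f"Bearer {API_KEY}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=body, headers=headers, method=method
    )


def response_path(response_id: str, suffix: str = "") -> str:
    """Path for a single response; the ID is percent-encoded defensively."""
    return "/v1/responses/" + urllib.parse.quote(response_id, safe="") + suffix


def list_request() -> urllib.request.Request:
    return build_request("GET", "/v1/responses")


def fetch_request(response_id: str) -> urllib.request.Request:
    return build_request("GET", response_path(response_id))


def cancel_request(response_id: str) -> urllib.request.Request:
    return build_request("POST", response_path(response_id, "/cancel"))


def delete_request(response_id: str) -> urllib.request.Request:
    return build_request("DELETE", response_path(response_id))
```

Passing any of these objects to `urllib.request.urlopen` would perform the call.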

Create

curl https://your-gateway.example.com/v1/responses \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "What did you tell me yesterday about quantum tunneling?",
    "previous_response_id": "resp-abc-..."
  }'

Request fields

Field	Type	Required	Description
`model`	`string`	Yes	Model id to call.
`input`	`string | array`	Yes	Single string or an array of input items.
`provider`	`string`	No	Override the routing chain. Default: chat routing config’s first provider for the model, falling back to `openai`.
`previous_response_id`	`string`	No	Gateway response ID to continue from.
`instructions`	`string`	No	System-style instructions.
`tools`	`array`	No	Function tools or built-in tools (`file_search`, `web_search`, `code_execution`, `computer_use`, `search_grounding`).
`tool_choice`	`string | object`	No	`auto` / `none` / `required` / `{type, function}`.
`temperature`	`number`	No	0–2.
`top_p`	`number`	No	0–1.
`max_output_tokens`	`integer`	No	Maximum number of tokens to generate in the output.
`store`	`boolean`	No	Persist for later retrieval. Default: `true`.
`stream`	`boolean`	No	SSE streaming. Default: `false`.
`reasoning`	`object`	No	`{effort: "minimal"|"low"|"medium"|"high"}` or `{budget_tokens}` for thinking models.
`metadata`	`object`	No	Free-form tenant metadata.
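The field table can be turned into a small client-side payload builder. This is a sketch under stated assumptions: `build_create_payload` is a hypothetical helper, and rejecting unknown fields or out-of-range values locally is a design choice of this example, not documented gateway behaviour.

```python
import json
from typing import Any

# Optional fields taken from the table above.
OPTIONAL_FIELDS = {
    "provider", "previous_response_id", "instructions", "tools",
    "tool_choice", "temperature", "top_p", "max_output_tokens",
    "store", "stream", "reasoning", "metadata",
}

# Documented numeric ranges: temperature 0-2, top_p 0-1.
RANGES = {"temperature": (0.0, 2.0), "top_p": (0.0, 1.0)}


def build_create_payload(model: str, input_: Any, **optional: Any) -> dict:
    """Assemble a POST /v1/responses body, omitting unset optional fields."""
    if not isinstance(model, str) or not model:
        raise ValueError("model must be a non-empty string")
    if not isinstance(input_, (str, list)):
        raise TypeError("input must be a string or a list of input items")

    payload = {"model": model, "input": input_}
    for name, value in optional.items():
        if name not in OPTIONAL_FIELDS:
            raise ValueError(f"unknown field: {name}")
        if value is None:
            continue  # leave the field out so the documented default applies
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload


payload = build_create_payload(
    "gpt-4o",
    "Summarize our last exchange.",
    temperature=0.2,
    reasoning={"effort": "low"},
)
print(json.dumps(payload, indent=2))
```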

Response

{
  "id": "resp-7c8b9d3e-...",
  "object": "response",
  "provider": "openai",
  "provider_response_id": "resp_OAI_xyz",
  "model": "gpt-4o",
  "previous_response_id": "resp-abc-...",
  "status": "completed",
  "output": [ /* provider-native output blocks */ ],
  "usage": { "input_tokens": 124, "output_tokens": 312, "total_tokens": 436 },
  "created_at": 1748284800,
  "completed_at": 1748284805,
  "metadata": null
}
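A client will typically read a few of these fields after each call. The sketch below parses a copy of the example body (with `output` left empty) and derives a short summary. It assumes `created_at` and `completed_at` are Unix timestamps in seconds, which the example values suggest but the text does not state; `summarize` is a hypothetical helper.

```python
import json
from typing import Optional

# Copy of the example response above; "output" is left empty for brevity.
SAMPLE = """
{
  "id": "resp-7c8b9d3e-...",
  "object": "response",
  "provider": "openai",
  "provider_response_id": "resp_OAI_xyz",
  "model": "gpt-4o",
  "previous_response_id": "resp-abc-...",
  "status": "completed",
  "output": [],
  "usage": {"input_tokens": 124, "output_tokens": 312, "total_tokens": 436},
  "created_at": 1748284800,
  "completed_at": 1748284805,
  "metadata": null
}
"""


def elapsed_seconds(resp: dict) -> Optional[int]:
    """Time between creation and completion, or None if either is missing."""
    created = resp.get("created_at")
    completed = resp.get("completed_at")
    if created is None or completed is None:
        return None
    return completed - created


def summarize(resp: dict) -> dict:
    """Pick out the fields a caller usually needs after a request."""
    usage = resp.get("usage") or {}
    return {
        "id": resp["id"],                  # pass as previous_response_id next time
        "provider": resp["provider"],
        "status": resp["status"],
        "elapsed_seconds": elapsed_seconds(resp),
        "total_tokens": usage.get("total_tokens"),
    }


resp = json.loads(SAMPLE)
print(summarize(resp)["elapsed_seconds"])  # 5
```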

Cross-provider continuation

Continuation across different providers is not supported and returns FILE_PROVIDER_MISMATCH. Providers track conversation context differently, and there is no portable protocol for thread state, so a continuation request must hit the same provider that produced the previous response.
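One way to avoid the mismatch on the client side is to pin each continuation to the provider recorded on the previous response. The sketch below assumes the caller still has the earlier response body at hand; `continuation_payload` is a hypothetical helper, and the local check only mirrors the rule above rather than replacing the gateway's own validation.

```python
from typing import Optional


def continuation_payload(previous: dict, model: str, input_text: str,
                         provider: Optional[str] = None) -> dict:
    """Build a continuation body pinned to the provider of the previous response."""
    previous_provider = previous["provider"]
    if provider is not None and provider != previous_provider:
        # Fail before sending; the gateway would reject this request anyway.
        raise ValueError(
            f"cannot continue a {previous_provider!r} response on {provider!r}"
        )
    return {
        "model": model,
        "input": input_text,
        "provider": previous_provider,            # override routing explicitly
        "previous_response_id": previous["id"],   # gateway ID, not the provider's
    }


previous = {"id": "resp-demo-1", "provider": "openai"}  # illustrative values
print(continuation_payload(previous, "gpt-4o", "And what about alpha decay?"))
```

Setting `provider` explicitly means a later change to the routing config cannot silently send the continuation elsewhere.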

Ephemeral responses

Set "store": false to skip persistence. The gateway returns the provider’s output but doesn’t write a ResponseSession record. This is useful for high-throughput stateless workflows where you don’t need history.
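For a stateless call, the only change on the client side is the `store` flag in the request body. A minimal sketch, with `ephemeral_body` as a hypothetical helper name:

```python
import json


def ephemeral_body(model: str, input_text: str) -> bytes:
    """Serialize a request body that asks the gateway not to persist the response."""
    payload = {"model": model, "input": input_text, "store": False}
    # Python's False is serialized as JSON false.
    return json.dumps(payload).encode("utf-8")


print(ephemeral_body("gpt-4o", "Classify this ticket: printer offline").decode("utf-8"))
```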