Vector Stores
Vector stores are named, queryable collections of file-derived chunks
plus their embeddings. The gateway owns the chunking, embedding, and
search pipeline, so no per-tenant MongoDB Atlas Vector Search index is
needed: cosine similarity is computed in-app over a discriminated subset
of the existing DocumentChunk collection.
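To make the in-app scoring concrete, here is a minimal Python sketch of cosine similarity computed over only the chunks that belong to one store. The `Chunk` shape, the `vector_store_id` discriminator field, and the function names are assumptions for illustration; they are not the gateway's actual schema or code.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical shape; the real DocumentChunk schema is not shown on this page.
    vector_store_id: str
    content: str
    embedding: list


def cosine_similarity(a, b):
    """Return dot(a, b) / (|a| * |b|), or 0.0 if either vector has zero length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_store(chunks, store_id, query_embedding):
    """Score only the chunks scoped to store_id, best match first."""
    scoped = [c for c in chunks if c.vector_store_id == store_id]
    scored = [(cosine_similarity(query_embedding, c.embedding), c) for c in scoped]
    # Sort on the score alone so Chunk objects are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Filtering before scoring is what keeps one store's chunks from leaking into another store's results, even though all chunks share a single collection.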
Required capability: vector-stores
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/vector_stores | Create a store (optionally seeded with file IDs). |
| GET | /v1/vector_stores | List stores with cursor pagination. |
| GET | /v1/vector_stores/:id | Fetch a single store with its file and chunk counts. |
| PATCH | /v1/vector_stores/:id | Rename, change TTL, update metadata. |
| DELETE | /v1/vector_stores/:id | Delete the store and cascade its chunks. |
| POST | /v1/vector_stores/:id/files | Attach more files to an existing store. |
| POST | /v1/vector_stores/:id/search | Run a cosine-similarity search. |
Create

    curl https://your-gateway.example.com/v1/vector_stores \
      -H "Authorization: Bearer aigw_sk_your_api_key" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "engineering-handbook",
        "file_ids": ["file-7c8b...", "file-9d2a..."],
        "embedding_model": "nomic-embed-text",
        "expires_in_seconds": 2592000
      }'

Request fields
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Store display name. |
| file_ids | string[] | No | Gateway file IDs to seed. Each file is downloaded, chunked, embedded, and persisted as scoped chunks. |
| embedding_model | string | No | Defaults to the tenant’s semantic-cache embedding model. |
| expires_in_seconds | integer | No | TTL for the store (and its chunks). |
| metadata | object | No | Free-form tenant metadata. |
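
For callers who prefer Python to curl, the create request could be built roughly as follows. This is a sketch using only the standard library; `build_create_request` is a hypothetical helper, and the host and API key are the same placeholders used in the curl example.

```python
import json
import urllib.request


def build_create_request(base_url, api_key, name, file_ids=None,
                         embedding_model=None, expires_in_seconds=None,
                         metadata=None):
    """Build (but do not send) a POST /v1/vector_stores request."""
    payload = {"name": name}  # name is the only required field
    optional = {
        "file_ids": file_ids,
        "embedding_model": embedding_model,
        "expires_in_seconds": expires_in_seconds,
        "metadata": metadata,
    }
    # Leave unset optional fields out of the body entirely.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/vector_stores",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request(
    "https://your-gateway.example.com",
    "aigw_sk_your_api_key",
    name="engineering-handbook",
    embedding_model="nomic-embed-text",
    expires_in_seconds=2592000,  # 30 days
)
# Send with urllib.request.urlopen(req) once real credentials are in place.
```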
Response

    {
      "id": "vs-7c8b9d3e-...",
      "object": "vector_store",
      "name": "engineering-handbook",
      "embedding_model": "nomic-embed-text",
      "file_counts": { "total": 2 },
      "chunk_count": 87,
      "status": "in_progress",
      "created_at": 1748284800,
      "expires_at": 1750876800,
      "metadata": null
    }

status transitions from in_progress to completed once every
constituent file has finished chunking and embedding.
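
Because a new store starts as in_progress, a client will usually poll GET /v1/vector_stores/:id until the status changes. The sketch below assumes a caller-supplied `fetch_store` function that returns the parsed store object; that name, the polling interval, and the timeout are illustrative choices rather than gateway requirements. It returns on any status other than in_progress and leaves interpretation to the caller, since only in_progress and completed are described here.

```python
import time


def wait_until_completed(fetch_store, store_id, interval_s=2.0, timeout_s=300.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the store leaves in_progress, then return the store object.

    fetch_store(store_id) is a hypothetical callable wrapping
    GET /v1/vector_stores/:id and returning the decoded JSON body.
    """
    deadline = clock() + timeout_s
    while True:
        store = fetch_store(store_id)
        if store["status"] != "in_progress":
            return store
        if clock() >= deadline:
            raise TimeoutError(f"{store_id} still in_progress after {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real delays.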
Attach files

    curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/files \
      -H "Authorization: Bearer aigw_sk_your_api_key" \
      -H "Content-Type: application/json" \
      -d '{ "file_ids": ["file-new-..."] }'

Search

    curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/search \
      -H "Authorization: Bearer aigw_sk_your_api_key" \
      -H "Content-Type: application/json" \
      -d '{
        "query": "how do we rotate encryption keys?",
        "top_k": 5,
        "threshold": 0.5
      }'

| Field | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | The search text; embedded with the store’s embedding model. |
| top_k | integer | No | Maximum chunks to return, between 1 and 50. Defaults to 5. |
| threshold | number | No | Minimum cosine-similarity score, between 0 and 1. Chunks below this are dropped. |
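
The ranges in the table can be checked client-side before a request is sent. The following is a sketch only: `build_search_body` is a hypothetical helper, and because this page does not say whether the gateway rejects or clamps out-of-range values, this version simply raises.

```python
def build_search_body(query, top_k=5, threshold=None):
    """Validate search parameters against the documented ranges."""
    # Treating an empty query as missing is an assumption.
    if not isinstance(query, str) or not query:
        raise ValueError("query is required")
    if not isinstance(top_k, int) or not 1 <= top_k <= 50:
        raise ValueError("top_k must be an integer between 1 and 50")
    body = {"query": query, "top_k": top_k}
    if threshold is not None:
        if not 0 <= threshold <= 1:
            raise ValueError("threshold must be between 0 and 1")
        body["threshold"] = threshold
    return body
```
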
Returns ranked chunks scoped to the store:

    {
      "object": "list",
      "data": [
        {
          "file_id": "file-7c8b9d3e-...",
          "document_id": "65fe...",
          "chunk_index": 12,
          "content": "Key rotation runs daily at 02:00 UTC...",
          "score": 0.8721
        }
      ]
    }

Chat-side file_search tool
When you include a file_search built-in tool in a chat request with
vector_store_ids referencing gateway-owned stores, the gateway performs
the retrieval automatically and injects the top chunks as system context
before dispatching to the provider:

    {
      "model": "gpt-4o",
      "messages": [{"role": "user", "content": "How do we rotate keys?"}],
      "tools": [{
        "type": "built_in",
        "built_in": "file_search",
        "config": { "vectorStoreIds": ["vs-7c8b..."], "topK": 5 }
      }]
    }

The file_search tool is stripped from the downstream request so the
provider doesn’t attempt its own retrieval against its own vector-store
namespace.
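
The retrieve, inject, and strip sequence described above might look roughly like the following Python sketch. Everything here is illustrative: the `search` callable, the choice of the latest user message as the query, the default of 5 when topK is absent, and the placement and wording of the injected system message are all assumptions, not the gateway's actual behavior.

```python
import copy


def prepare_downstream_request(request, search):
    """Resolve file_search tools into system context and strip them.

    search(store_id, query, top_k) is a hypothetical callable returning a
    list of {"content": ..., "score": ...} dicts.
    """
    out = copy.deepcopy(request)  # never mutate the caller's request
    # Assumption: the retrieval query is the most recent user message.
    query = next((m["content"] for m in reversed(out["messages"])
                  if m["role"] == "user"), "")
    kept_tools, chunks = [], []
    for tool in out.get("tools", []):
        if tool.get("type") == "built_in" and tool.get("built_in") == "file_search":
            cfg = tool.get("config", {})
            top_k = cfg.get("topK", 5)  # assumed default
            found = []
            for store_id in cfg.get("vectorStoreIds", []):
                found.extend(search(store_id, query, top_k))
            # Merge results across stores and keep the best top_k overall.
            found.sort(key=lambda c: c["score"], reverse=True)
            chunks.extend(found[:top_k])
        else:
            kept_tools.append(tool)
    if chunks:
        context = "\n\n".join(c["content"] for c in chunks)
        out["messages"].insert(0, {"role": "system", "content": context})
    if kept_tools:
        out["tools"] = kept_tools
    else:
        out.pop("tools", None)  # provider never sees file_search
    return out
```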
File requirements
A file can only be attached to a vector store if the gateway can
download its bytes from the source provider. Today that means the
provider’s adapter must implement downloadFile(). The OpenAI and
Anthropic adapters do; files from other providers are skipped, each with
a warn-level log entry, and the store still reaches completed if any
file succeeds.
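
The skip-and-continue behavior can be sketched as below. This is an illustration under stated assumptions, not the gateway's code: the file and adapter shapes are hypothetical, `download_file` stands in for the adapter's downloadFile(), and the status returned when every file is skipped is a placeholder, since this page only covers the case where at least one file succeeds.

```python
import logging

logger = logging.getLogger("vector_stores")


def ingest_files(files, adapters, process):
    """Ingest each file whose provider adapter can download it.

    files: list of {"id": ..., "provider": ...} dicts (hypothetical shape).
    adapters: maps provider name to an adapter object; only adapters with a
        callable download_file attribute are usable.
    process: callable taking (file_id, data) that chunks and embeds the bytes.
    Returns (ingested_ids, skipped_ids, status).
    """
    ingested, skipped = [], []
    for f in files:
        adapter = adapters.get(f["provider"])
        download = getattr(adapter, "download_file", None)
        if not callable(download):
            # One warn-level entry per skipped file.
            logger.warning("skipping %s: provider %s cannot download files",
                           f["id"], f["provider"])
            skipped.append(f["id"])
            continue
        process(f["id"], download(f["id"]))
        ingested.append(f["id"])
    # "failed" for the all-skipped case is a placeholder assumption.
    status = "completed" if ingested else "failed"
    return ingested, skipped, status
```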