Vector Stores

Vector stores are named, queryable collections of file-derived chunks and their embeddings. The gateway owns the chunking, embedding, and search pipeline, so no per-tenant MongoDB Atlas Vector Search index is needed: cosine similarity is computed in-app over a discriminated subset of the existing DocumentChunk collection.

Required capability: vector-stores

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/vector_stores | Create a store (optionally seeded with file IDs). |
| GET | /v1/vector_stores | List stores with cursor pagination. |
| GET | /v1/vector_stores/:id | Fetch a single store with its file and chunk counts. |
| PATCH | /v1/vector_stores/:id | Rename, change TTL, update metadata. |
| DELETE | /v1/vector_stores/:id | Delete the store and cascade its chunks. |
| POST | /v1/vector_stores/:id/files | Attach more files to an existing store. |
| POST | /v1/vector_stores/:id/search | Run a cosine-similarity search. |

Create

curl https://your-gateway.example.com/v1/vector_stores \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "engineering-handbook",
    "file_ids": ["file-7c8b...", "file-9d2a..."],
    "embedding_model": "nomic-embed-text",
    "expires_in_seconds": 2592000
  }'

Request fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Store display name. |
| file_ids | string[] | No | Gateway file IDs to seed. Each file is downloaded, chunked, embedded, and persisted as scoped chunks. |
| embedding_model | string | No | Defaults to the tenant’s semantic-cache embedding model. |
| expires_in_seconds | integer | No | TTL for the store (and its chunks). |
| metadata | object | No | Free-form tenant metadata. |
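
As a rough illustration of the field rules above, the following Python sketch builds the same request as the curl example without sending it. The helper name build_create_request is hypothetical, the host and key are the placeholders from the curl example, and dropping unset optional fields so the gateway applies its defaults is an assumption about sensible client behaviour rather than documented gateway behaviour.

```python
import json
import urllib.request

GATEWAY_URL = "https://your-gateway.example.com"  # placeholder host from the curl example
API_KEY = "aigw_sk_your_api_key"                  # placeholder key from the curl example


def build_create_request(name, file_ids=None, embedding_model=None,
                         expires_in_seconds=None, metadata=None):
    """Build (but do not send) a POST /v1/vector_stores request."""
    body = {"name": name}  # the only required field
    optional = {
        "file_ids": file_ids,
        "embedding_model": embedding_model,
        "expires_in_seconds": expires_in_seconds,
        "metadata": metadata,
    }
    # Assumption: leave out unset optional fields so the gateway uses its defaults.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{GATEWAY_URL}/v1/vector_stores",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request(
    "engineering-handbook",
    embedding_model="nomic-embed-text",
    expires_in_seconds=30 * 24 * 60 * 60,  # 30 days = 2592000 seconds
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.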

Response

{
  "id": "vs-7c8b9d3e-...",
  "object": "vector_store",
  "name": "engineering-handbook",
  "embedding_model": "nomic-embed-text",
  "file_counts": { "total": 2 },
  "chunk_count": 87,
  "status": "in_progress",
  "created_at": 1748284800,
  "expires_at": 1750876800,
  "metadata": null
}

status transitions from in_progress to completed once every constituent file has finished chunking and embedding.
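
Because a freshly created store may still be in_progress, a client usually has to poll before searching. The sketch below is one way to do that, under stated assumptions: fetch_store is a hypothetical callable that performs GET /v1/vector_stores/:id and returns the decoded JSON, and the interval and timeout values are arbitrary. Only the in_progress and completed statuses described above are handled.

```python
import time


def wait_until_completed(fetch_store, store_id, interval_s=2.0, timeout_s=300.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll a store until its status is "completed" or the timeout passes.

    fetch_store is a hypothetical callable wrapping GET /v1/vector_stores/:id.
    """
    deadline = clock() + timeout_s
    while True:
        store = fetch_store(store_id)
        if store["status"] == "completed":
            return store
        if clock() >= deadline:
            raise TimeoutError(
                f"{store_id} still {store['status']!r} after {timeout_s}s"
            )
        sleep(interval_s)


# Offline demonstration: a fake fetcher that reports completed on the third poll.
responses = iter(["in_progress", "in_progress", "completed"])


def fake_fetch(store_id):
    return {"id": store_id, "status": next(responses)}


store = wait_until_completed(fake_fetch, "vs-abc", sleep=lambda seconds: None)
print(store["status"])  # completed
```

Injecting sleep and clock keeps the loop testable without real delays.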

Attach files

curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/files \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "file_ids": ["file-new-..."] }'

Search

curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/search \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "query": "how do we rotate encryption keys?", "top_k": 5, "threshold": 0.5 }'

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The search text; embedded with the store’s embedding model. |
| top_k | integer | No | Maximum chunks to return, between 1 and 50. Defaults to 5. |
| threshold | number | No | Minimum cosine-similarity score, between 0 and 1. Chunks below this are dropped. |
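
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch; build_search_body is a hypothetical helper, and whether the gateway itself rejects or clamps out-of-range values is not stated here, so the sketch simply refuses to build such a body.

```python
def build_search_body(query, top_k=5, threshold=None):
    """Validate search parameters against the documented ranges and build the JSON body."""
    if not isinstance(query, str) or not query:
        raise ValueError("query is required and must be a non-empty string")
    if not isinstance(top_k, int) or not 1 <= top_k <= 50:
        raise ValueError("top_k must be an integer between 1 and 50")
    body = {"query": query, "top_k": top_k}
    if threshold is not None:
        if not 0 <= threshold <= 1:
            raise ValueError("threshold must be between 0 and 1")
        body["threshold"] = threshold
    return body


body = build_search_body("how do we rotate encryption keys?", top_k=5, threshold=0.5)
print(body)
```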

Returns ranked chunks scoped to the store:

{
  "object": "list",
  "data": [
    {
      "file_id": "file-7c8b9d3e-...",
      "document_id": "65fe...",
      "chunk_index": 12,
      "content": "Key rotation runs daily at 02:00 UTC...",
      "score": 0.8721
    }
  ]
}
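
To make the ranking concrete, here is a small sketch of an in-app cosine-similarity search over already-embedded chunks. It is not the gateway's actual code: the embedding key on each chunk, the toy two-dimensional vectors, and the rounding of scores to four decimals (chosen to resemble the response above) are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_chunks(query_embedding, chunks, top_k=5, threshold=None):
    """Score every chunk, drop those below threshold, return the top_k best."""
    scored = []
    for chunk in chunks:
        score = cosine_similarity(query_embedding, chunk["embedding"])
        if threshold is not None and score < threshold:
            continue  # below the minimum similarity: dropped
        result = {k: v for k, v in chunk.items() if k != "embedding"}
        result["score"] = round(score, 4)
        scored.append(result)
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:top_k]


# Toy data: the "embedding" field name and the vectors are hypothetical.
chunks = [
    {"chunk_index": 3, "content": "partly related", "embedding": [1.0, 1.0]},
    {"chunk_index": 12, "content": "closely related", "embedding": [1.0, 0.0]},
    {"chunk_index": 7, "content": "unrelated", "embedding": [0.0, 1.0]},
]
results = search_chunks([1.0, 0.0], chunks, top_k=5, threshold=0.5)
print([(r["chunk_index"], r["score"]) for r in results])  # [(12, 1.0), (3, 0.7071)]
```

A linear scan like this scales with the number of chunks in the store, which is the usual trade-off of computing similarity in-app instead of using a dedicated vector index.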

Chat-side file_search tool

When you include a file_search built-in tool in a chat request with vector_store_ids referencing gateway-owned stores, the gateway performs the retrieval automatically and injects the top chunks as system context before dispatching to the provider:

{
  "model": "gpt-4o",
  "messages": [{"role": "user", "content": "How do we rotate keys?"}],
  "tools": [{
    "type": "built_in",
    "built_in": "file_search",
    "config": { "vectorStoreIds": ["vs-7c8b..."], "topK": 5 }
  }]
}

The file_search tool is stripped from the downstream request so the provider doesn’t attempt its own retrieval against its own vector-store namespace.
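
The retrieve, inject, and strip steps can be sketched as a pure transformation of the chat request. Everything beyond those three steps is an assumption: retrieve is a hypothetical callable standing in for the store search, the latest user message is used as the query, topK falls back to 5, and the injected system message is a plain concatenation of chunk contents. The gateway's real prompt format is not documented here.

```python
def is_file_search(tool):
    return tool.get("type") == "built_in" and tool.get("built_in") == "file_search"


def apply_file_search(chat_request, retrieve):
    """Return a copy of chat_request with retrieved chunks injected and file_search removed.

    retrieve(store_id, query, top_k) is a hypothetical callable returning chunk dicts.
    """
    tools = chat_request.get("tools", [])
    fs_tools = [t for t in tools if is_file_search(t)]
    if not fs_tools:
        return chat_request
    # Assumption: the most recent user message is the retrieval query.
    query = next(
        (m["content"] for m in reversed(chat_request["messages"]) if m["role"] == "user"),
        "",
    )
    chunks = []
    for tool in fs_tools:
        config = tool.get("config", {})
        for store_id in config.get("vectorStoreIds", []):
            chunks.extend(retrieve(store_id, query, config.get("topK", 5)))
    outbound = dict(chat_request)
    messages = list(chat_request["messages"])
    if chunks:
        # Assumption: chunks are joined into one system message placed first.
        context = "\n\n".join(c["content"] for c in chunks)
        messages = [{"role": "system", "content": context}] + messages
    outbound["messages"] = messages
    remaining = [t for t in tools if not is_file_search(t)]
    if remaining:
        outbound["tools"] = remaining
    else:
        outbound.pop("tools", None)  # nothing left to send downstream
    return outbound


request = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "How do we rotate keys?"}],
    "tools": [{
        "type": "built_in",
        "built_in": "file_search",
        "config": {"vectorStoreIds": ["vs-demo"], "topK": 5},
    }],
}


def fake_retrieve(store_id, query, top_k):
    return [{"content": "Key rotation runs daily at 02:00 UTC..."}][:top_k]


outbound = apply_file_search(request, fake_retrieve)
print(outbound["messages"][0]["role"], "tools" in outbound)  # system False
```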

File requirements

A file can only be attached to a vector store if the gateway can download its bytes from the source provider. Today that means the provider’s adapter must implement downloadFile(): the OpenAI and Anthropic adapters do. Files from other providers are skipped with a per-file warn-level log entry; the store still reaches completed if any file succeeds.
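
The skip-and-continue behaviour can be pictured with a short sketch. The adapter classes, the file dictionaries, and the Python method name download_file (standing in for downloadFile()) are hypothetical. What happens when no file succeeds is not stated above, so the sketch only reports which files were ingested and which were skipped.

```python
import logging

logger = logging.getLogger("vector_stores")


class DownloadableAdapter:
    """Hypothetical adapter for a provider that supports file download."""

    def download_file(self, file_id):
        return f"bytes of {file_id}".encode("utf-8")


class NoDownloadAdapter:
    """Hypothetical adapter for a provider without file download."""


def ingest_files(files, adapters):
    """Download each file whose provider adapter supports it; warn and skip the rest."""
    ingested, skipped = [], []
    for f in files:
        adapter = adapters.get(f["provider"])
        download = getattr(adapter, "download_file", None)
        if not callable(download):
            logger.warning("skipping %s: provider %s cannot download files",
                           f["id"], f["provider"])
            skipped.append(f["id"])
            continue
        ingested.append((f["id"], download(f["id"])))
    return ingested, skipped


adapters = {"openai": DownloadableAdapter(), "other": NoDownloadAdapter()}
files = [
    {"id": "file-a", "provider": "openai"},
    {"id": "file-b", "provider": "other"},
]
ingested, skipped = ingest_files(files, adapters)
print([file_id for file_id, _ in ingested], skipped)  # ['file-a'] ['file-b']
```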