Vector Stores

A vector store is a named, queryable collection of file-derived chunks and their embeddings. The gateway owns the chunking, embedding, and search pipeline, so no per-tenant MongoDB Atlas Vector Search index is needed: cosine similarity is computed in-app over a discriminated subset of the existing `DocumentChunk` collection.

Required capability: `vector-stores`

Endpoints

Method	Path	Description
`POST`	`/v1/vector_stores`	Create a store (optionally seeded with file IDs).
`GET`	`/v1/vector_stores`	List stores with cursor pagination.
`GET`	`/v1/vector_stores/:id`	Fetch a single store with its file and chunk counts.
`PATCH`	`/v1/vector_stores/:id`	Rename, change TTL, update metadata.
`DELETE`	`/v1/vector_stores/:id`	Delete the store and cascade its chunks.
`POST`	`/v1/vector_stores/:id/files`	Attach more files to an existing store.
`POST`	`/v1/vector_stores/:id/search`	Run a cosine-similarity search.

Create

curl https://your-gateway.example.com/v1/vector_stores \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "engineering-handbook",
    "file_ids": ["file-7c8b...", "file-9d2a..."],
    "embedding_model": "nomic-embed-text",
    "expires_in_seconds": 2592000
  }'

Request fields

Field	Type	Required	Description
`name`	`string`	Yes	Store display name.
`file_ids`	`string[]`	No	Gateway file IDs to seed. Each file is downloaded, chunked, embedded, and persisted as scoped chunks.
`embedding_model`	`string`	No	Defaults to the tenant’s semantic-cache embedding model.
`expires_in_seconds`	`integer`	No	TTL for the store (and its chunks).
`metadata`	`object`	No	Free-form tenant metadata.

Response

{
  "id": "vs-7c8b9d3e-...",
  "object": "vector_store",
  "name": "engineering-handbook",
  "embedding_model": "nomic-embed-text",
  "file_counts": { "total": 2 },
  "chunk_count": 87,
  "status": "in_progress",
  "created_at": 1748284800,
  "expires_at": 1750876800,
  "metadata": null
}
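
The sample request and response in this section are consistent with `expires_at` being `created_at` plus `expires_in_seconds`. The following Python check works through that arithmetic. The additive relationship is inferred from the sample values, not stated by the API, so treat it as an assumption.

```python
from datetime import datetime, timezone

created_at = 1748284800        # "created_at" from the sample response
expires_in_seconds = 2592000   # "expires_in_seconds" from the sample request

# Assumption: the gateway derives expires_at by simple addition.
expires_at = created_at + expires_in_seconds

# Express the TTL in days and render both timestamps in UTC for readability.
ttl_days = expires_in_seconds / 86400
created = datetime.fromtimestamp(created_at, tz=timezone.utc)
expires = datetime.fromtimestamp(expires_at, tz=timezone.utc)

print(f"TTL: {ttl_days} days")
print(f"created: {created.isoformat()}  expires: {expires.isoformat()}")
```

With these values the TTL works out to 30 days, and the computed `expires_at` matches the `1750876800` shown in the sample response.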

`status` transitions from `in_progress` to `completed` once every constituent file has finished chunking and embedding.
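
Because ingestion is asynchronous, a client may want to wait for the store to leave `in_progress` before searching it. The sketch below shows one way to poll. `fetch_store` is a hypothetical callable that wraps `GET /v1/vector_stores/:id` and returns the parsed JSON body; the interval and timeout defaults are arbitrary illustrations, not gateway recommendations.

```python
import time
from typing import Any, Callable, Dict


def wait_until_completed(
    fetch_store: Callable[[str], Dict[str, Any]],
    store_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a store until its status is no longer "in_progress".

    fetch_store is a hypothetical wrapper around GET /v1/vector_stores/:id.
    Raises TimeoutError if the store is still in progress at the deadline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        store = fetch_store(store_id)
        if store.get("status") != "in_progress":
            return store  # caller inspects the final status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{store_id} still in_progress after {timeout_s}s")
        sleep(interval_s)
```

Passing the fetcher in as a parameter keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.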

Attach files

curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/files \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "file_ids": ["file-new-..."] }'

Search

curl -X POST https://your-gateway.example.com/v1/vector_stores/vs-abc/search \
  -H "Authorization: Bearer aigw_sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "query": "how do we rotate encryption keys?", "top_k": 5, "threshold": 0.5 }'

Field	Type	Required	Description
`query`	`string`	Yes	The search text; embedded with the store’s embedding model.
`top_k`	`integer`	No	Maximum chunks to return, between `1` and `50`. Defaults to `5`.
`threshold`	`number`	No	Minimum cosine-similarity score, between `0` and `1`. Chunks below this are dropped.

Returns ranked chunks scoped to the store:

{
  "object": "list",
  "data": [
    {
      "file_id": "file-7c8b9d3e-...",
      "document_id": "65fe...",
      "chunk_index": 12,
      "content": "Key rotation runs daily at 02:00 UTC...",
      "score": 0.8721
    }
  ]
}
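
To make the `top_k` and `threshold` semantics concrete, here is a minimal sketch of in-app cosine-similarity ranking. It is not the gateway's implementation: the `embedding` field name, the in-memory list of chunks, and the behaviour when `threshold` is omitted (no filtering) are all assumptions made for illustration.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: Sequence[float],
    chunks: List[Dict[str, Any]],
    top_k: int = 5,
    threshold: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Rank chunks by similarity to query_vec, mirroring the documented fields."""
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")

    scored = []
    for chunk in chunks:
        score = cosine(query_vec, chunk["embedding"])  # hypothetical field name
        # Chunks scoring below the threshold are dropped.
        if threshold is not None and score < threshold:
            continue
        # Return everything except the raw embedding, plus the score.
        hit = {k: v for k, v in chunk.items() if k != "embedding"}
        hit["score"] = score
        scored.append(hit)

    # Highest score first, then cap at top_k.
    scored.sort(key=lambda h: h["score"], reverse=True)
    return scored[:top_k]
```

Note the order of operations in this sketch: the threshold filter runs before the `top_k` cap, so a high threshold can yield fewer than `top_k` results.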

Chat-side `file_search` tool

When a chat request includes a `file_search` built-in tool whose vector store IDs reference gateway-owned stores, the gateway performs the retrieval automatically and injects the top chunks as system context before dispatching to the provider:

{
  "model": "gpt-4o",
  "messages": [{"role":"user","content":"How do we rotate keys?"}],
  "tools": [{
    "type": "built_in",
    "built_in": "file_search",
    "config": { "vectorStoreIds": ["vs-7c8b..."], "topK": 5 }
  }]
}

The `file_search` tool is stripped from the downstream request so the provider doesn’t attempt its own retrieval against its own vector-store namespace.
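
The retrieve, inject, and strip steps described above can be sketched as a single request transformation. This is an illustration only. The `search` callable is a hypothetical stand-in for the store search, and several details are assumptions: that the query is the last user message, that the context is prepended as one system message, and the wording of that message.

```python
import copy
from typing import Any, Callable, Dict, List

SearchFn = Callable[[str, str, int], List[Dict[str, Any]]]


def last_user_text(request: Dict[str, Any]) -> str:
    """Return the most recent plain-text user message (assumed to be the query)."""
    for message in reversed(request.get("messages", [])):
        if message.get("role") == "user" and isinstance(message.get("content"), str):
            return message["content"]
    return ""


def apply_file_search(request: Dict[str, Any], search: SearchFn) -> Dict[str, Any]:
    """Resolve file_search tools into system context and strip them."""
    out = copy.deepcopy(request)  # leave the caller's request untouched
    query = last_user_text(out)
    kept_tools, chunks = [], []

    for tool in out.get("tools", []):
        if tool.get("type") == "built_in" and tool.get("built_in") == "file_search":
            config = tool.get("config", {})
            for store_id in config.get("vectorStoreIds", []):
                chunks.extend(search(store_id, query, config.get("topK", 5)))
        else:
            kept_tools.append(tool)  # other tools pass through unchanged

    # Strip file_search so the provider does not run its own retrieval.
    if kept_tools:
        out["tools"] = kept_tools
    else:
        out.pop("tools", None)

    if chunks:
        body = "\n\n".join(chunk["content"] for chunk in chunks)
        # Assumed message format; the real injected prompt is not documented here.
        out["messages"].insert(0, {"role": "system", "content": "Relevant context:\n" + body})
    return out
```

Working on a deep copy keeps the original request available for logging or retries, which is a design choice of this sketch and not a documented gateway behaviour.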

File requirements

A file can only be attached to a vector store if the gateway can download its bytes from the source provider. Today that means the provider’s adapter must implement `downloadFile()`. The OpenAI and Anthropic adapters do; files from other providers are skipped, each with a warn-level log entry. The store still reaches `completed` as long as at least one file succeeds.
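
The eligibility rule can be pictured as a capability check on each file's adapter. The sketch below is a Python analogue under stated assumptions: the adapter classes, the `download_file` method name (standing in for `downloadFile()`), the `provider` field on each file, and the log wording are all hypothetical.

```python
import logging
from typing import Any, Dict, Iterable, List, Tuple

logger = logging.getLogger("vector_stores")


class DownloadingAdapter:
    """Hypothetical adapter that can fetch file bytes from its provider."""

    def download_file(self, file_id: str) -> bytes:
        return b""  # a real adapter would call the provider's API here


class MetadataOnlyAdapter:
    """Hypothetical adapter with no download support."""


def partition_files(
    files: Iterable[Dict[str, Any]],
    adapters: Dict[str, Any],
) -> Tuple[List[str], List[str]]:
    """Split file IDs into those that can be ingested and those skipped."""
    ingestible, skipped = [], []
    for f in files:
        adapter = adapters.get(f["provider"])
        # A file is eligible only if its adapter exposes a download method.
        if callable(getattr(adapter, "download_file", None)):
            ingestible.append(f["id"])
        else:
            logger.warning("skipping %s: provider %s cannot download file bytes",
                           f["id"], f["provider"])
            skipped.append(f["id"])
    return ingestible, skipped
```

Skipping with a log entry, instead of failing the whole request, matches the behaviour described above, where one unsupported file does not prevent the others from being ingested.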