Provider Adapters
The adapter pattern is the core abstraction that allows the gateway to support 38 AI providers through a single unified API. Each adapter translates requests and responses between the gateway’s internal format and a provider’s native API.
The Adapter Pattern
    Unified Request                    Provider-Specific Request
    (UnifiedRequest)                     (e.g., OpenAI format)
          |                                        ^
          |  toProviderRequest()                   |
          +--------------------------------------->+
                                                   |
                                             Provider API
                                                   |
          +<---------------------------------------+
          |  fromProviderResponse()                |
          v                                        v
    Unified Response                  Provider-Specific Response
    (UnifiedResponse)

BaseProviderAdapter
All adapters extend BaseProviderAdapter, which provides:
| Method | Purpose |
|---|---|
| toProviderRequest() | Convert UnifiedRequest to provider-specific request body (abstract) |
| fromProviderResponse() | Convert provider response to UnifiedResponse (abstract) |
| execute() | Send a non-streaming request to the provider (abstract) |
| executeStream() | Send a streaming request, returning an AsyncGenerator<UnifiedStreamChunk> (abstract) |
| fromProviderStreamChunk() | Transform a single SSE chunk to UnifiedStreamChunk (abstract) |
| healthCheck() | Verify provider connectivity (abstract) |
| listModels() | List available models from the provider (abstract) |
| buildHeaders() | Construct HTTP headers with API key authentication |
| httpRequest() | Make HTTP requests with timeout and SSRF validation |
| parseSSEStream() | Parse Server-Sent Events from a fetch Response |
| validateUrl() | Block SSRF vectors (private IPs, internal hosts) |
| sanitizeProviderError() | Strip API keys from error messages |
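To illustrate what a helper such as parseSSEStream() has to handle, the following is a minimal sketch of SSE framing. It is not the gateway's implementation: the names `splitSSEEvents` and `parseSSE` are hypothetical, the input is modeled as already-decoded text chunks rather than a fetch Response, and the `[DONE]` sentinel is assumed to follow the OpenAI-style convention.

```typescript
// Hypothetical helper (not the gateway's code): splits buffered SSE text into
// complete `data:` payloads and returns the incomplete tail for further buffering.
interface SSEParseResult {
  payloads: string[]; // one entry per complete event; multiple data lines joined with "\n"
  rest: string;       // trailing partial event that still needs more bytes
}

function splitSSEEvents(buffer: string): SSEParseResult {
  // Normalize CRLF so that events are always separated by a blank line ("\n\n").
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  // The last block is incomplete unless the buffer ended with a blank line.
  const rest = blocks.pop() ?? "";
  const payloads: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, "")); // drop one optional leading space
    if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
  }
  return { payloads, rest };
}

// Streams payloads from a chunked text source, stopping at the assumed
// OpenAI-style "[DONE]" sentinel.
async function* parseSSE(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of chunks) {
    const { payloads, rest } = splitSSEEvents(buffer + chunk);
    buffer = rest;
    for (const payload of payloads) {
      if (payload === "[DONE]") return;
      yield payload;
    }
  }
}
```

Keeping the framing logic in a pure function makes it easy to unit-test chunk boundaries that fall in the middle of an event, which is where hand-written SSE parsers most often go wrong.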
Optional capability methods have default implementations that throw “not supported”:
- executeEmbedding(): Vector embeddings
- executeAudio(): Audio transcription/translation
- executeImageGeneration(): Image generation
- executeTextToSpeech(): Text-to-speech
- executeRerank(): Document reranking
- executeVideoGeneration(): Video generation
- executeMusicGeneration(): Music synthesis (Stable Audio, Suno, Udio, etc.)
- executeOcr(): Document OCR (Mistral OCR, Google Document AI, AWS Textract)
- executeTokenCount(): Local token-count endpoint (Anthropic /v1/messages/count_tokens)
- executeModeration(): Content moderation (OpenAI / Mistral /v1/moderations)
- executeBatchSubmit() / executeBatchStatus() / executeBatchCancel(): Async batch lifecycle
- uploadFile() / deleteFile() / downloadFile(): Files API delegation
- executeResponses(): Stateful Responses API (OpenAI /v1/responses)
- buildRealtimeUpstream(): Provision the upstream WebSocket URL + headers for the realtime multiplexer
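The "default implementation that throws" pattern can be sketched as follows. This is an illustration under assumptions, not the real base class: `SketchBaseAdapter`, `CapabilityNotSupportedError`, the two example adapters, and the simplified method signatures are all hypothetical.

```typescript
// Hypothetical error type; the gateway's real error class may differ.
class CapabilityNotSupportedError extends Error {
  constructor(provider: string, capability: string) {
    super(`${provider} does not support ${capability}`);
    this.name = "CapabilityNotSupportedError";
  }
}

// Stand-in for BaseProviderAdapter, reduced to two optional capabilities.
abstract class SketchBaseAdapter {
  abstract readonly providerType: string;

  // Default behavior: reject. Adapters that support embeddings override this.
  async executeEmbedding(_input: string[]): Promise<number[][]> {
    throw new CapabilityNotSupportedError(this.providerType, "embeddings");
  }

  // Default behavior: reject. Adapters that support reranking override this.
  async executeRerank(_query: string, _docs: string[]): Promise<number[]> {
    throw new CapabilityNotSupportedError(this.providerType, "rerank");
  }
}

// Inherits both defaults, so every optional capability call is rejected.
class ChatOnlyAdapter extends SketchBaseAdapter {
  readonly providerType = "chat-only";
}

// Overrides one capability; the text length stands in for a real provider call.
class EmbeddingAdapter extends SketchBaseAdapter {
  readonly providerType = "embedder";

  async executeEmbedding(input: string[]): Promise<number[][]> {
    return input.map((text) => [text.length]);
  }
}
```

With this shape, an adapter only opts in to the capabilities it overrides, and callers get a uniform error for everything else instead of a missing-method failure.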
Supported Providers (38)
| Provider | Adapter File | Key Capabilities |
|---|---|---|
| OpenAI | openai.adapter.ts | Chat, embeddings, audio, TTS, images, vision, function calling |
| Anthropic | anthropic.adapter.ts | Chat, vision, function calling, streaming |
| Azure OpenAI | azure-openai.adapter.ts | Chat, embeddings, audio, TTS, images (OpenAI models via Azure) |
| Google Gemini | google-gemini.adapter.ts | Chat, embeddings, vision, function calling |
| Groq | groq.adapter.ts | Chat, streaming (fast inference) |
| Mistral | mistral.adapter.ts | Chat, embeddings, function calling |
| Cohere | cohere.adapter.ts | Chat, embeddings, rerank |
| DeepSeek | deepseek.adapter.ts | Chat, streaming |
| Together AI | together-ai.adapter.ts | Chat, embeddings, images |
| Fireworks | fireworks.adapter.ts | Chat, embeddings, streaming |
| Perplexity | perplexity.adapter.ts | Chat (search-augmented) |
| AI21 | ai21.adapter.ts | Chat, completions |
| HuggingFace | huggingface.adapter.ts | Chat, embeddings |
| xAI | xai.adapter.ts | Chat, streaming |
| Cerebras | cerebras.adapter.ts | Chat, streaming (fast inference) |
| SambaNova | sambanova.adapter.ts | Chat, streaming |
| Ollama | ollama.adapter.ts | Chat, embeddings (local) |
| vLLM | vllm.adapter.ts | Chat, embeddings (self-hosted) |
| LM Studio | lmstudio.adapter.ts | Chat (local, OpenAI-compatible) |
| LocalAI | localai.adapter.ts | Chat, embeddings, audio, TTS, images (local) |
| llama.cpp | llamacpp.adapter.ts | Chat (local) |
| AssemblyAI | assemblyai.adapter.ts | Audio transcription |
| ElevenLabs | elevenlabs.adapter.ts | Text-to-speech |
| Whisper Local | whisper-local.adapter.ts | Audio transcription (local) |
| Replicate | replicate.adapter.ts | Images, video generation |
| ComfyUI | comfyui.adapter.ts | Images, video generation (workflow-based) |
| Stability AI | stability.adapter.ts | Image generation, music (Stable Audio 2) |
| Moonshot | moonshot.adapter.ts | Chat, streaming (Kimi models) |
| Zhipu | zhipu.adapter.ts | Chat, streaming (GLM models) |
| DashScope | dashscope.adapter.ts | Chat, embeddings (Alibaba Qwen) |
| DeepInfra | deepinfra.adapter.ts | Chat, embeddings (OAI-compat aggregator) |
| Cloudflare | cloudflare.adapter.ts | Chat (Workers AI catalog) |
| Lambda Labs | lambda-labs.adapter.ts | Chat (OAI-compat) |
| Voyage | voyage.adapter.ts | Embeddings, rerank |
| Deepgram | deepgram.adapter.ts | Audio transcription (Nova-2/3) |
| Cartesia | cartesia.adapter.ts | TTS, speech-to-speech |
| AWS Bedrock | bedrock.adapter.ts | Chat (Mantle OAI-compat front-end) |
| Vertex AI | vertex.adapter.ts | Chat, embeddings (Google OAI-compat path) |
Request/Response Translation
Each adapter translates field names and structures. For example, the OpenAI adapter:
- Maps UnifiedRequest.messages to OpenAI’s messages array format
- Converts multimodal content (images) to OpenAI’s image_url format
- Maps topP to top_p, maxTokens to max_tokens
- Adds stream_options: { include_usage: true } for streaming requests
- On response, extracts choices[0].message.content into UnifiedResponse.content
- Normalizes usage fields from prompt_tokens/completion_tokens to inputTokens/outputTokens
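The field mappings listed above can be sketched as a pair of pure functions. The unified types here are simplified assumptions (the real UnifiedRequest and UnifiedResponse presumably carry more fields), multimodal image_url conversion is left out for brevity, and the OpenAI shapes cover only the fields this section mentions.

```typescript
// Simplified, assumed shapes for the gateway's unified types.
interface UnifiedMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface UnifiedRequest {
  model: string;
  messages: UnifiedMessage[];
  topP?: number;
  maxTokens?: number;
  stream?: boolean;
}
interface UnifiedResponse {
  content: string;
  usage: { inputTokens: number; outputTokens: number };
}

// Subset of the OpenAI chat completions request and response bodies.
interface OpenAIChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  top_p?: number;
  max_tokens?: number;
  stream?: boolean;
  stream_options?: { include_usage: boolean };
}
interface OpenAIChatResponse {
  choices: { message: { content: string | null } }[];
  usage?: { prompt_tokens: number; completion_tokens: number };
}

function toProviderRequest(req: UnifiedRequest): OpenAIChatRequest {
  const body: OpenAIChatRequest = {
    model: req.model,
    messages: req.messages.map((m) => ({ role: m.role, content: m.content })),
  };
  // camelCase unified fields become snake_case provider fields.
  if (req.topP !== undefined) body.top_p = req.topP;
  if (req.maxTokens !== undefined) body.max_tokens = req.maxTokens;
  if (req.stream) {
    body.stream = true;
    // Ask the provider to include token usage in the final stream chunk.
    body.stream_options = { include_usage: true };
  }
  return body;
}

function fromProviderResponse(res: OpenAIChatResponse): UnifiedResponse {
  return {
    content: res.choices[0]?.message.content ?? "",
    usage: {
      inputTokens: res.usage?.prompt_tokens ?? 0,
      outputTokens: res.usage?.completion_tokens ?? 0,
    },
  };
}
```

Falling back to an empty string and zero counts when the provider omits a field is a choice made for this sketch; a real adapter might prefer to surface the missing data as an error.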
Adding a New Provider
To add support for a new AI provider:
- Create the adapter file at packages/server/src/providers/<name>.adapter.ts.
- Extend BaseProviderAdapter and implement the required abstract methods.
- Set providerType to a unique slug (e.g., my-provider).
- Set supportedCapabilities to the list of capabilities the provider supports.
- Register the adapter in packages/server/src/providers/factory.ts.
- Add tests in tests/unit/providers/.
The minimum implementation requires:
- toProviderRequest(): Map unified format to the provider’s API format
- fromProviderResponse(): Map the provider’s response back to unified format
- execute(): Make the HTTP call and return the parsed response
- executeStream(): Handle SSE streaming (use parseSSEStream() helper)
- fromProviderStreamChunk(): Parse individual stream chunks
- healthCheck(): Call a lightweight endpoint (e.g., list models) to verify connectivity
- listModels(): Return available models
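A skeleton covering the seven required methods might look like the sketch below. Everything in it is an assumption made so that the example runs offline: `BaseProviderAdapterSketch` stands in for the real BaseProviderAdapter, the unified types are reduced to a few fields, the provider's `{ prompt }` / `{ text }` wire format and model names are invented, and an injected `send` function replaces the real HTTP call. Registration in factory.ts is not shown because its API is not described here.

```typescript
// Simplified, assumed stand-ins for the gateway's real types.
interface UnifiedRequest {
  model: string;
  messages: { role: string; content: string }[];
}
interface UnifiedResponse { content: string; }
interface UnifiedStreamChunk { delta: string; done: boolean; }

// Stand-in for BaseProviderAdapter: only the abstract surface listed above.
abstract class BaseProviderAdapterSketch {
  abstract readonly providerType: string;
  abstract readonly supportedCapabilities: readonly string[];
  abstract toProviderRequest(req: UnifiedRequest): unknown;
  abstract fromProviderResponse(raw: unknown): UnifiedResponse;
  abstract execute(req: UnifiedRequest): Promise<UnifiedResponse>;
  abstract executeStream(req: UnifiedRequest): AsyncGenerator<UnifiedStreamChunk>;
  abstract fromProviderStreamChunk(raw: string): UnifiedStreamChunk;
  abstract healthCheck(): Promise<boolean>;
  abstract listModels(): Promise<string[]>;
}

// Injected transport so the sketch needs no network access.
type Transport = (body: unknown) => Promise<{ text: string }>;

// An imaginary provider whose API accepts { model, prompt } and returns { text }.
class MyProviderAdapter extends BaseProviderAdapterSketch {
  readonly providerType = "my-provider";
  readonly supportedCapabilities = ["chat", "streaming"] as const;
  private readonly send: Transport;

  constructor(send: Transport) {
    super();
    this.send = send;
  }

  toProviderRequest(req: UnifiedRequest): { model: string; prompt: string } {
    return { model: req.model, prompt: req.messages.map((m) => m.content).join("\n") };
  }

  fromProviderResponse(raw: unknown): UnifiedResponse {
    return { content: (raw as { text: string }).text };
  }

  async execute(req: UnifiedRequest): Promise<UnifiedResponse> {
    return this.fromProviderResponse(await this.send(this.toProviderRequest(req)));
  }

  async *executeStream(req: UnifiedRequest): AsyncGenerator<UnifiedStreamChunk> {
    // A real adapter would pass parseSSEStream() output through
    // fromProviderStreamChunk(); here the full reply is split into words instead.
    const { content } = await this.execute(req);
    for (const word of content.split(" ")) yield this.fromProviderStreamChunk(word);
    yield { delta: "", done: true };
  }

  fromProviderStreamChunk(raw: string): UnifiedStreamChunk {
    return { delta: raw, done: false };
  }

  async healthCheck(): Promise<boolean> {
    // Listing models doubles as a lightweight connectivity probe.
    try {
      return (await this.listModels()).length > 0;
    } catch {
      return false;
    }
  }

  async listModels(): Promise<string[]> {
    return ["my-model-small", "my-model-large"]; // hard-coded for the sketch
  }
}
```

Injecting the transport also makes the adapter straightforward to cover in unit tests, since the provider can be replaced with a stub that returns canned responses.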
Security
All adapters inherit these security measures from BaseProviderAdapter:
- SSRF prevention: validateUrl() blocks requests to private IP ranges and internal hostnames (with an allowlist for admin-configured local providers)
- API key sanitization: Error messages are scrubbed to remove API keys before logging or returning to clients
- Request timeout: All HTTP requests have configurable timeouts with abort controllers
- Retry logic: The gateway service wraps adapter calls with exponential backoff retry on transient errors (connection failures, 429, 5xx)
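Three of these measures can be sketched in a few lines each. The code below is illustrative only and makes several assumptions: the signatures, the allowlist parameter, the hostname rules, the redaction marker, and the retry defaults are all hypothetical, and the URL check inspects only the literal hostname. A production guard would also need to resolve DNS and re-check the resolved address.

```typescript
// Literal-host patterns for loopback, link-local, and RFC 1918 private IPv4 ranges.
const PRIVATE_IPV4: RegExp[] = [
  /^0\./, /^10\./, /^127\./, /^169\.254\./, /^192\.168\./, /^172\.(1[6-9]|2\d|3[01])\./,
];

// Throws if the URL targets a non-HTTP scheme or an internal host.
function validateUrl(raw: string, allowlist: ReadonlySet<string> = new Set()): void {
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Blocked protocol: ${url.protocol}`);
  }
  const host = url.hostname.toLowerCase();
  if (allowlist.has(host)) return; // admin-configured local provider
  const internal =
    host === "localhost" ||
    host.endsWith(".localhost") ||
    host.endsWith(".internal") ||
    host === "[::1]" ||
    PRIVATE_IPV4.some((pattern) => pattern.test(host));
  if (internal) throw new Error(`Blocked internal host: ${host}`);
}

// Removes the configured key, then masks any other bearer token in the message.
function sanitizeProviderError(message: string, apiKey: string): string {
  const withoutKey = apiKey ? message.split(apiKey).join("[REDACTED]") : message;
  return withoutKey.replace(/Bearer\s+[A-Za-z0-9._~+\/=-]+/g, "Bearer [REDACTED]");
}

// Transient: no HTTP status (treated as a connection failure), 429, or any 5xx.
function isTransient(err: unknown): boolean {
  const status = (err as { status?: number } | null)?.status;
  if (typeof status !== "number") return true;
  return status === 429 || status >= 500;
}

// Retries fn on transient errors, doubling the delay after each failed attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts || !isTransient(err)) throw err;
      const delayMs = baseDelayMs * 2 ** (attempt - 1); // 200, 400, 800, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Adding random jitter to the delay is a common refinement when many clients retry against the same provider at once; it is omitted here to keep the backoff arithmetic easy to follow.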
Next Steps
- Architecture Overview — See how adapters fit in the full request lifecycle
- Routing Strategies — How the gateway selects which adapter to use