Create Routing Rules
Routing rules determine which providers handle incoming requests and in what order. This tutorial walks you through creating a routing configuration using the priority strategy, then testing it with an API call.
Prerequisites
- At least one provider configured and healthy (see Set Up Your First Provider)
- Admin dashboard access
Step 1 — Navigate to the Routing Page
- In the sidebar, click Routing.
- The routing list shows all routing configurations for your tenant as a flat list of cards.
- Click Create Route.
Step 2 — Set Basic Configuration
- Name — Enter a descriptive name, for example Chat Primary Route.
- Slug (optional) — A lowercase-hyphenated identifier. If set, clients can target this config directly with the model string routing:<slug>.
- Capabilities — Select one or more request types this config handles (multi-select). Choose chat for chat completion requests.
- Strategy — Select priority to start. This strategy orders routes by their priority number. See Routing Strategies for all ten options.
- Enabled — Toggle on.
(A config can also be marked as the tenant default via the API field isDefault; there is no toggle for it in the create form.)
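As a rough illustration of the slug rule and the routing:<slug> model string, the sketch below validates a slug and builds the string a client would send. The regular expression is only one reasonable reading of "lowercase-hyphenated", and the dictionary keys other than isDefault are assumed names that mirror the form fields, not a confirmed API schema.

```python
import re

# Assumed slug shape: lowercase letters/digits in hyphen-separated groups.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Return True if the slug is lowercase-hyphenated under the assumed rule."""
    return SLUG_PATTERN.fullmatch(slug) is not None


def routing_model_string(slug: str) -> str:
    """Build the model string that targets a routing config by slug."""
    if not is_valid_slug(slug):
        raise ValueError(f"not a lowercase-hyphenated slug: {slug!r}")
    return f"routing:{slug}"


# Hypothetical config mirroring the form fields in this step.
config = {
    "name": "Chat Primary Route",
    "slug": "chat-primary",
    "capabilities": ["chat"],
    "strategy": "priority",
    "enabled": True,
    "isDefault": False,  # settable via the API only, per the note above
}

print(routing_model_string(config["slug"]))  # routing:chat-primary
```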
Step 3 — Add Provider Entries
Each route entry maps a provider to this routing configuration.
- Click Add Route.
- Provider — Select your provider from the dropdown (e.g., openai-prod).
- Model ID — Enter the default model for this route, for example gpt-4o.
- Priority — Enter 1 (lower number = higher priority). This provider will be tried first.
- Weight — Enter 1. Weight is used by the weighted strategy; for the priority strategy it has no effect.
- Enabled — Toggle on.
To add a fallback provider:
- Click Add Route again.
- Select a second provider (e.g., anthropic-prod).
- Set Priority to 2. This provider is tried only if the first fails.
- Set Model ID to the equivalent model on this provider, for example claude-sonnet-4-20250514.
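The ordering and fall-through behavior described in this step can be sketched as follows. This is a minimal model, not the gateway's implementation: RouteEntry, order_by_priority, dispatch, and flaky_call are hypothetical names, and the simulated outage stands in for a real provider failure.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    provider: str
    model_id: str
    priority: int
    weight: int = 1      # ignored by the priority strategy
    enabled: bool = True


def order_by_priority(routes):
    """Enabled routes only, lowest priority number first."""
    return sorted((r for r in routes if r.enabled), key=lambda r: r.priority)


def dispatch(routes, call):
    """Try each route in priority order; return the first success."""
    errors = []
    for route in order_by_priority(routes):
        try:
            return route, call(route)
        except Exception as exc:  # any failure falls through to the next route
            errors.append(f"{route.provider}: {exc}")
    raise RuntimeError("all routes failed: " + "; ".join(errors))


routes = [
    RouteEntry("anthropic-prod", "claude-sonnet-4-20250514", priority=2),
    RouteEntry("openai-prod", "gpt-4o", priority=1),
]


def flaky_call(route):
    """Stand-in for a provider call; the primary is simulated as down."""
    if route.provider == "openai-prod":
        raise ConnectionError("simulated outage")
    return f"served by {route.model_id}"


served_by, result = dispatch(routes, flaky_call)
print(served_by.provider, result)
# anthropic-prod served by claude-sonnet-4-20250514
```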
Step 4 — Configure the Fallback Chain (Optional)
The fallback chain provides a last-resort option after all route entries have been exhausted.
- Scroll to the Fallback Chain section.
- Add a provider entry with a model ID. This provider is appended after all strategy-ordered routes.
- You can also enable a Local Fallback (e.g., an Ollama instance) for complete offline resilience.
The routing service detects circular fallback chains and breaks them automatically.
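The text does not say how a circular chain arises or how it is broken. One plausible reading, sketched below, is that each fallback may name a further fallback, and the walk stops as soon as it would revisit an entry. The function names, the next_fallback mapping, and the provider names azure-backup and ollama-local are all hypothetical.

```python
def resolve_fallbacks(start, next_fallback):
    """Walk fallback links from `start`, stopping at the first repeat."""
    chain, visited = [], set()
    current = start
    while current is not None and current not in visited:
        visited.add(current)
        chain.append(current)
        current = next_fallback.get(current)  # None ends the chain
    return chain


def build_attempt_order(ordered_routes, fallback_start, next_fallback):
    """Strategy-ordered routes first, then fallbacks not already listed."""
    attempts = list(ordered_routes)
    for name in resolve_fallbacks(fallback_start, next_fallback):
        if name not in attempts:
            attempts.append(name)
    return attempts


# A deliberately circular chain: each fallback points at the other.
links = {"azure-backup": "ollama-local", "ollama-local": "azure-backup"}
order = build_attempt_order(
    ["openai-prod", "anthropic-prod"], "azure-backup", links
)
print(order)
# ['openai-prod', 'anthropic-prod', 'azure-backup', 'ollama-local']
```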
Step 5 — Ensure Models Are Registered
Model filtering happens automatically — there is no separate UI step. When a client requests a specific model (via the model field), the routing service only considers routes whose provider has a matching, enabled ModelConfig. Providers without the requested model are skipped.
To ensure correct filtering:
- Verify each provider in your routes has the relevant models registered in the Models tab.
- Models must be marked as enabled to be eligible.
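To make the filtering rule concrete, here is a small sketch of how routes might be narrowed to providers with a matching, enabled model. The ModelConfig fields shown, the eligible_routes helper, and the azure-prod provider are assumptions for illustration; the real ModelConfig document may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    provider: str
    model_id: str
    enabled: bool = True


def eligible_routes(route_providers, models, requested_model):
    """Keep providers that have the requested model registered and enabled."""
    providers_with_model = {
        m.provider
        for m in models
        if m.enabled and m.model_id == requested_model
    }
    # Preserve the incoming route order; only drop ineligible providers.
    return [p for p in route_providers if p in providers_with_model]


models = [
    ModelConfig("openai-prod", "gpt-4o"),
    ModelConfig("azure-prod", "gpt-4o", enabled=False),  # registered but disabled
    ModelConfig("anthropic-prod", "claude-sonnet-4-20250514"),
]

print(eligible_routes(["openai-prod", "azure-prod", "anthropic-prod"], models, "gpt-4o"))
# ['openai-prod']
```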
Step 6 — Test with an API Call
Send a test request to verify routing works:
    curl -X POST http://localhost:3000/v1/chat/completions \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}]
      }'

The response sets X-Provider and X-Model headers indicating which provider and model actually served the request:

    X-Provider: openai
    X-Model: gpt-4o

If the highest-priority route fails, the gateway transparently falls through to the next route (and then the fallback chain); the X-Provider/X-Model headers reflect whichever route ultimately succeeded.
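The same test can be scripted. The sketch below uses only the Python standard library and reuses the URL, headers, and body from the curl command; build_chat_request and send_and_report are hypothetical helper names, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request


def build_chat_request(api_key, model="gpt-4o", base_url="http://localhost:3000"):
    """Build the same POST request as the curl command in this step."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": "Hello"}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_and_report(request):
    """Send the request and return the (X-Provider, X-Model) header values."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.headers.get("X-Provider"), response.headers.get("X-Model")


# Building the request needs no network; sending it needs a running gateway.
request = build_chat_request("YOUR_API_KEY")
print(request.get_method(), request.full_url)
# POST http://localhost:3000/v1/chat/completions
```

With a gateway running, `send_and_report(request)` returns the two header values. If a slug was set in Step 2, passing `model="routing:<slug>"` with the real slug substituted should target that config directly.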
How the Routing Service Works
- Load config — Finds the routing config matching the tenant, capability, and enabled state.
- Load providers — Fetches all ProviderConfig documents referenced in the routes.
- Filter — Removes providers that are disabled or not in the tenant’s allowed-providers list. Unhealthy providers are not removed — they are demoted to a last-resort pool.
- Apply strategy — Orders the remaining candidates using the selected strategy.
- Append fallbacks — Adds fallback chain entries after the strategy-ordered list.
- Cache — Results are cached in memory for 60 seconds, with a jittered TTL so that entries do not all expire at once and cause a thundering herd.
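The filter, ordering, and caching steps above can be sketched together. Several details here are assumptions rather than documented behavior: the Candidate shape, placing demoted (unhealthy) providers after healthy ones but before the fallback chain, and a jitter of plus or minus 10 percent around the 60-second TTL.

```python
import random
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    provider: str
    priority: int
    enabled: bool = True
    healthy: bool = True


def plan_candidates(candidates, allowed_providers, fallbacks):
    """Filter, demote unhealthy providers, order by priority, append fallbacks."""
    usable = [c for c in candidates if c.enabled and c.provider in allowed_providers]
    healthy = sorted((c for c in usable if c.healthy), key=lambda c: c.priority)
    # Assumption: the last-resort pool sits after healthy routes, before fallbacks.
    demoted = sorted((c for c in usable if not c.healthy), key=lambda c: c.priority)
    return [c.provider for c in healthy + demoted] + list(fallbacks)


class JitteredTTLCache:
    """In-memory cache whose entries expire at slightly different times."""

    def __init__(self, base_ttl=60.0, jitter=0.1, clock=time.monotonic):
        self._base_ttl = base_ttl
        self._jitter = jitter  # assumed fraction of base_ttl
        self._clock = clock
        self._entries = {}

    def set(self, key, value):
        ttl = self._base_ttl * (1 + random.uniform(-self._jitter, self._jitter))
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value


plan = plan_candidates(
    [
        Candidate("openai-prod", 1, healthy=False),
        Candidate("anthropic-prod", 2),
        Candidate("mistral-prod", 3, enabled=False),
    ],
    allowed_providers={"openai-prod", "anthropic-prod", "mistral-prod"},
    fallbacks=["ollama-local"],
)
print(plan)
# ['anthropic-prod', 'openai-prod', 'ollama-local']
```

Spreading expiry times means that many cached plans written at the same moment are recomputed at slightly different times rather than all at once.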
Next Steps
- Explore the Routing Strategies architecture doc for details on all ten strategies
- Configure Budget Management to control costs across providers