Routing Configuration
Routing configs determine how the gateway distributes incoming requests across providers and models. Each config targets one or more capabilities and applies a strategy to select a provider/model route for each request.
Route List
The Routing page displays each routing config as a card. Each card shows:
- Name
- Strategy — the algorithm used to select a route
- Capabilities — which request types this config handles
- Routes count — how many provider/model routes are assigned
- Status — enabled or disabled
- routing:<slug> hint — if a slug is set, the address clients can use to invoke this config explicitly
- Provider weights (for the weighted strategy)
- Fallback chain (if configured)
Creating a Routing Config
Click Create Route to open the form with these fields:
| Field | Description |
|---|---|
| Name | A descriptive name for the config. |
| Slug | Optional lowercase-hyphenated identifier. When set, clients can target this config directly with the model string routing:<slug> (see below). |
| Capabilities | One or more request types this config applies to (multi-select, at least one). |
| Strategy | The routing algorithm (see below). |
| Routes | One or more provider/model entries. Each has a provider config, a model, a priority (lower runs first), a weight (for weighted), and an enabled flag. At least one route is required. |
| Fallback Chain | Optional ordered list of provider/model entries to try if the selected route fails. |
| Local Fallback | Optional last-resort provider/model (e.g. a self-hosted model) used when every other route is exhausted. |
| Retry Policy | Optional max retries, initial/max delay, and backoff multiplier. |
| Enabled | Toggle to activate or deactivate the config. |
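The Retry Policy fields can be read as a standard exponential backoff schedule. The sketch below is one plausible interpretation, not the gateway's documented formula: the function name, the parameter names, and the choice of seconds as the unit are all assumptions made for illustration.

```python
def retry_delays(max_retries: int, initial_delay: float,
                 max_delay: float, multiplier: float) -> list[float]:
    """Return the assumed wait (in seconds) before each retry attempt.

    Hypothetical helper: each delay is the previous one times `multiplier`,
    capped at `max_delay`.
    """
    delays = []
    delay = initial_delay
    for _ in range(max_retries):
        delays.append(min(delay, max_delay))
        delay *= multiplier
    return delays


print(retry_delays(max_retries=4, initial_delay=0.5, max_delay=3.0, multiplier=2.0))
# [0.5, 1.0, 2.0, 3.0]
```

With these example values the fourth delay would be 4.0 seconds uncapped, so the max delay of 3.0 takes effect on the last retry.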
Routing Strategies
| Strategy | Value | Behavior |
|---|---|---|
| Priority | priority | Routes by the priority number on each route (lowest first). |
| Round Robin | round-robin | Distributes requests evenly across routes in rotation (Redis-backed counter). |
| Weighted | weighted | Distributes requests proportionally based on each route’s weight. |
| Least Cost | least-cost | Selects the route with the lowest per-token cost for the requested model. |
| Least Latency | least-latency | Selects the route with the lowest recent average latency. |
| Free Tier First | free-tier-first | Prefers routes with remaining free-tier quota before falling back to paid. |
| Task Optimized | task-optimized | Uses the model-intelligence service to pick the best model for the prompt/task. |
| Cost Optimized | cost-optimized | Alias of Least Cost — selects the cheapest route by per-token pricing. |
| Failover | failover | Prefers healthy routes; demotes degraded/unhealthy ones to last resort. |
| Random | random | Selects a route at random from the available pool. |
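To make two of the simpler strategies concrete, here is a minimal sketch of priority and round-robin selection. The `Route` class, the function names, and the rule that disabled routes are skipped are assumptions for illustration. The counter here lives in memory, whereas the table above describes the gateway's counter as Redis-backed.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Route:
    """Hypothetical stand-in for one provider/model route."""
    provider: str
    model: str
    priority: int = 0
    enabled: bool = True


def pick_priority(routes):
    """Priority strategy: the enabled route with the lowest priority number wins."""
    candidates = [r for r in routes if r.enabled]
    if not candidates:
        raise LookupError("no enabled routes")
    return min(candidates, key=lambda r: r.priority)


class RoundRobin:
    """Round-robin strategy using an in-memory counter (assumption: not Redis)."""

    def __init__(self):
        self._counter = count()

    def pick(self, routes):
        candidates = [r for r in routes if r.enabled]
        if not candidates:
            raise LookupError("no enabled routes")
        # Advance the counter once per request and wrap around the pool.
        return candidates[next(self._counter) % len(candidates)]


routes = [
    Route("provider-a", "model-x", priority=2),
    Route("provider-b", "model-y", priority=1),
    Route("provider-c", "model-z", priority=0, enabled=False),
]
print(pick_priority(routes).provider)  # provider-b
rr = RoundRobin()
print([rr.pick(routes).provider for _ in range(3)])
# ['provider-a', 'provider-b', 'provider-a']
```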
Capabilities
A routing config applies to one or more capabilities (it is not limited to a single capability). The available values are:
- chat — Chat completions
- completions — Text completions
- embeddings — Vector embeddings
- audio — Audio transcription and translation
- images — Image generation
- tts — Speech synthesis
- rerank — Reranking
- video-generation — Video generation
You can create multiple configs covering the same capability; the gateway resolves the applicable config for the request.
Explicit routing with routing:<slug>
If a config has a slug, clients can invoke it directly by setting the request model to routing:<slug> (for example, routing:cheap-chat). The gateway then resolves providers/models from that config’s routes instead of looking up a config by model name. The config’s capabilities must include the capability of the endpoint being called, or the request is rejected with GATEWAY_ROUTING_CONFIG_MISMATCH.
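The resolution rule can be sketched as a small lookup. This is an illustration under stated assumptions rather than the gateway's implementation: the function name, the dict-based config shape, and the `LookupError` raised for an unknown slug are hypothetical (this page does not say how an unknown slug is handled). Only the `routing:` prefix and the `GATEWAY_ROUTING_CONFIG_MISMATCH` code come from the text above.

```python
ROUTING_PREFIX = "routing:"


class RoutingConfigMismatch(Exception):
    """Raised when the config does not cover the endpoint's capability."""
    code = "GATEWAY_ROUTING_CONFIG_MISMATCH"


def resolve_explicit(model: str, capability: str, configs_by_slug: dict):
    """Return the config named by routing:<slug>, or None for ordinary model names."""
    if not model.startswith(ROUTING_PREFIX):
        return None  # not an explicit routing request
    slug = model[len(ROUTING_PREFIX):]
    config = configs_by_slug.get(slug)
    if config is None:
        # Assumption: behavior for an unknown slug is not documented here.
        raise LookupError(f"no routing config with slug {slug!r}")
    if capability not in config["capabilities"]:
        raise RoutingConfigMismatch(
            f"{RoutingConfigMismatch.code}: {slug!r} does not handle {capability!r}"
        )
    return config


configs = {"cheap-chat": {"capabilities": ["chat"], "routes": []}}
print(resolve_explicit("routing:cheap-chat", "chat", configs) is configs["cheap-chat"])
# True
```

Calling `resolve_explicit("routing:cheap-chat", "embeddings", configs)` raises `RoutingConfigMismatch`, mirroring the rejection described above.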
Fallback Chains
Every config supports an optional fallback chain — an ordered list of provider + model entries (added via Add Fallback rows in the form, not a comma-separated string). When the selected route returns an error or is unhealthy, the gateway tries each fallback entry in order. Circular references are detected and broken.
Fallback chains work with any strategy. For example, a least-latency config can still fall back to a manually specified chain if all preferred routes are down.
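The fallback order described above can be sketched as a simple loop. Everything named here is hypothetical: `call_with_fallback`, the `send` callback, and the `(provider, model)` tuples are illustrative. Skipping entries that were already tried is one assumed way to break circular references, and the sketch treats only raised errors as failures; it does not model health checks.

```python
from typing import Callable, Optional, Sequence, Tuple

Entry = Tuple[str, str]  # (provider, model)


def call_with_fallback(selected: Entry,
                       fallback_chain: Sequence[Entry],
                       send: Callable[[Entry], str],
                       local_fallback: Optional[Entry] = None) -> str:
    """Try the selected route, then each fallback in order, then the local fallback."""
    order = [selected, *fallback_chain]
    if local_fallback is not None:
        order.append(local_fallback)
    tried = set()
    last_error = None
    for entry in order:
        if entry in tried:
            continue  # skip repeats so a circular chain cannot loop
        tried.add(entry)
        try:
            return send(entry)
        except Exception as exc:  # any provider error moves on to the next entry
            last_error = exc
    raise RuntimeError("all routes exhausted") from last_error


def fake_send(entry: Entry) -> str:
    """Stand-in for a provider call: provider-a always fails."""
    if entry[0] == "provider-a":
        raise ConnectionError("provider-a is down")
    return f"served by {entry[0]}/{entry[1]}"


print(call_with_fallback(
    ("provider-a", "model-x"),
    [("provider-a", "model-x"), ("provider-b", "model-y")],
    fake_send,
))
# served by provider-b/model-y
```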
Weighted Distribution
When using the weighted strategy, each route carries a numeric weight and the gateway distributes traffic proportionally. For example, weights of 3 and 1 on two routes send roughly 75% and 25% of traffic respectively.
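The 75%/25% split follows from dividing each weight by the total: 3 / (3 + 1) = 0.75 and 1 / (3 + 1) = 0.25. A minimal sketch of weighted selection, using Python's standard `random.choices`, is shown below. The function names and the `(name, weight)` pairs are illustrative assumptions, and the gateway may implement the strategy differently.

```python
import random


def expected_share(routes):
    """Map each route name to weight / total weight."""
    total = sum(weight for _, weight in routes)
    return {name: weight / total for name, weight in routes}


def pick_weighted(routes, rng=random):
    """Pick one route name with probability proportional to its weight."""
    positive = [(name, weight) for name, weight in routes if weight > 0]
    if not positive:
        raise LookupError("no routes with a positive weight")
    names = [name for name, _ in positive]
    weights = [weight for _, weight in positive]
    return rng.choices(names, weights=weights, k=1)[0]


routes = [("route-a", 3), ("route-b", 1)]
print(expected_share(routes))
# {'route-a': 0.75, 'route-b': 0.25}

# Over many requests the observed share approaches the expected share.
rng = random.Random(0)
picks = [pick_weighted(routes, rng) for _ in range(10_000)]
print(picks.count("route-a") / len(picks))
```

Because selection is random per request, short runs can deviate noticeably from the configured ratio; the proportions only hold on average.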
Editing and Deleting Configs
- Click Edit on any card to modify its settings.
- Click Delete to remove a config. A confirmation dialog warns that traffic using it will fall back to default routing.
- Use the Enabled toggle to temporarily disable a config without deleting it.
Default Routing Strategy
A system-wide default routing strategy is configured on the Settings page and applies when no specific routing config matches a request. Routing configs override the default for the capabilities they cover.