Routing Configuration

Routing configs determine how the gateway distributes incoming requests across providers and models. Each config targets one or more capabilities and applies a strategy to select a provider/model route for each request.

Route List

The Routing page displays every routing config as a card. Each card shows:

  • Name
  • Strategy — the algorithm used to select a route
  • Capabilities — which request types this config handles
  • Routes count — how many provider/model routes are assigned
  • Status — enabled or disabled
  • routing:<slug> hint — if a slug is set, the address clients can use to invoke this config explicitly
  • Provider weights (for the weighted strategy)
  • Fallback chain (if configured)

Creating a Routing Config

Click Create Route to open the form with these fields:

Field | Description
----- | -----------
Name | A descriptive name for the config.
Slug | Optional lowercase-hyphenated identifier. When set, clients can target this config directly with the model string routing:<slug> (see Explicit routing with routing:<slug>).
Capabilities | One or more request types this config applies to (multi-select, at least one).
Strategy | The routing algorithm (see Routing Strategies).
Routes | One or more provider/model entries. Each has a provider config, a model, a priority (lower runs first), a weight (for weighted), and an enabled flag. At least one route is required.
Fallback Chain | Optional ordered list of provider/model entries to try if the selected route fails.
Local Fallback | Optional last-resort provider/model (e.g. a self-hosted model) used when every other route is exhausted.
Retry Policy | Optional max retries, initial/max delay, and backoff multiplier.
Enabled | Toggle to activate or deactivate the config.
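
The form's constraints can be sketched as a small data model with a validation step. This is an illustration only: the class names, field names, and the `validate` helper are hypothetical and are not the gateway's actual schema, and the slug pattern (lowercase letters and digits separated by single hyphens) is an assumption about what "lowercase-hyphenated" allows.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed slug shape: lowercase alphanumeric groups joined by single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


@dataclass
class Route:
    provider: str          # provider config to call
    model: str             # model served by that provider
    priority: int = 0      # lower runs first
    weight: float = 1.0    # only meaningful for the weighted strategy
    enabled: bool = True


@dataclass
class RoutingConfig:
    name: str
    capabilities: List[str]
    strategy: str
    routes: List[Route]
    slug: Optional[str] = None
    enabled: bool = True


def validate(config: RoutingConfig) -> List[str]:
    """Return a list of human-readable problems; empty means the config is valid."""
    errors = []
    if not config.capabilities:
        errors.append("at least one capability is required")
    if not config.routes:
        errors.append("at least one route is required")
    if config.slug is not None and not SLUG_RE.fullmatch(config.slug):
        errors.append("slug must be lowercase and hyphen-separated")
    return errors
```

Returning a list of problems rather than raising on the first one mirrors how a form would typically report every invalid field at once.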

Routing Strategies

Strategy | Value | Behavior
-------- | ----- | --------
Priority | priority | Routes by the priority number on each route (lowest first).
Round Robin | round-robin | Distributes requests evenly across routes in rotation (Redis-backed counter).
Weighted | weighted | Distributes requests proportionally based on each route's weight.
Least Cost | least-cost | Selects the route with the lowest per-token cost for the requested model.
Least Latency | least-latency | Selects the route with the lowest recent average latency.
Free Tier First | free-tier-first | Prefers routes with remaining free-tier quota before falling back to paid.
Task Optimized | task-optimized | Uses the model-intelligence service to pick the best model for the prompt/task.
Cost Optimized | cost-optimized | Alias of Least Cost: selects the cheapest route by per-token pricing.
Failover | failover | Prefers healthy routes; demotes degraded/unhealthy ones to last resort.
Random | random | Selects a route at random from the available pool.
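
To make three of these behaviors concrete, here is a minimal sketch of the priority, round-robin, and failover strategies. The `Route` class and the function names are hypothetical, and an in-process counter stands in for the Redis-backed one mentioned above; the gateway's real implementation may differ.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Route:
    provider: str
    model: str
    priority: int = 0
    healthy: bool = True


def pick_priority(routes):
    """priority: the route with the lowest priority number wins."""
    return min(routes, key=lambda r: r.priority)


def make_round_robin():
    """round-robin: rotate through the routes using a shared counter.

    An in-process counter is used here purely for illustration.
    """
    counter = count()

    def pick(routes):
        return routes[next(counter) % len(routes)]

    return pick


def order_failover(routes):
    """failover: healthy routes first, unhealthy ones demoted to the end.

    sorted() is stable, so routes keep their relative order within each group.
    """
    return sorted(routes, key=lambda r: not r.healthy)
```

With routes `a` (priority 2), `b` (priority 1, unhealthy), and `c` (priority 3), `pick_priority` returns `b`, successive round-robin picks cycle `a`, `b`, `c`, `a`, and `order_failover` yields `a`, `c`, `b`.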

Capabilities

A routing config applies to one or more capabilities. The available values are:

  • chat — Chat completions
  • completions — Text completions
  • embeddings — Vector embeddings
  • audio — Audio transcription and translation
  • images — Image generation
  • tts — Speech synthesis
  • rerank — Reranking
  • video-generation — Video generation

You can create multiple configs covering the same capability; the gateway resolves the applicable config for the request.

Explicit routing with routing:<slug>

If a config has a slug, clients can invoke it directly by setting the request model to routing:<slug> (for example, routing:cheap-chat). The gateway then resolves providers/models from that config’s routes instead of looking up a config by model name. The config’s capabilities must include the capability of the endpoint being called, or the request is rejected with GATEWAY_ROUTING_CONFIG_MISMATCH.
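
The resolution rule described above can be sketched as follows. Only the `routing:` prefix and the `GATEWAY_ROUTING_CONFIG_MISMATCH` code come from this page; the function name, the exception classes, the dictionary layout, and the handling of an unknown slug are assumptions made for illustration.

```python
ROUTING_PREFIX = "routing:"


class RoutingConfigMismatch(Exception):
    """Raised when the config does not cover the endpoint's capability."""
    code = "GATEWAY_ROUTING_CONFIG_MISMATCH"


def resolve_explicit(model, endpoint_capability, configs_by_slug):
    """Return the config addressed by a routing:<slug> model string.

    Returns None for an ordinary model name, so the caller can fall back
    to looking up a config by model name.
    """
    if not model.startswith(ROUTING_PREFIX):
        return None
    slug = model[len(ROUTING_PREFIX):]
    config = configs_by_slug.get(slug)
    if config is None:
        # Assumption: this page does not say how an unknown slug is reported.
        raise LookupError(f"no routing config with slug {slug!r}")
    if endpoint_capability not in config["capabilities"]:
        raise RoutingConfigMismatch(
            f"config {slug!r} does not handle {endpoint_capability!r}"
        )
    return config
```

For example, with a `cheap-chat` config whose capabilities are `["chat"]`, a chat request for `routing:cheap-chat` resolves to that config, while the same model string sent to an embeddings endpoint raises the mismatch error.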

Fallback Chains

Every config supports an optional fallback chain — an ordered list of provider + model entries (added via Add Fallback rows in the form, not a comma-separated string). When the selected route returns an error or is unhealthy, the gateway tries each fallback entry in order. Circular references are detected and broken.

Fallback chains work with any strategy. For example, a least-latency config can still fall back to a manually specified chain if all preferred routes are down.
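
A rough sketch of that fallback order: try the selected route, then each chain entry in turn, then the optional local fallback. The function name and the `send` callback are hypothetical, entries are modeled as `(provider, model)` tuples, and skipping already-tried entries is just one simple way to keep a repeated or circular entry from being retried; the gateway's own cycle detection may work differently.

```python
def call_with_fallback(selected, fallback_chain, send, local_fallback=None):
    """Try the selected route, then each fallback entry in order.

    `send` is a caller-supplied function that performs the request for one
    (provider, model) entry and raises an exception on failure.
    """
    candidates = [selected, *fallback_chain]
    if local_fallback is not None:
        candidates.append(local_fallback)  # last resort, tried only at the end

    seen = set()
    last_error = None
    for entry in candidates:
        if entry in seen:
            continue  # already attempted; skipping avoids looping on a cycle
        seen.add(entry)
        try:
            return send(entry)
        except Exception as exc:
            last_error = exc  # remember the failure and move to the next entry
    raise RuntimeError("all routes exhausted") from last_error
```

Because the chain is applied after route selection, the same wrapper works regardless of which strategy produced `selected`.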

Weighted Distribution

When using the weighted strategy, each route carries a numeric weight and the gateway distributes traffic proportionally. For example, weights of 3 and 1 on two routes send roughly 75% and 25% of traffic respectively.
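
The arithmetic behind that example is each weight divided by the sum of all weights. The sketch below computes the expected shares and shows one possible selection method, weighted random sampling. The function names are hypothetical, and this page does not state whether the gateway samples randomly or uses a deterministic rotation.

```python
import random


def expected_shares(weights):
    """Map each route name to its expected fraction of traffic."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_weighted(weights, rng=random):
    """Pick one route name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

For weights `{"a": 3, "b": 1}`, `expected_shares` returns `{"a": 0.75, "b": 0.25}`, matching the 75%/25% split described above. Individual picks vary, so observed traffic only approaches these shares over many requests.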

Editing and Deleting Configs

  • Click Edit on any card to modify its settings.
  • Click Delete to remove a config. A confirmation dialog warns that traffic using it will fall back to default routing.
  • Use the Enabled toggle to temporarily disable a config without deleting it.

Default Routing Strategy

A system-wide default routing strategy is configured on the Settings page and applies when no specific routing config matches a request. Routing configs override the default for the capabilities they cover.