Create Routing Rules

Routing rules determine which providers handle incoming requests and in what order. This tutorial walks you through creating a routing configuration using the priority strategy, then testing it with an API call.

Prerequisites

At least one provider configured and healthy (see Set Up Your First Provider)
Admin dashboard access

Step 1 — Navigate to the Routing Page

In the sidebar, click Routing.
The routing list shows all routing configurations for your tenant as a flat list of cards.
Click Create Route.

Step 2 — Set Basic Configuration

Name — Enter a descriptive name, for example Chat Primary Route.
Slug (optional) — A lowercase-hyphenated identifier. If set, clients can target this config directly with the model string routing:<slug>.
Capabilities — Select one or more request types this config handles (multi-select). Choose chat for chat completion requests.
Strategy — Select priority to start. This strategy tries routes in ascending order of their priority number (lowest number first). See Routing Strategies for all ten options.
Enabled — Toggle on.

(A config can also be marked as the tenant default via the API field isDefault; there is no toggle for it in the create form.)
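
As a minimal sketch of the slug convention above, the Python below checks that a slug is lowercase-hyphenated and builds the routing:<slug> model string. The validation rule the gateway actually applies is not stated in this tutorial, so the regular expression and both function names are assumptions made for illustration.

```python
import re

# Assumed pattern: groups of lowercase letters or digits joined by single hyphens.
# The gateway's real validation rule may be stricter or looser.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Return True if slug looks like a lowercase-hyphenated identifier."""
    return SLUG_PATTERN.fullmatch(slug) is not None


def model_string_for(slug: str) -> str:
    """Build the model string a client sends to target a routing config by slug."""
    if not is_valid_slug(slug):
        raise ValueError(f"not a lowercase-hyphenated slug: {slug!r}")
    return f"routing:{slug}"
```

For a config with the hypothetical slug chat-primary, model_string_for("chat-primary") yields the string a client would place in the model field.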

Step 3 — Add Provider Entries

Each route entry maps a provider to this routing configuration.

Click Add Route.
Provider — Select your provider from the dropdown (e.g., openai-prod).
Model ID — Enter the default model for this route, for example gpt-4o.
Priority — Enter 1 (lower number = higher priority). This provider will be tried first.
Weight — Enter 1. Weight is used by the weighted strategy; it has no effect under the priority strategy.
Enabled — Toggle on.

To add a fallback provider:

Click Add Route again.
Select a second provider (e.g., anthropic-prod).
Set Priority to 2. This provider is tried only if the first fails.
Set Model ID to the equivalent model on this provider, for example claude-sonnet-4-20250514.
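
The effect of the two entries can be sketched as a sort on the priority number. The RouteEntry class and its field names mirror the form labels in this step and are assumptions, not the gateway's actual schema.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    # Field names follow the form labels in Step 3; the stored schema may differ.
    provider: str
    model_id: str
    priority: int
    weight: int = 1
    enabled: bool = True


def order_by_priority(routes):
    """Drop disabled entries and sort the rest by ascending priority number."""
    return sorted((r for r in routes if r.enabled), key=lambda r: r.priority)


routes = [
    RouteEntry("anthropic-prod", "claude-sonnet-4-20250514", priority=2),
    RouteEntry("openai-prod", "gpt-4o", priority=1),
]

# Providers in the order they would be attempted under this sketch.
attempt_order = [r.provider for r in order_by_priority(routes)]
```

Weight is carried on each entry but never consulted here, which matches its having no effect under the priority strategy.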

Step 4 — Configure the Fallback Chain (Optional)

The fallback chain provides a last-resort option after all route entries have been exhausted.

Scroll to the Fallback Chain section.
Add a provider entry with a model ID. This provider is appended after all strategy-ordered routes.
You can also enable a Local Fallback (e.g., an Ollama instance) so that requests can still be served when no remote provider is reachable.

The routing service detects circular fallback chains and breaks them automatically.
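
How the service detects cycles is not described here. One common approach, shown below as a sketch, is to walk the chain while remembering what has already been visited and stop at the first repeat. The next_fallback mapping, where each entry names the entry it falls back to, is a hypothetical data model chosen only to make a cycle possible.

```python
def resolve_fallback_chain(start, next_fallback):
    """Follow fallback links from start, stopping at the first repeated entry."""
    chain = []
    seen = set()
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        # A missing key means the chain ends here.
        current = next_fallback.get(current)
    return chain


def append_fallbacks(ordered_routes, fallback_chain):
    """Place fallback entries after the strategy-ordered routes."""
    return list(ordered_routes) + list(fallback_chain)


# Hypothetical circular chain: a falls back to b, b to c, and c back to a.
links = {"a": "b", "b": "c", "c": "a"}
chain = resolve_fallback_chain("a", links)
```

With the circular links above, the walk visits a, b, and c once each and then stops instead of looping.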

Step 5 — Ensure Models Are Registered

Model filtering happens automatically — there is no separate UI step. When a client requests a specific model (via the model field), the routing service only considers routes whose provider has a matching, enabled ModelConfig. Providers without the requested model are skipped.

To ensure correct filtering:

Verify each provider in your routes has the relevant models registered in the Models tab.
Models must be marked as enabled to be eligible.
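
The filtering rule can be sketched as a lookup against the registered models. Only the name ModelConfig and the enabled flag come from this tutorial; the other fields, the function names, and the azure-prod provider are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Hypothetical shape: which provider registers which model, and whether it is enabled.
    provider: str
    model_id: str
    enabled: bool = True


def providers_serving(model, registry):
    """Providers that have a matching, enabled ModelConfig for the requested model."""
    return {m.provider for m in registry if m.model_id == model and m.enabled}


def filter_routes(route_providers, model, registry):
    """Keep only route providers that can serve the model, preserving route order."""
    eligible = providers_serving(model, registry)
    return [p for p in route_providers if p in eligible]


registry = [
    ModelConfig("openai-prod", "gpt-4o"),
    ModelConfig("anthropic-prod", "claude-sonnet-4-20250514"),
    ModelConfig("azure-prod", "gpt-4o", enabled=False),  # registered but disabled
]
eligible_routes = filter_routes(
    ["openai-prod", "anthropic-prod", "azure-prod"], "gpt-4o", registry
)
```

In this sketch a request for gpt-4o skips anthropic-prod (no matching model) and azure-prod (model registered but disabled).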

Step 6 — Test with an API Call

Send a test request to verify routing works:

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

The response sets X-Provider and X-Model headers indicating which provider and model actually served the request:

X-Provider: openai
X-Model: gpt-4o

If the highest-priority route fails, the gateway transparently falls through to the next route (and then the fallback chain); the X-Provider/X-Model headers reflect whichever route ultimately succeeded.
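
The same check can be scripted. The sketch below builds the request from the curl command with Python's standard library and reads the two response headers; the function names are hypothetical, and the URL and API key are the same placeholders used above.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def served_by(headers):
    """Return the (provider, model) pair reported in the response headers."""
    return headers.get("X-Provider"), headers.get("X-Model")


def send(request):
    """Send the request and report which provider and model served it."""
    with urllib.request.urlopen(request) as response:
        return served_by(response.headers)
```

Against a running gateway, send(build_chat_request("http://localhost:3000", "YOUR_API_KEY", "gpt-4o", "Hello")) returns the provider and model that handled the call. If you set a slug in Step 2, passing routing:<slug> as the model targets that config directly.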

How the Routing Service Works

Load config — Finds the routing config matching the tenant, capability, and enabled state.
Load providers — Fetches all ProviderConfig documents referenced in the routes.
Filter — Removes providers that are disabled or not in the tenant’s allowed-providers list. Unhealthy providers are not removed — they are demoted to a last-resort pool.
Apply strategy — Orders the remaining candidates using the selected strategy.
Append fallbacks — Adds fallback chain entries after the strategy-ordered list.
Cache — Results are cached in memory for 60 seconds, with a jittered TTL to prevent a thundering herd of simultaneous cache refreshes.
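
The steps above can be sketched in a few lines, using the priority strategy for the ordering step. This is an illustration under stated assumptions, not the service's implementation: the Candidate fields are hypothetical, the last-resort pool is placed before the fallback chain (the tutorial does not say where it goes), and the 10% jitter range is an arbitrary choice.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical fields standing in for a route joined with its ProviderConfig.
    provider: str
    priority: int
    enabled: bool = True
    healthy: bool = True


def plan(candidates, allowed_providers, fallbacks):
    """Filter, demote unhealthy providers, order by priority, then append fallbacks."""
    usable = [c for c in candidates if c.enabled and c.provider in allowed_providers]
    healthy = sorted((c for c in usable if c.healthy), key=lambda c: c.priority)
    # Unhealthy providers are kept, but only after every healthy one (assumed placement).
    last_resort = sorted((c for c in usable if not c.healthy), key=lambda c: c.priority)
    return [c.provider for c in healthy + last_resort] + list(fallbacks)


def jittered_ttl(base_seconds=60.0, jitter_fraction=0.1, rng=random):
    """Return a TTL spread around base_seconds so entries do not all expire together."""
    spread = base_seconds * jitter_fraction  # assumed jitter range of +/-10%
    return base_seconds + rng.uniform(-spread, spread)
```

Spreading expiry times means cached plans created at the same moment are refreshed at slightly different times rather than all at once.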

Next Steps

Explore the Routing Strategies architecture doc for details on all ten strategies
Configure Budget Management to control costs across providers