
Budgets

Budgets let you set spending and usage limits at multiple levels of your organization hierarchy. Each budget defines per-period limits and an enforcement action: warn (log and surface warnings) or block (reject further requests for that scope).

Budget List

The Budgets page displays a paginated table with these columns:

| Column | Description |
|---|---|
| Scope | The scope type and ID (e.g. “tenant / Acme Corp”). |
| Limits | The configured cost / token / request limits. |
| Current Usage | A progress bar showing consumption against the limit. |
| Action | warn or block: what happens when a limit is reached. |
| Status | Active or Disabled, reflecting the budget’s enabled flag. |

Creating a Budget

Click Create Budget to open the form.

Scope

Budgets can be scoped to any level in the multi-tenancy hierarchy. Select the Scope Type and the specific Scope ID it applies to:

| Scope Type | Value | Description |
|---|---|---|
| Tenant | tenant | Applies to an entire tenant and all its users. |
| Organization | organization | Applies to an organization within a tenant. |
| Department | department | Applies to a department within an organization. |
| User | user | Applies to a single user. |
| API Key | apiKey | Applies to a specific API key. |

Limits

Set one or more of the following limits. All are optional — configure only the ones relevant to your use case. The period each limit covers is encoded in the field name:

| Limit | Description |
|---|---|
| dailyUsd | Maximum dollar spend per day. |
| weeklyUsd | Maximum dollar spend per week. |
| monthlyUsd | Maximum dollar spend per month. |
| dailyTokens | Maximum total tokens (input + output) per day. |
| monthlyTokens | Maximum total tokens per month. |
| dailyRequests | Maximum number of requests per day. |

There is no separate “period” selector — daily/weekly/monthly cadence is implied by the limit you set, and usage counters reset on their natural daily/weekly/monthly boundaries (UTC).
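
As a rough illustration of how the cadence follows from the field name, the sketch below derives the period from a limit's prefix and computes the start of the current period in UTC. The payload shape (`scopeType`, `scopeId`, `limits`, `action`, `enabled`) and the helper names are assumptions made for this example, not a documented schema. The week is assumed to start on Monday, which this page does not specify.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical budget definition; only the limit names and scope values
# come from the tables above. The surrounding shape is an assumption.
budget = {
    "scopeType": "department",
    "scopeId": "dept-123",  # placeholder ID
    "limits": {"dailyUsd": 50, "monthlyTokens": 2_000_000},
    "action": "warn",
    "enabled": True,
}


def period_of(limit_name: str) -> str:
    """Derive the cadence from the limit's field-name prefix."""
    for period in ("daily", "weekly", "monthly"):
        if limit_name.startswith(period):
            return period
    raise ValueError(f"unknown limit: {limit_name}")


def period_start(period: str, now: datetime) -> datetime:
    """Return the start of the current period as a UTC datetime."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        # Assumption: weeks start on Monday (ISO 8601 convention).
        return midnight - timedelta(days=midnight.weekday())
    if period == "monthly":
        return midnight.replace(day=1)
    raise ValueError(f"unknown period: {period}")
```

Usage counted against `dailyUsd` would then be whatever was spent since `period_start("daily", now)`, and likewise for the other limits.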

Alerts

Add one or more alert rows to be notified before a hard limit is hit. Each alert has:

| Field | Description |
|---|---|
| Threshold | Percentage of a limit (1–100) at which the alert fires. |
| Type | warning or critical. |
| Notify Via | One or more channels: email, webhook, slack, in-app. |
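
To make the threshold semantics concrete, here is a minimal sketch of how alert rows might be evaluated against current usage. The key names (`threshold`, `type`, `notifyVia`) and the `fired_alerts` helper are hypothetical; the product's actual evaluation logic is not described on this page.

```python
def fired_alerts(usage: float, limit: float, alerts: list[dict]) -> list[dict]:
    """Return the alerts whose threshold percentage has been reached, lowest first."""
    if limit <= 0:
        return []
    reached = [a for a in alerts if usage * 100 >= a["threshold"] * limit]
    return sorted(reached, key=lambda a: a["threshold"])


# Hypothetical alert rows mirroring the fields in the table above.
alerts = [
    {"threshold": 95, "type": "critical", "notifyVia": ["webhook"]},
    {"threshold": 80, "type": "warning", "notifyVia": ["email", "slack"]},
]

# 85 USD spent against a 100 USD limit reaches only the 80% warning.
for alert in fired_alerts(85, 100, alerts):
    print(alert["type"], alert["notifyVia"])
```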

Action

The Action field controls enforcement when a limit is reached:

  • warn — requests are allowed through; a warning is logged and attached to the response.
  • block — requests for the scope are rejected.

Enabled

The Enabled toggle activates or pauses the budget without deleting it.

Automatic Enforcement

When a budget with action: block reaches a limit, the gateway rejects further requests for that scope with a 429 Too Many Requests carrying the BUDGET_EXCEEDED error code. Budgets with action: warn allow the request and attach warning information to the response instead.
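
On the client side, it can help to tell a budget block apart from other 429 responses. The sketch below assumes the error code is exposed in a JSON body at `error.code`; that body shape and the `classify_response` helper are assumptions for illustration, so check an actual gateway response before relying on them.

```python
def classify_response(status: int, body: dict) -> str:
    """Classify a gateway response as 'ok', 'budget_exceeded', 'rate_limited', or 'error'."""
    if status == 429:
        # Assumption: the error code lives at body["error"]["code"].
        error = body.get("error")
        code = error.get("code") if isinstance(error, dict) else None
        return "budget_exceeded" if code == "BUDGET_EXCEEDED" else "rate_limited"
    return "ok" if 200 <= status < 300 else "error"


print(classify_response(429, {"error": {"code": "BUDGET_EXCEEDED"}}))
```

A quick retry is unlikely to help with a budget block, since the counter stays over the limit until the period resets or an administrator changes the budget, so a client would typically surface the error rather than back off and retry.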

Enforcement is hierarchical — if a tenant budget blocks, all users, departments, and organizations within that tenant are blocked, regardless of their individual budget state. Budget checks fail closed: if the budget service errors, the request is blocked for safety.
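
The hierarchical, fail-closed behavior can be sketched as a walk over the request's scope chain from broadest to narrowest. This is an illustrative model only: `check_request`, the `get_state` callback, the state keys, and the `BUDGET_CHECK_FAILED` reason are hypothetical names, not part of the gateway's API.

```python
from typing import Callable, Optional

Scope = tuple[str, str]  # (scope type, scope ID)


def check_request(
    scope_chain: list[Scope],
    get_state: Callable[[str, str], Optional[dict]],
) -> dict:
    """Check every budget from tenant down to API key; block on the first blocking one."""
    warnings: list[str] = []
    for scope_type, scope_id in scope_chain:
        try:
            state = get_state(scope_type, scope_id)
        except Exception:
            # Fail closed: a budget service error blocks the request.
            return {"allowed": False, "reason": "BUDGET_CHECK_FAILED", "warnings": warnings}
        if state is None or not state["enabled"] or not state["exceeded"]:
            continue  # no budget, paused budget, or still under the limit
        if state["action"] == "block":
            return {"allowed": False, "reason": "BUDGET_EXCEEDED", "warnings": warnings}
        warnings.append(f"{scope_type}/{scope_id} budget exceeded")
    return {"allowed": True, "reason": None, "warnings": warnings}


# Hypothetical state: the tenant is over a blocking budget, the user is not.
states = {
    ("tenant", "acme"): {"enabled": True, "exceeded": True, "action": "block"},
    ("user", "u1"): {"enabled": True, "exceeded": False, "action": "block"},
}
result = check_request([("tenant", "acme"), ("user", "u1")], lambda t, i: states.get((t, i)))
print(result["allowed"], result["reason"])
```

In this model the user's own budget is never consulted once the tenant blocks, which matches the rule that a blocking parent overrides the state of everything beneath it.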

Editing and Deleting Budgets

  • Click Edit to modify limits, alerts, action, or the enabled flag. The scope and scope ID cannot be changed after creation — delete and recreate the budget to re-scope it.
  • Click Delete to remove a budget. A confirmation dialog warns that usage tracking for that scope will stop. Deletion does not affect historical usage data.

Best Practices

  • Set tenant-level budgets as overall spending caps, then use department or user budgets for finer-grained control.
  • Use the warn action with alerts during rollout to observe spend before switching to block.
  • Add an alert at ~80% so teams can react before hitting a hard block limit.
  • Combine cost limits with token limits for defense in depth — a low-cost model can still consume excessive tokens.