Guards

Guards are protective rules that inspect requests and responses flowing through the gateway. They can detect sensitive data, block harmful content, and enforce usage limits before requests reach providers or before responses reach clients.

Guards are organized into guard configs. A guard config has a name and contains one or more guard entries, each with its own type, enforcement level, direction, and priority. The gateway evaluates a config’s guard entries in ascending priority order (lower numbers run first).
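
The ordering rule can be illustrated with a minimal sketch that models a guard config as plain Python data. The class and field names (`GuardEntry`, `GuardConfig`, `applied_to`, and so on) mirror the form fields in this guide but are assumptions for illustration, not the gateway's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GuardEntry:
    # Hypothetical field names modeled on the create form.
    type: str
    level: str
    applied_to: str
    priority: int


@dataclass
class GuardConfig:
    name: str
    guards: List[GuardEntry] = field(default_factory=list)

    def evaluation_order(self) -> List[GuardEntry]:
        # Ascending priority: lower numbers run first.
        return sorted(self.guards, key=lambda g: g.priority)


config = GuardConfig(
    name="Production Safety Guards",
    guards=[
        GuardEntry("toxicity", "block", "both", priority=20),
        GuardEntry("pii-detection", "warn", "both", priority=10),
        GuardEntry("prompt-injection", "block", "request", priority=5),
    ],
)

ordered_types = [g.type for g in config.evaluation_order()]
print(ordered_types)
```

In this sketch the prompt-injection entry (priority 5) is evaluated before the PII entry (10) and the toxicity entry (20), regardless of the order in which they were added.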

Guard List

The Guards page displays each guard config as a card showing its name, an enable/disable toggle, and a chip for every guard entry it contains (each chip shows the entry’s type and enforcement level). Use the toggle to enable or disable an entire config inline.

Guard Templates

Click Templates to apply a pre-built guard config as a starting point. Built-in templates include toxicity blocking, PII blocking, prompt-injection blocking, and a combined “production safety” config that layers all three. Applying a template populates the create form, which you can then customize before saving.

Creating a Guard Config

Click Create Guard to open the form:

Field	Description
Name	A descriptive name for the config (e.g. “Production Safety Guards”).
Guards	One or more guard entries. Use Add Guard to append entries and the remove control to delete them.

Each guard entry has:

Field	Description
Type	The detection or enforcement mechanism (see below).
Level	How the gateway responds when the guard triggers.
Applied To	Whether to inspect `request`, `response`, or `both`.
Priority	Evaluation order within the config; lower numbers run first.
Config	Type-specific options (e.g. toxicity threshold, PII patterns, keyword list).

Guard Types

Type	Value	Description
PII Detection	`pii-detection`	Scans for personally identifiable information patterns.
Toxicity	`toxicity`	Scores content for toxic/harmful language against a configurable threshold.
Prompt Injection	`prompt-injection`	Detects attempts to manipulate model behavior through crafted prompts.
Topic Filter	`topic-filter`	Blocks content matching disallowed topics.
Regex Filter	`regex-filter`	Matches content against custom regular expressions.
Keyword Filter	`keyword-filter`	Blocks content containing configured keywords/phrases.
Token Limit	`token-limit`	Caps the maximum token count per request.
Rate Limit	`rate-limit`	Enforces request rate limits in addition to the global settings.
Cost Limit	`cost-limit`	Blocks requests whose estimated cost exceeds a limit.
Model Guard	`model-guard`	Restricts which models the request may target.
Custom	`custom`	User-defined guard logic.

Enforcement Levels

Level	Value	Behavior
Off	`off`	The guard is present but not evaluated.
Monitor	`monitor`	Records detections in audit logs without affecting the request.
Warn	`warn`	Allows the request to proceed but logs a warning and may trigger alerts.
Block	`block`	Rejects the request or suppresses the response with a `422` error.
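
The following sketch shows one way to read the table above as a decision function. The `decide` function and the shape of its return value are hypothetical. Only the four level values and the `422` status come from this guide.

```python
def decide(level: str, triggered: bool) -> dict:
    """Map an enforcement level to a hypothetical outcome record."""
    if level not in {"off", "monitor", "warn", "block"}:
        raise ValueError(f"unknown level: {level}")

    outcome = {"allowed": True, "log": None, "status": None}
    if level == "off" or not triggered:
        # "off" entries are not evaluated; untriggered guards change nothing.
        return outcome

    if level == "monitor":
        outcome["log"] = "audit"      # recorded, request unaffected
    elif level == "warn":
        outcome["log"] = "warning"    # request proceeds, warning logged
    else:  # "block"
        outcome.update(allowed=False, log="block", status=422)
    return outcome


print(decide("warn", triggered=True))
print(decide("block", triggered=True))
```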

Type-Specific Configuration

Toxicity

A Toxicity Threshold slider (0 = most permissive, 1 = most strict; the UI defaults to 0.70) appears for the toxicity type. Content scoring at or above the threshold triggers the guard.
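
As a small illustration of the "at or above" rule, the comparison is inclusive: a score exactly equal to the threshold triggers the guard. The function name and signature below are assumptions. Only the 0 to 1 range and the 0.70 default come from this guide.

```python
def toxicity_triggers(score: float, threshold: float = 0.70) -> bool:
    """Return True when a toxicity score meets or exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return score >= threshold


print(toxicity_triggers(0.70))  # score equal to the default threshold
print(toxicity_triggers(0.69))  # score just below it
```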

PII Detection

The PII detector scans for the following pattern types. If you do not restrict the set, all patterns are scanned:

email — Email addresses
ssn — US Social Security numbers
credit_card — Credit card numbers (Luhn-validated)
phone — Phone numbers
ip_address — IPv4 addresses

Create multiple PII guard entries with different levels if you want to block some patterns and only warn on others. Matched values can be redacted, masked, hashed, or replaced.
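
To make the pattern types more concrete, here is a minimal sketch of pattern-based PII scanning with a Luhn check on credit card candidates. The regular expressions, the function names, and the output of each transform mode are illustrative assumptions and are not the gateway's real detectors. The `phone` pattern is omitted because phone formats vary widely.

```python
import hashlib
import re

# Illustrative patterns only (assumed, not taken from the gateway).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){12,18}\d\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_pii(text, enabled=None):
    """Scan for the enabled pattern types; all types when none are given."""
    hits = []
    for name in (enabled or PATTERNS.keys()):
        for match in PATTERNS[name].finditer(text):
            if name == "credit_card" and not luhn_valid(match.group()):
                continue  # digit run that fails the checksum
            hits.append((name, match.group()))
    return hits


def transform(value: str, mode: str) -> str:
    """Hypothetical renderings of the four handling modes."""
    if mode == "redact":
        return "[REDACTED]"
    if mode == "mask":
        return "*" * (len(value) - 4) + value[-4:]
    if mode == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    if mode == "replace":
        return "<pii>"
    raise ValueError(f"unknown mode: {mode}")


sample = (
    "Contact a@example.com, SSN 123-45-6789, card 4111 1111 1111 1111, "
    "bad card 4111 1111 1111 1112, host 10.0.0.1"
)
for kind, value in find_pii(sample):
    print(kind, transform(value, "mask"))
```

The second card number in the sample fails the Luhn check, so it is not reported. Passing `enabled=["ssn"]` restricts the scan to a single pattern type, which is the idea behind creating separate entries with different levels.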

Keyword / Regex / Topic Filters

Provide the keyword list, regular expression(s), or disallowed topics in the entry’s config. These run as substring/pattern matches against the inspected content.
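
A rough sketch of substring and pattern matching is shown below. The function names are hypothetical, and case-insensitive keyword matching is an assumption, since this guide does not state how the gateway treats case.

```python
import re


def keyword_matches(text, keywords):
    """Return the configured keywords found as substrings (case-insensitive)."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


def regex_matches(text, patterns):
    """Return the configured regular expressions that match anywhere in text."""
    return [p for p in patterns if re.search(p, text)]


sample = "Please share the internal roadmap, ticket ABC-1234."
print(keyword_matches(sample, ["Internal Roadmap", "salary"]))
print(regex_matches(sample, [r"\b[A-Z]{3}-\d{4}\b", r"\bpassword\b"]))
```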

Token / Cost / Rate / Model Guards

These enforcement guards take numeric limits (max tokens, max estimated cost, request rate) or an allowed-model set in their config. The gateway rejects requests that exceed a limit or that target a model outside the allowed set.
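
The sketch below shows how such checks might be combined for a single request. The config keys (`max_tokens`, `max_cost`, `allowed_models`) and the request fields are assumed names for illustration. Rate limiting is left out because it depends on state across requests.

```python
def check_request(request: dict, limits: dict) -> list:
    """Return the guard types a request would violate (hypothetical keys)."""
    violations = []
    if "max_tokens" in limits and request["tokens"] > limits["max_tokens"]:
        violations.append("token-limit")
    if "max_cost" in limits and request["estimated_cost"] > limits["max_cost"]:
        violations.append("cost-limit")
    if "allowed_models" in limits and request["model"] not in limits["allowed_models"]:
        violations.append("model-guard")
    return violations


limits = {"max_tokens": 4000, "max_cost": 0.50, "allowed_models": {"model-a", "model-b"}}
request = {"tokens": 5200, "estimated_cost": 0.12, "model": "model-c"}
print(check_request(request, limits))
```

A request exactly at a limit passes in this sketch, since only values that exceed the limit are rejected.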

Applied To: Request vs Response

Target	What is inspected
request	The user’s prompt and any attached content before it reaches the provider.
response	The provider’s response before it is returned to the client.
both	Both directions are inspected.

Applying guards to the response is useful for catching provider output that contains sensitive data or inappropriate content even when the input was clean.
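
The direction setting can be read as a simple predicate, sketched here with a hypothetical `should_inspect` helper. Only the three values `request`, `response`, and `both` come from this guide.

```python
def should_inspect(applied_to: str, direction: str) -> bool:
    """Decide whether an entry inspects traffic flowing in `direction`."""
    if applied_to not in {"request", "response", "both"}:
        raise ValueError(f"unknown target: {applied_to}")
    if direction not in {"request", "response"}:
        raise ValueError(f"unknown direction: {direction}")
    return applied_to == "both" or applied_to == direction


print(should_inspect("both", "response"))
print(should_inspect("request", "response"))
```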

Editing and Deleting Guard Configs

Click Edit to modify a config and its guard entries.
Use the inline toggle switch to enable or disable a config without opening the editor.
Click Delete to permanently remove a config. A confirmation dialog warns that the protection it provides will be removed.

Best Practices

Start guard entries in monitor mode to understand detection patterns before switching to block.
Apply PII detection to both request and response to prevent data leakage in either direction.
Use toxicity and keyword filters on the response to catch unexpected provider output.
Layer multiple entries in one config — for example, a warn-level PII entry and a block-level prompt-injection entry, ordered by priority.