Guards

Guards are protective rules that inspect requests and responses flowing through the gateway. They can detect sensitive data, block harmful content, and enforce usage limits before requests reach providers or before responses reach clients.

Guards are organized into guard configs. A guard config has a name and contains one or more guard entries, each with its own type, enforcement level, direction, and priority. The gateway evaluates a config’s guard entries in ascending priority order (lower numbers run first).
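
As a rough illustration of this structure, the sketch below models a guard config and its entries as Python dataclasses and sorts the entries into evaluation order. The class names, field names, and the `threshold` config key are assumptions made for the example and are not the gateway's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class GuardEntry:
    type: str        # e.g. "pii-detection", "toxicity"
    level: str       # "off", "monitor", "warn", or "block"
    applied_to: str  # "request", "response", or "both"
    priority: int    # lower numbers run first
    config: dict = field(default_factory=dict)  # type-specific options (hypothetical keys)


@dataclass
class GuardConfig:
    name: str
    enabled: bool = True
    guards: list = field(default_factory=list)

    def evaluation_order(self):
        # Ascending priority. sorted() is stable, so entries with equal
        # priority keep their insertion order (an assumption of this sketch).
        return sorted(self.guards, key=lambda g: g.priority)


cfg = GuardConfig(
    name="Production Safety Guards",
    guards=[
        GuardEntry("toxicity", "warn", "response", priority=20, config={"threshold": 0.7}),
        GuardEntry("prompt-injection", "block", "request", priority=10),
        GuardEntry("pii-detection", "block", "both", priority=30),
    ],
)
order = [g.type for g in cfg.evaluation_order()]
print(order)
```

Here the prompt-injection entry runs first because it has the lowest priority number, regardless of where it appears in the list.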

Guard List

The Guards page displays each guard config as a card showing its name, an enable/disable toggle, and a chip for every guard entry it contains (each chip shows the entry’s type and enforcement level). Use the toggle to enable or disable an entire config inline.

Guard Templates

Click Templates to apply a pre-built guard config as a starting point. Built-in templates include toxicity blocking, PII blocking, prompt-injection blocking, and a combined “production safety” config that layers all three. Applying a template populates the create form, which you can then customize before saving.

Creating a Guard Config

Click Create Guard to open the form:

| Field | Description |
| --- | --- |
| Name | A descriptive name for the config (e.g. “Production Safety Guards”). |
| Guards | One or more guard entries. Use Add Guard to append entries and the remove control to delete them. |

Each guard entry has:

| Field | Description |
| --- | --- |
| Type | The detection or enforcement mechanism (see below). |
| Level | How the gateway responds when the guard triggers. |
| Applied To | Whether to inspect request, response, or both. |
| Priority | Evaluation order within the config; lower numbers run first. |
| Config | Type-specific options (e.g. toxicity threshold, PII patterns, keyword list). |

Guard Types

| Type | Value | Description |
| --- | --- | --- |
| PII Detection | pii-detection | Scans for personally identifiable information patterns. |
| Toxicity | toxicity | Scores content for toxic/harmful language against a configurable threshold. |
| Prompt Injection | prompt-injection | Detects attempts to manipulate model behavior through crafted prompts. |
| Topic Filter | topic-filter | Blocks content matching disallowed topics. |
| Regex Filter | regex-filter | Matches content against custom regular expressions. |
| Keyword Filter | keyword-filter | Blocks content containing configured keywords/phrases. |
| Token Limit | token-limit | Caps the maximum token count per request. |
| Rate Limit | rate-limit | Enforces request rate limits beyond the global settings. |
| Cost Limit | cost-limit | Blocks requests whose estimated cost exceeds a limit. |
| Model Guard | model-guard | Restricts which models the request may target. |
| Custom | custom | User-defined guard logic. |

Enforcement Levels

| Level | Value | Behavior |
| --- | --- | --- |
| Off | off | The guard is present but not evaluated. |
| Monitor | monitor | Records detections in audit logs without affecting the request. |
| Warn | warn | Allows the request to proceed but logs a warning and may trigger alerts. |
| Block | block | Rejects the request or suppresses the response with a 422 error. |
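
The four levels can be read as a small decision function. The following is a minimal sketch of that reading, not the gateway's implementation; the function name `decide` and the shape of the returned dictionary are hypothetical.

```python
def decide(level: str, triggered: bool) -> dict:
    """Map an enforcement level and a detection result to an outcome (illustrative only)."""
    if level == "off":
        # The entry is present but never evaluated, so `triggered` is ignored.
        return {"evaluated": False, "allowed": True, "status": None, "log": None}
    if level not in ("monitor", "warn", "block"):
        raise ValueError(f"unknown level: {level}")
    if not triggered:
        return {"evaluated": True, "allowed": True, "status": None, "log": None}
    if level == "monitor":
        # Detection is recorded in the audit log; traffic is unaffected.
        return {"evaluated": True, "allowed": True, "status": None, "log": "audit"}
    if level == "warn":
        # Traffic proceeds, but a warning is logged.
        return {"evaluated": True, "allowed": True, "status": None, "log": "warning"}
    # level == "block": reject the request or suppress the response.
    return {"evaluated": True, "allowed": False, "status": 422, "log": "block"}


print(decide("block", True))
print(decide("monitor", True))
```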

Type-Specific Configuration

Toxicity

A Toxicity Threshold slider (0 = most permissive, 1 = most strict; the UI defaults to 0.70) appears for the toxicity type. Content scoring at or above the threshold triggers the guard.

PII Detection

The PII detector scans for the following pattern types. If you do not restrict the set, all patterns are scanned:

  • email — Email addresses
  • ssn — US Social Security numbers
  • credit_card — Credit card numbers (Luhn-validated)
  • phone — Phone numbers
  • ip_address — IPv4 addresses

Create multiple PII guard entries with different levels if you want to block some patterns and only warn on others. Matched values can be redacted, masked, hashed, or replaced.
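
To make the pattern types more concrete, here is a simplified sketch of pattern-based PII scanning with Luhn validation for card numbers and bracketed redaction. The regular expressions are deliberately loose illustrations and are not the gateway's own patterns; `phone` is omitted because phone formats vary widely. The names `scan`, `redact`, and `luhn_valid` are hypothetical.

```python
import re

# Illustrative patterns only; real detectors are usually stricter.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
    "ip_address": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
}


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text: str, enabled=None) -> list:
    """Return (pattern_name, matched_value) pairs; scan all patterns if none are selected."""
    names = enabled if enabled is not None else list(PATTERNS)
    findings = []
    for name in names:
        for match in PATTERNS[name].finditer(text):
            if name == "credit_card" and not luhn_valid(match.group()):
                continue  # digit run that fails the checksum is not reported
            findings.append((name, match.group()))
    return findings


def redact(text: str, enabled=None) -> str:
    """Replace each finding with a bracketed label such as [EMAIL]."""
    for name, value in scan(text, enabled):
        text = text.replace(value, f"[{name.upper()}]")
    return text


sample = "Contact jane@example.com, SSN 123-45-6789, card 4111 1111 1111 1111, from 10.0.0.1"
print(redact(sample))
```

Passing a subset such as `enabled=["email"]` mirrors restricting the pattern set on a guard entry; masking, hashing, or other replacement strategies would swap out the body of `redact`.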

Keyword / Regex / Topic Filters

Provide the keyword list, regular expression(s), or disallowed topics in the entry’s config. These run as substring/pattern matches against the inspected content.
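
A minimal sketch of what substring and pattern matching can look like is shown below. Case-insensitive keyword matching is an assumption of this example, and the function names are hypothetical; topic filtering is not shown because the source does not describe how topics are matched.

```python
import re


def keyword_hits(text: str, keywords: list) -> list:
    """Return the configured keywords that occur as substrings (case-insensitive here)."""
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]


def regex_hits(text: str, patterns: list) -> list:
    """Return the configured regular expressions that match anywhere in the text."""
    return [p for p in patterns if re.search(p, text)]


kw = keyword_hits("Please share the Internal Roadmap", ["internal roadmap", "salary"])
rx = regex_hits("ticket ABC-1234 leaked", [r"\b[A-Z]{3}-\d{4}\b", r"^secret"])
print(kw, rx)
```

An entry would trigger when either list is non-empty; what happens next depends on the entry's enforcement level.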

Token / Cost / Rate / Model Guards

These enforcement guards take numeric limits (max tokens, max estimated cost, request rate) or an allowed-model set in their config. They reject requests that exceed a limit or that target a model outside the allowed set.
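
The checks described here amount to simple comparisons. The sketch below assumes hypothetical config keys (`max_tokens`, `max_cost`, `allowed_models`) and a hypothetical request shape; rate limiting is left out because it needs per-client state over time.

```python
def check_limits(request: dict, config: dict) -> list:
    """Return the guard types a request would violate (illustrative key names)."""
    violations = []

    max_tokens = config.get("max_tokens")
    if max_tokens is not None and request["tokens"] > max_tokens:
        violations.append("token-limit")

    max_cost = config.get("max_cost")
    if max_cost is not None and request["estimated_cost"] > max_cost:
        violations.append("cost-limit")

    allowed = config.get("allowed_models")
    if allowed is not None and request["model"] not in allowed:
        violations.append("model-guard")

    return violations


request = {"model": "model-a", "tokens": 5000, "estimated_cost": 0.02}
config = {"max_tokens": 4096, "max_cost": 0.05, "allowed_models": {"model-a", "model-b"}}
result = check_limits(request, config)
print(result)
```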

Applied To: Request vs Response

| Target | What is inspected |
| --- | --- |
| request | The user’s prompt and any attached content before it reaches the provider. |
| response | The provider’s response before it is returned to the client. |
| both | Both directions are inspected. |

Applying guards to the response is useful for catching provider output that contains sensitive data or inappropriate content even when the input was clean.
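
One way to picture the Applied To setting is as a filter that selects which entries run for each direction of traffic. This sketch uses plain dictionaries with hypothetical keys and is not the gateway's internal representation.

```python
def entries_for(entries: list, direction: str) -> list:
    """Select entries that inspect the given direction, in ascending priority order."""
    active = [e for e in entries if e["applied_to"] in (direction, "both")]
    return sorted(active, key=lambda e: e["priority"])


entries = [
    {"type": "toxicity", "applied_to": "response", "priority": 3},
    {"type": "pii-detection", "applied_to": "both", "priority": 1},
    {"type": "prompt-injection", "applied_to": "request", "priority": 2},
]
on_request = [e["type"] for e in entries_for(entries, "request")]
on_response = [e["type"] for e in entries_for(entries, "response")]
print(on_request, on_response)
```

The PII entry appears in both lists because it is applied to both directions, which is how a single entry can cover leakage in either direction.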

Editing and Deleting Guard Configs

  • Click Edit to modify a config and its guard entries.
  • Use the inline toggle switch to enable or disable a config without opening the editor.
  • Click Delete to permanently remove a config. A confirmation dialog warns that the protection it provides will be removed.

Best Practices

  • Start guard entries in monitor mode to understand detection patterns before switching to block.
  • Apply PII detection to both request and response to prevent data leakage in either direction.
  • Use toxicity and keyword filters on the response to catch unexpected provider output.
  • Layer multiple entries in one config — for example, a warn-level PII entry and a block-level prompt-injection entry, ordered by priority.