Configure Guards

Guards are a pipeline of checks that run on every request before it reaches a provider and, optionally, on every provider response. This tutorial walks you through configuring a PII detection guard and a prompt injection guard.

Prerequisites

A running Gatewyse instance with at least one provider configured
Admin dashboard access

Guard Types

The gateway supports eleven guard types (use the value, not the label, when scripting):

Type (value)	Purpose
`pii-detection`	Detects emails, phone numbers, SSNs, credit cards, IP addresses
`prompt-injection`	Scans for prompt injection patterns (e.g., “ignore previous instructions”)
`toxicity`	Heuristic toxicity detection scored against a threshold
`topic-filter`	Blocks content matching disallowed topics
`regex-filter`	Matches content against custom regular expressions
`keyword-filter`	Blocks requests containing configurable keywords
`token-limit`	Blocks requests exceeding an estimated token count
`rate-limit`	Enforces a request rate beyond global limits
`cost-limit`	Blocks requests exceeding an estimated cost threshold
`model-guard`	Restricts which models a request may target
`custom`	User-defined guard logic

Step 1 — Navigate to the Guards Page

In the sidebar, click Guards.
The guards page shows the current guard configuration for your tenant.
Click Create Guard Config if no configuration exists, or Edit to modify an existing one.

Step 2 — Add a PII Detection Guard

Click Add Guard within the configuration.
Set Type to pii-detection.
Set Level to one of:
- block — Reject the request entirely and return an error.
- warn — Allow the request but log a warning. The warning appears in audit logs.
- monitor — Log silently for analytics. No user-visible effect.
- off — Disable this guard.
Set Apply To:
- request — Only scan incoming prompts.
- response — Only scan provider responses.
- both — Scan in both directions.
Set Priority — Lower numbers run first. If you have multiple guards, set PII detection to 10 so it runs early.
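
If you script guard configuration rather than using the dashboard, the settings above might be represented as shown here. This is a minimal sketch: the field names (type, level, apply_to, priority) and the validate_guard helper are assumptions made for illustration, not the gateway's confirmed schema. Only the allowed values are taken from this tutorial.

```python
# Hypothetical representation of one guard entry.
# Field names are assumptions; the allowed values come from the tutorial text.
GUARD_LEVELS = {"block", "warn", "monitor", "off"}
APPLY_TO = {"request", "response", "both"}


def validate_guard(guard: dict) -> list[str]:
    """Return a list of problems found in a guard entry (empty if none)."""
    problems = []
    if guard.get("level") not in GUARD_LEVELS:
        problems.append(f"unknown level: {guard.get('level')!r}")
    if guard.get("apply_to") not in APPLY_TO:
        problems.append(f"unknown apply_to: {guard.get('apply_to')!r}")
    if not isinstance(guard.get("priority"), int):
        problems.append("priority must be an integer")
    return problems


# The PII guard from this step: block on detection, scan both directions, run early.
pii_guard = {
    "type": "pii-detection",
    "level": "block",
    "apply_to": "both",
    "priority": 10,
}
```

Validating entries locally before submitting them catches typos in level or direction values early.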

Step 3 — Configure PII Sensitivity

The PII detector scans for five pattern types by default:

Email addresses — Standard email format
Phone numbers — US phone formats with optional country code
Social Security Numbers — XXX-XX-XXXX pattern
Credit card numbers — 16-digit patterns with optional separators
IP addresses — IPv4 dotted-decimal format
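
To make the five pattern types concrete, the sketch below shows one way such a scan could be written with regular expressions. The expressions and the detect_pii function are illustrative assumptions; the gateway's actual patterns are not shown in this tutorial and may differ.

```python
import re

# Illustrative patterns only (assumptions), one per pattern type listed above.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # US phone number with an optional +1 / 1 country code.
    "phone": re.compile(
        r"(?<!\d)(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}(?!\d)"
    ),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 16 digits in groups of four with optional space or hyphen separators.
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    # Dotted-decimal IPv4; octet ranges are not validated here.
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def detect_pii(text: str) -> list[tuple[str, str]]:
    """Return (pattern_type, matched_text) pairs for every match in the text."""
    detections = []
    for pii_type, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            detections.append((pii_type, match.group()))
    return detections
```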

Confidence is calculated as min(detections.length * 0.2 + 0.3, 1.0), so one match scores 0.5 and four or more matches score 1.0. A single PII match (confidence ≥ 0.5) is therefore enough to trigger the configured action. Credit-card matches are additionally Luhn-validated to reduce false positives. Matched values can be redacted, masked, hashed, or replaced.
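
The confidence formula and the Luhn check can be sketched as follows. The formula is the one stated above; the function names are hypothetical, and luhn_valid is a standard implementation of the Luhn checksum rather than the gateway's own code.

```python
def pii_confidence(detection_count: int) -> float:
    """Confidence as stated in the tutorial: min(count * 0.2 + 0.3, 1.0)."""
    return min(detection_count * 0.2 + 0.3, 1.0)


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum; non-digit characters (separators) are ignored."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A candidate such as 4111 1111 1111 1111 (a widely used test number) passes the checksum, while most random 16-digit strings do not, which is why the check reduces false positives.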

Step 4 — Add an Injection Detection Guard

Click Add Guard again.
Set Type to prompt-injection.
Set Level to block.
Set Apply To to request.
Set Priority to 5 (runs before PII detection).

The injection detector checks for patterns such as:

“ignore all previous instructions”
“you are now a…”
“new system instructions:”
“override system/safety”
Chat-template control tokens (INST, im_start)
DAN mode / developer mode references

Each matched pattern contributes a weight; the confidence score is the sum of matched-pattern weights, capped at 1.0. The default block threshold is 0.7.
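
The weighted scoring described above can be sketched as follows. The patterns and their weights are hypothetical values chosen for illustration; only the capped sum and the 0.7 default threshold come from this tutorial. Treating a score equal to the threshold as a block is also an assumption.

```python
import re

# Hypothetical patterns and weights; the gateway's real values are not documented here.
INJECTION_PATTERNS = [
    (re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.I), 0.8),
    (re.compile(r"you\s+are\s+now\s+an?\b", re.I), 0.4),
    (re.compile(r"new\s+system\s+instructions\s*:", re.I), 0.6),
    (re.compile(r"\b(?:DAN|developer)\s+mode\b", re.I), 0.5),
]

BLOCK_THRESHOLD = 0.7  # default threshold stated in the tutorial


def injection_confidence(text: str) -> float:
    """Sum the weights of all matched patterns, capped at 1.0."""
    score = sum(weight for pattern, weight in INJECTION_PATTERNS if pattern.search(text))
    return min(score, 1.0)


def should_block(text: str, threshold: float = BLOCK_THRESHOLD) -> bool:
    """Assumes a score at or above the threshold results in a block."""
    return injection_confidence(text) >= threshold
```

With these example weights, a single weak signal such as "you are now a..." stays below the threshold, while a strong signal or a combination of weaker ones crosses it.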

Step 5 — Save and Test

Click Save to persist the guard configuration.
Test with a request containing PII:

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "My SSN is 123-45-6789"}]
  }'

With the PII guard set to block, the request is rejected with HTTP 422 and the canonical error envelope:

{
  "error": {
    "code": "GUARD_PII_DETECTED",
    "message": "Detected 1 PII item(s): ssn",
    "details": { "count": 1 }
  }
}

Guard types map to the following error codes: GUARD_PII_DETECTED, GUARD_INJECTION_DETECTED, GUARD_TOXICITY_DETECTED, GUARD_CONTENT_FILTERED (shared by the keyword, topic, regex, and model guards), GUARD_TOKEN_LIMIT, GUARD_COST_LIMIT, and GUARD_CUSTOM_RULE. With the guard set to warn, the request proceeds and a warning is recorded in the audit logs.
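
A client can tell guard rejections apart from other failures by inspecting the error envelope. The sketch below parses a response body and returns the error object when its code is one of the guard codes listed above. The parse_guard_error helper is hypothetical, and the sketch assumes every guard rejection uses the same envelope shape as the PII example.

```python
import json
from typing import Optional

# Guard error codes listed in this tutorial.
GUARD_ERROR_CODES = {
    "GUARD_PII_DETECTED",
    "GUARD_INJECTION_DETECTED",
    "GUARD_TOXICITY_DETECTED",
    "GUARD_CONTENT_FILTERED",
    "GUARD_TOKEN_LIMIT",
    "GUARD_COST_LIMIT",
    "GUARD_CUSTOM_RULE",
}


def parse_guard_error(body: str) -> Optional[dict]:
    """Return the error object if the body is a guard rejection, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict) and error.get("code") in GUARD_ERROR_CODES:
        return error
    return None
```

Branching on the code field rather than the message text keeps client logic stable if the wording of messages changes.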

How the Guard Pipeline Works

Load config — The PromptGuardService loads the tenant’s guard configuration from Redis cache (10-minute TTL) or falls back to MongoDB.
Filter — Guards are filtered by direction (request or response) and removed if level is off.
Sort — Guards execute in priority order (ascending).
Execute — Each guard runs against the request content and produces a GuardResult with an action (pass, warn, monitor, or block) and confidence score.
Short-circuit — If any guard returns block, the pipeline stops immediately and returns the blocked response.
Content modification — Some guards can modify content (e.g., redacting PII). Modified content flows to subsequent guards.
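
The filter, sort, execute, short-circuit, and content-modification steps can be sketched as a small loop. This is an illustrative model, not the PromptGuardService implementation: the Guard and GuardResult classes, their field names, and run_pipeline are assumptions, and config loading from Redis or MongoDB is omitted. The sketch also assumes a guard set to both runs in either direction.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GuardResult:
    action: str                      # "pass", "warn", "monitor", or "block"
    confidence: float
    modified_content: Optional[str] = None  # set when the guard rewrites content


@dataclass
class Guard:
    name: str
    level: str                       # "block", "warn", "monitor", or "off"
    apply_to: str                    # "request", "response", or "both"
    priority: int
    check: Callable[[str], GuardResult]


def run_pipeline(guards: list[Guard], content: str, direction: str):
    """Return (final_content, results, blocked) for one request or response."""
    # Filter: drop disabled guards and guards that do not cover this direction.
    active = [
        g for g in guards
        if g.level != "off" and g.apply_to in (direction, "both")
    ]
    results = []
    # Sort: ascending priority, so lower numbers run first.
    for guard in sorted(active, key=lambda g: g.priority):
        result = guard.check(content)
        results.append((guard.name, result))
        # Short-circuit: stop at the first block.
        if result.action == "block":
            return content, results, True
        # Content modification: later guards see the rewritten content.
        if result.modified_content is not None:
            content = result.modified_content
    return content, results, False
```

Because modified content flows forward, guard order matters: a redacting guard that runs early can prevent a later guard from ever seeing the original value.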

Next Steps

Add a keyword-filter guard with custom blocked keywords for your organization’s compliance requirements
Configure a regex-filter or custom guard with patterns specific to your data governance policies
Review guard activity in the Audit Logs section of the dashboard