Kubernetes Deployment
Gatewyse can be deployed on Kubernetes for production workloads. The repository ships reference manifests in the k8s/ directory for every workload — copy them into your cluster manifest repo, replace the placeholder secrets, and apply.
Reference Manifests
The repo includes the following manifests under k8s/:
| File | Purpose |
|---|---|
| namespace.yaml | Creates the ai-gateway namespace |
| secrets.yaml | Secret template with placeholder JWT / encryption / DB credentials |
| mongo-statefulset.yaml | MongoDB replica-set StatefulSet (in-cluster MongoDB) |
| redis-deployment.yaml | Redis Deployment with persistent volume |
| server-deployment.yaml | Gateway API Deployment + Service |
| worker-deployment.yaml | BullMQ background worker Deployment |
| admin-deployment.yaml | Admin dashboard Deployment |
| docs-deployment.yaml | Astro Starlight docs site Deployment |
| website-deployment.yaml | Marketing website Deployment |
| network-policies.yaml | NetworkPolicies restricting pod-to-pod traffic |
| hpa.yaml | HorizontalPodAutoscaler for the server |
| ingress.yaml | Ingress routing for all public endpoints |
There is no ConfigMap manifest — non-sensitive config is wired inline in each Deployment’s env: block, and the rest is held in aigw-secrets so the Secret stays the single source of truth.
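As a rough illustration of that split, a container spec might combine inline env: entries with an envFrom reference to aigw-secrets. The variable names NODE_ENV and PORT and their values are assumptions for illustration, not taken from the shipped manifests; check server-deployment.yaml for the actual keys.

```yaml
# Fragment of a Deployment pod template (spec.template.spec), not a complete manifest.
containers:
  - name: server
    image: ai-gateway/server:latest
    env:
      # Non-sensitive settings declared inline (names and values are hypothetical).
      - name: NODE_ENV
        value: "production"
      - name: PORT
        value: "3000"
    envFrom:
      # Sensitive values are loaded from the Secret.
      - secretRef:
          name: aigw-secrets
```

If the same key appears in both env and envFrom, Kubernetes uses the env value, so an inline entry can shadow a Secret key by accident.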
Architecture
A typical Kubernetes deployment includes:
| Resource | Purpose |
|---|---|
| Deployment (server) | Gateway API pods |
| Deployment (worker) | BullMQ background worker pods |
| Deployment (admin) | Admin dashboard pods |
| Deployment (docs / website) | Static-site frontends |
| StatefulSet (mongodb) | MongoDB replica-set instances |
| Deployment (redis) | Redis cache + queue backing store |
| Service | Internal and external networking |
| Secret | JWT secrets, encryption keys, license JWT, database credentials |
| NetworkPolicy | Restricts pod-to-pod traffic to the minimum needed |
| HPA | Horizontal Pod Autoscaler for the server |
| Ingress | External traffic routing |
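To make the NetworkPolicy row concrete, here is a minimal sketch of a policy that lets only the server and worker pods reach Redis. The app: aigw-redis label, the policy name, and port 6379 (the Redis default) are assumptions; the app: aigw-server and app: aigw-worker labels match the Deployments on this page. The shipped network-policies.yaml may be structured differently.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-redis-from-gateway   # hypothetical name
  namespace: ai-gateway
spec:
  podSelector:
    matchLabels:
      app: aigw-redis              # assumed label on the Redis pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: aigw-server
        - podSelector:
            matchLabels:
              app: aigw-worker
      ports:
        - protocol: TCP
          port: 6379               # Redis default port (assumption)
```

NetworkPolicies are only enforced when the cluster's network plugin supports them; on a plugin without support, the object is accepted but has no effect.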
Secret
Store sensitive values in a Kubernetes Secret. LICENSE_TOKEN and LICENSE_PUBLIC_KEYS are required in production — without them the server logs a fatal license error and exits with process.exit(1) at boot:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aigw-secrets
  namespace: ai-gateway
type: Opaque
stringData:
  JWT_SECRET: "<random-64-char-string>"
  JWT_REFRESH_SECRET: "<random-64-char-string>"
  ENCRYPTION_KEY: "<random-64-hex-chars>"
  REDIS_PASSWORD: "<redis-password>"
  SUPER_ADMIN_PASSWORD: "<complex-password>"
  MONGODB_URI: "mongodb://aigw-mongo-0.aigw-mongo:27017/ai-gateway?replicaSet=rs0"
  # License (EE): required in production. PEM-encoded public keys are
  # concatenated with `;;` (newlines inside env values are unreliable across
  # shells / Docker).
  LICENSE_TOKEN: "<ed25519-signed-jwt-from-platform>"
  LICENSE_PUBLIC_KEYS: "<pem-key-1>;;<pem-key-2>"
```
Server Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-server
  template:
    metadata:
      labels:
        app: aigw-server
    spec:
      containers:
        - name: server
          image: ai-gateway/server:latest
          ports:
            - containerPort: 3000
          envFrom:
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 1Gi
          livenessProbe:
            # /health is the liveness probe: it returns 200 as long as the
            # Express event loop is responsive. It does NOT verify Mongo/Redis.
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            # /ready is the readiness probe: it returns 503 when Mongo or
            # Redis are unreachable, so the pod is removed from the Service
            # endpoint set until infra recovers.
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```
Worker Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-worker
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-worker
  template:
    metadata:
      labels:
        app: aigw-worker
    spec:
      containers:
        - name: worker
          image: ai-gateway/worker:latest
          envFrom:
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
```
Service and Ingress
```yaml
apiVersion: v1
kind: Service
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  selector:
    app: aigw-server
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
```
Horizontal Pod Autoscaler
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Resource Recommendations
| Component | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Server | 500m | 2 | 512Mi | 1Gi |
| Worker | 250m | 1 | 256Mi | 512Mi |
| Admin | 100m | 500m | 128Mi | 256Mi |
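The server and worker rows correspond to the resources blocks in their Deployments. For the admin dashboard, the table row would translate into a block along these lines; the container name admin is an assumption, so compare against admin-deployment.yaml before relying on it.

```yaml
# Fragment of the admin container spec, not a complete manifest.
containers:
  - name: admin                    # hypothetical container name
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
```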
Infrastructure Dependencies
MongoDB and Redis can run in-cluster using the reference manifests, or the gateway can be pointed at managed services via MONGODB_URI / REDIS_HOST.
In-Cluster (Reference Manifests)
The repo ships:
- k8s/mongo-statefulset.yaml: single-replica MongoDB StatefulSet with a persistent volume claim and replica-set initialization. Suitable for small/medium deployments. For larger workloads, scale up the StatefulSet or swap in the MongoDB Community Operator.
- k8s/redis-deployment.yaml: Redis Deployment configured with --maxmemory-policy noeviction (required by BullMQ; allkeys-lru causes silent job loss) and AOF persistence.
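The two Redis settings named above could be passed to the container roughly as follows. This is a sketch, not a copy of redis-deployment.yaml: the container name and the redis:7 image tag are assumptions, while --maxmemory-policy and --appendonly are standard redis-server options.

```yaml
# Fragment of a Redis container spec, not a complete manifest.
containers:
  - name: redis                    # hypothetical container name
    image: redis:7                 # assumed image tag
    command: ["redis-server"]
    args:
      - "--maxmemory-policy"
      - "noeviction"               # never evict keys, as BullMQ requires
      - "--appendonly"
      - "yes"                      # enable AOF persistence
```

AOF only survives a pod restart if the Redis data directory is backed by the persistent volume.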
Managed Services
Override MONGODB_URI / REDIS_HOST in aigw-secrets to point at:
- MongoDB: Atlas, DocumentDB, or another managed replica-set service.
- Redis: ElastiCache, Memorystore, or another managed Redis (verify the eviction policy is noeviction).
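For instance, switching to managed services might come down to changing these entries in the Secret. Every value is a placeholder; the mongodb+srv:// form is the connection-string style Atlas issues, and whether your Redis provider needs additional settings (such as TLS) is outside what this page covers.

```yaml
# Only the keys that change are shown; keep the remaining aigw-secrets entries as they are.
stringData:
  MONGODB_URI: "mongodb+srv://<user>:<password>@<cluster-host>/ai-gateway"
  REDIS_HOST: "<managed-redis-endpoint>"
  REDIS_PASSWORD: "<managed-redis-auth-token>"
```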
When using managed services, you can omit mongo-statefulset.yaml and redis-deployment.yaml from the apply set.
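One way to express a reduced apply set is a Kustomize file that lists only the manifests you still need. This is a sketch that assumes you copied k8s/ into your manifest repo unchanged and apply it with kubectl apply -k; the repository itself may not ship a kustomization.yaml.

```yaml
# kustomization.yaml (hypothetical file placed next to the copied manifests)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - secrets.yaml
  # mongo-statefulset.yaml and redis-deployment.yaml are left out on purpose.
  - server-deployment.yaml
  - worker-deployment.yaml
  - admin-deployment.yaml
  - docs-deployment.yaml
  - website-deployment.yaml
  - network-policies.yaml
  - hpa.yaml
  - ingress.yaml
```

If network-policies.yaml restricts egress, the server and worker pods will also need to be allowed to reach the managed endpoints.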