Kubernetes Deployment

Gatewyse can be deployed on Kubernetes for production workloads. The repository ships reference manifests in the k8s/ directory for every workload — copy them into your cluster manifest repo, replace the placeholder secrets, and apply.

Reference Manifests

The repo includes the following manifests under k8s/:

| File | Purpose |
| --- | --- |
| namespace.yaml | Creates the ai-gateway namespace |
| secrets.yaml | Secret template with placeholder JWT / encryption / DB credentials |
| mongo-statefulset.yaml | MongoDB replica-set StatefulSet (in-cluster MongoDB) |
| redis-deployment.yaml | Redis Deployment with persistent volume |
| server-deployment.yaml | Gateway API Deployment + Service |
| worker-deployment.yaml | BullMQ background worker Deployment |
| admin-deployment.yaml | Admin dashboard Deployment |
| docs-deployment.yaml | Astro Starlight docs site Deployment |
| website-deployment.yaml | Marketing website Deployment |
| network-policies.yaml | NetworkPolicies restricting pod-to-pod traffic |
| hpa.yaml | HorizontalPodAutoscaler for the server |
| ingress.yaml | Ingress routing for all public endpoints |

There is no ConfigMap manifest. Non-sensitive config is set inline in each Deployment's env: block, and sensitive values are held in aigw-secrets, so the Secret stays the single source of truth for credentials.
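
To illustrate the split between inline config and the Secret, here is a minimal sketch of a container spec that combines the two. The variable names NODE_ENV and PORT are assumptions chosen for illustration; the reference manifests may use different names.

```yaml
# Fragment of spec.template.spec in a Deployment (illustrative only).
containers:
  - name: server
    image: ai-gateway/server:latest
    env:
      # Non-sensitive settings are declared inline.
      - name: NODE_ENV          # hypothetical variable name
        value: "production"
      - name: PORT              # hypothetical variable name
        value: "3000"
    envFrom:
      # Every key in the Secret becomes an environment variable.
      - secretRef:
          name: aigw-secrets
```

If a name appears in both places, Kubernetes gives the explicit env: entry precedence over the value loaded through envFrom.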

Architecture

A typical Kubernetes deployment includes:

| Resource | Purpose |
| --- | --- |
| Deployment (server) | Gateway API pods |
| Deployment (worker) | BullMQ background worker pods |
| Deployment (admin) | Admin dashboard pods |
| Deployment (docs / website) | Static-site frontends |
| StatefulSet (mongodb) | MongoDB replica-set instances |
| Deployment (redis) | Redis cache + queue backing store |
| Service | Internal and external networking |
| Secret | JWT secrets, encryption keys, license JWT, database credentials |
| NetworkPolicy | Restricts pod-to-pod traffic to the minimum needed |
| HPA | Horizontal Pod Autoscaler for the server |
| Ingress | External traffic routing |
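
As a rough idea of what "minimum needed" pod-to-pod traffic can look like, the sketch below admits only the server and worker pods to Redis. It is not a copy of network-policies.yaml: the policy name and the app: aigw-redis label are assumptions, while the aigw-server and aigw-worker labels match the Deployments on this page.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: aigw-redis-ingress        # hypothetical name
  namespace: ai-gateway
spec:
  podSelector:
    matchLabels:
      app: aigw-redis             # assumed label on the Redis pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only the gateway server and the BullMQ worker may connect.
        - podSelector:
            matchLabels:
              app: aigw-server
        - podSelector:
            matchLabels:
              app: aigw-worker
      ports:
        - protocol: TCP
          port: 6379
```

NetworkPolicies only take effect when the cluster's network plugin enforces them.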

Secret

Store sensitive values in a Kubernetes Secret. LICENSE_TOKEN and LICENSE_PUBLIC_KEYS are required in production — without them the server logs a fatal license error and exits with process.exit(1) at boot:

apiVersion: v1
kind: Secret
metadata:
  name: aigw-secrets
  namespace: ai-gateway
type: Opaque
stringData:
  JWT_SECRET: "<random-64-char-string>"
  JWT_REFRESH_SECRET: "<random-64-char-string>"
  ENCRYPTION_KEY: "<random-64-hex-chars>"
  REDIS_PASSWORD: "<redis-password>"
  SUPER_ADMIN_PASSWORD: "<complex-password>"
  MONGODB_URI: "mongodb://aigw-mongo-0.aigw-mongo:27017/ai-gateway?replicaSet=rs0"
  # License (EE): required in production. PEM-encoded public keys are
  # concatenated with `;;` (newlines inside env values are unreliable across
  # shells / Docker).
  LICENSE_TOKEN: "<ed25519-signed-jwt-from-platform>"
  LICENSE_PUBLIC_KEYS: "<pem-key-1>;;<pem-key-2>"

Server Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-server
  template:
    metadata:
      labels:
        app: aigw-server
    spec:
      containers:
        - name: server
          image: ai-gateway/server:latest
          ports:
            - containerPort: 3000
          envFrom:
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 1Gi
          livenessProbe:
            # /health is the liveness probe: it returns 200 as long as the
            # Express event loop is responsive. It does NOT verify Mongo/Redis.
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            # /ready is the readiness probe: it returns 503 when Mongo or
            # Redis is unreachable, so the pod is removed from the Service
            # endpoint set until infra recovers.
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5

Worker Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-worker
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-worker
  template:
    metadata:
      labels:
        app: aigw-worker
    spec:
      containers:
        - name: worker
          image: ai-gateway/worker:latest
          envFrom:
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi

Service and Ingress

apiVersion: v1
kind: Service
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  selector:
    app: aigw-server
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
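
The Ingress above serves plain HTTP. For public endpoints you will likely want TLS; the variant below is a sketch under assumptions, not the shipped ingress.yaml. The ingress class nginx and the TLS Secret name aigw-tls are hypothetical and depend on the ingress controller and certificate tooling in your cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  ingressClassName: nginx            # assumption: an NGINX ingress controller is installed
  tls:
    - hosts:
        - gateway.your-domain.com
      secretName: aigw-tls           # hypothetical kubernetes.io/tls Secret holding the certificate
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
```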

Horizontal Pod Autoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
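
If bursty traffic makes the replica count flap, autoscaling/v2 also accepts a behavior block. The sketch below keeps the same target and metric and adds a conservative scale-down policy; the window and step values are illustrative assumptions rather than settings taken from hpa.yaml.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Wait for 5 minutes of sustained low load before removing pods.
      stabilizationWindowSeconds: 300
      policies:
        # Remove at most one pod per minute (illustrative value).
        - type: Pods
          value: 1
          periodSeconds: 60
```

CPU utilization here is measured against the container's CPU request (500m for the server), so the HPA only works when requests are set.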

Resource Recommendations

| Component | CPU Request | CPU Limit | Memory Request | Memory Limit |
| --- | --- | --- | --- | --- |
| Server | 500m | 2 | 512Mi | 1Gi |
| Worker | 250m | 1 | 256Mi | 512Mi |
| Admin | 100m | 500m | 128Mi | 256Mi |
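
The server and worker rows match the resources: blocks in the Deployments shown earlier. The admin Deployment is not reproduced on this page, so as a sketch, the Admin row translates to the following container fragment:

```yaml
# Fragment of the admin container spec (values taken from the Admin row).
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```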

Infrastructure Dependencies

MongoDB and Redis can run in-cluster using the reference manifests, or you can point the gateway at managed services via MONGODB_URI / REDIS_HOST.

In-Cluster (Reference Manifests)

The repo ships:

  • k8s/mongo-statefulset.yaml — single-replica MongoDB StatefulSet with persistent volume claim and replica-set initialization. Suitable for small/medium deployments. For larger workloads, scale up the StatefulSet or swap in the MongoDB Community Operator.
  • k8s/redis-deployment.yaml — Redis Deployment configured with --maxmemory-policy noeviction (required by BullMQ — allkeys-lru causes silent job loss) and AOF persistence.
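
For reference, the Redis settings described above can be expressed as container arguments along the following lines. This is a sketch, not the contents of redis-deployment.yaml: the image tag and the use of --requirepass with REDIS_PASSWORD are assumptions.

```yaml
# Fragment of spec.template.spec in a Redis Deployment (illustrative only).
containers:
  - name: redis
    image: redis:7-alpine            # assumed image tag
    command: ["redis-server"]
    args:
      - "--maxmemory-policy"
      - "noeviction"                 # required by BullMQ
      - "--appendonly"
      - "yes"                        # AOF persistence
      - "--requirepass"
      - "$(REDIS_PASSWORD)"          # expanded by Kubernetes from the env entry below
    env:
      - name: REDIS_PASSWORD
        valueFrom:
          secretKeyRef:
            name: aigw-secrets
            key: REDIS_PASSWORD
    ports:
      - containerPort: 6379
```

You can confirm the active policy on a running instance with the Redis command CONFIG GET maxmemory-policy.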

Managed Services

Override MONGODB_URI / REDIS_HOST in aigw-secrets to point at:

  • MongoDB: Atlas, DocumentDB, or another managed replica-set service.
  • Redis: ElastiCache, Memorystore, or another managed Redis (verify the eviction policy is noeviction).

When using managed services, you can omit mongo-statefulset.yaml and redis-deployment.yaml from the apply set.
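
A sketch of the override, assuming a MongoDB Atlas cluster and a managed Redis endpoint. All values are placeholders, and the remaining keys from the Secret template (JWT, encryption, license, admin password) stay as they are.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aigw-secrets
  namespace: ai-gateway
type: Opaque
stringData:
  # Atlas-style SRV connection string (placeholder credentials and host).
  MONGODB_URI: "mongodb+srv://<user>:<password>@<cluster-host>/ai-gateway?retryWrites=true&w=majority"
  # Hostname of the managed Redis instance (placeholder).
  REDIS_HOST: "<managed-redis-endpoint>"
  REDIS_PASSWORD: "<redis-password>"
  # ...remaining keys from the Secret template unchanged...
```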