Architecture Overview

Gatewyse is an enterprise, multi-tenant AI gateway that sits between your applications and AI providers. It provides unified API access, intelligent routing, security guards, budget enforcement, and observability across 38 AI providers.

Monorepo Structure

ai-gateway/
+-- packages/
|   +-- shared/       # Types, Zod schemas, utilities (consumed by all packages)
|   +-- license/      # Ed25519 license JWT verifier (EE distribution)
|   +-- server/       # Express 5 API server (gateway + admin API)
|   +-- worker/       # BullMQ background job processor
|   +-- admin/        # Nuxt 4 SPA dashboard (PrimeVue 4)
|   |   +-- e2e/      # Playwright E2E tests for the admin dashboard (~174 tests)
|   +-- content/      # Shared content assets
|   +-- docs/         # Astro Starlight documentation site
|   +-- website/      # Astro marketing site
+-- docker/           # Docker Compose / Swarm stacks, Dockerfiles, nginx
+-- k8s/              # Kubernetes manifests
+-- tests/
|   +-- unit/         # Jest unit tests (~854 tests)
|   +-- integration/  # Jest integration tests with Docker MongoDB/Redis (46 tests)
|   +-- smoke/        # Live server smoke tests (6 phases)

Key Technologies

Component	Technology
Runtime	Node.js 24+
API Server	Express 5
Admin Dashboard	Nuxt 4, PrimeVue 4, Pinia
Database	MongoDB (Mongoose ODM)
Cache / Queue	Redis (ioredis)
Job Queue	BullMQ
Package Manager	pnpm 10+ (workspaces)
Auth	JWT + RBAC + SSO (OIDC/SAML)

Request Lifecycle

Every gateway request follows this pipeline:

Client Request
      |
      v
[1] Authentication        -- Validate API key or JWT
      |
      v
[2] Tenant Resolver       -- Hydrate req.tenant for downstream middleware
      |
      v
[3] Capability Guard      -- Enforce per-key capability scoping
      |
      v
[4] Rate Limiting         -- Per-API-key and per-tenant sliding window
      |
      v
[5] Validation            -- Zod schema check on request body
      |
      v
[6] Format Detection      -- Identify OpenAI vs Anthropic request shape
      |
      v
[7] Normalizer            -- Translate to UnifiedRequest internal format
      |
      v
[8] RAG Injection         -- Optionally inject retrieved context
      |
      v
[9] Prompt Guards         -- PII detection, injection defense, content filter,
      |                       toxicity scoring, token/cost limits, custom rules
      |  (blocked? return error)
      v
[10] Budget Check         -- Verify spending limits (tenant/org/dept/user)
      |
      v
[11] Semantic Cache       -- Lookup similar responses; short-circuit on hit
      |  (cache hit? return cached response)
      v
[12] Usage Tracking       -- Wire post-response tracking + audit logging
      |
      v
[13] Routing              -- Resolve provider chain using configured strategy
      |                       (one of 10 strategies, see Routing Strategies)
      v
[14] Provider Execution   -- Adapter translates to provider format, sends request
      |                       with retry/backoff on transient failures
      |  (failure? try next provider in chain)
      v
[15] Response Formatting  -- Adapter translates provider response to unified format
      |
      v
[16] Cache Write          -- Store response in semantic cache
      |
      v
[17] Usage + Audit        -- Record tokens, cost, latency + immutable audit log
      |
      v
Client Response

The order above matches packages/server/src/routes/gateway/chat.routes.ts. Other gateway routes use the same pipeline with their own validation schema and capability scope.
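
The pipeline has two early exits: a prompt guard can block the request, and a semantic cache hit can return before any provider is called. The following is a minimal sketch of that control flow, not the code in chat.routes.ts. The names Ctx, Stage, and runPipeline are hypothetical, and an exact-match Map stands in for the similarity lookup a semantic cache would perform.

```typescript
// Illustrative only: Ctx, Stage, and runPipeline are hypothetical names,
// not identifiers from the Gatewyse codebase.
type Ctx = { prompt: string; trace: string[] };

type StageResult =
  | { kind: "next" } // hand off to the following stage
  | { kind: "respond"; status: number; payload: string }; // stop here

type Stage = { name: string; run: (ctx: Ctx) => Promise<StageResult> };

// Run stages in order; the first stage that responds ends the request.
async function runPipeline(
  stages: Stage[],
  ctx: Ctx,
): Promise<{ status: number; payload: string }> {
  for (const stage of stages) {
    ctx.trace.push(stage.name);
    const result = await stage.run(ctx);
    if (result.kind === "respond") {
      return { status: result.status, payload: result.payload };
    }
  }
  return { status: 500, payload: "pipeline ended without a response" };
}

// Toy stand-ins for steps 9, 11, and 14 of the lifecycle.
// Assumption: an exact-match Map replaces the real similarity lookup.
const cache = new Map<string, string>([["ping", "pong (cached)"]]);

const stages: Stage[] = [
  {
    name: "promptGuards",
    run: async (ctx) =>
      ctx.prompt.includes("ignore previous instructions")
        ? { kind: "respond", status: 400, payload: "blocked by guard" }
        : { kind: "next" },
  },
  {
    name: "semanticCache",
    run: async (ctx) => {
      const hit = cache.get(ctx.prompt);
      return hit !== undefined
        ? { kind: "respond", status: 200, payload: hit }
        : { kind: "next" };
    },
  },
  {
    name: "providerExecution",
    run: async (ctx) => ({
      kind: "respond",
      status: 200,
      payload: `provider answer for: ${ctx.prompt}`,
    }),
  },
];
```

Calling runPipeline(stages, { prompt: "ping", trace: [] }) stops at the cache stage, so the provider stage never runs. The trace array records which stages were visited, which makes the short-circuit easy to observe.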

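Steps 13 and 14 describe a provider chain: each provider is retried with backoff on transient failures, and the next provider in the chain is tried once the current one is exhausted. The sketch below shows one way to express that behavior. The Provider shape, the transient flag on errors, and the attempt and delay defaults are all assumptions made for illustration, not the adapter interface Gatewyse uses.

```typescript
// Hypothetical shapes; not the adapter interface Gatewyse actually uses.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

// Assumption: transient failures are marked with a `transient` flag.
function transientError(message: string): Error {
  return Object.assign(new Error(message), { transient: true });
}

function isTransient(err: unknown): boolean {
  return err instanceof Error && (err as { transient?: boolean }).transient === true;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a single provider on transient errors with exponential backoff.
async function callWithRetry(
  provider: Provider,
  prompt: string,
  maxAttempts: number,
  baseDelayMs: number,
): Promise<string> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await provider.call(prompt);
    } catch (err) {
      // Give up on non-transient errors or once attempts are used up.
      if (!isTransient(err) || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Walk the chain in order; move on only after a provider is exhausted.
async function executeChain(
  chain: Provider[],
  prompt: string,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      const text = await callWithRetry(provider, prompt, maxAttempts, baseDelayMs);
      return { provider: provider.name, text };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

Keeping retry and failover in separate functions means a non-transient error, such as a rejected credential, skips straight to the next provider instead of waiting through backoff delays.
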
Data Flow Between Services

+------------+       +----------+       +-----------+
|   Client   | ----> |  Server  | ----> | Provider  |
| (your app) | <---- | (Express)| <---- | (OpenAI,  |
+------------+       +----+-----+       | Anthropic,|
                          |             | etc.)     |
                          v             +-----------+
                    +-----+------+
                    |  MongoDB   |  Configs, users, tenants,
                    |            |  audit logs, usage records
                    +-----+------+
                          |
                    +-----+------+
                    |   Redis    |  Cache, rate limits, routing
                    |            |  counters, session state, queues
                    +-----+------+
                          |
                    +-----+------+
                    |   Worker   |  Budget resets, SIEM export,
                    | (BullMQ)   |  backup jobs, health checks
                    +------------+
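
The diagram places rate-limit state in Redis, and step 4 of the request lifecycle applies a sliding window per API key and per tenant. The sketch below illustrates the sliding-window idea with an in-memory log of timestamps. It is a simplified stand-in: the class name, the limits, and the in-memory Map are assumptions, and how Gatewyse lays this state out in Redis is not described here.

```typescript
// In-memory sliding-window log; a hypothetical stand-in for Redis-backed state.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Drop timestamps that left the window, then report whether room remains.
  hasRoom(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    this.hits.set(key, recent);
    return recent.length < this.limit;
  }

  // Record one accepted request for `key` at time `now`.
  record(key: string, now: number): void {
    const recent = this.hits.get(key) ?? [];
    recent.push(now);
    this.hits.set(key, recent);
  }
}

// Assumed limits for the example: 2 requests per key and 3 per tenant,
// each within a 1000 ms window.
const perKey = new SlidingWindowLimiter(2, 1000);
const perTenant = new SlidingWindowLimiter(3, 1000);

// A request must fit in both windows; hits are recorded only when it does.
function checkRateLimit(
  apiKey: string,
  tenantId: string,
  now: number = Date.now(),
): boolean {
  const keyId = `key:${apiKey}`;
  const tenantKey = `tenant:${tenantId}`;
  if (!perKey.hasRoom(keyId, now) || !perTenant.hasRoom(tenantKey, now)) {
    return false;
  }
  perKey.record(keyId, now);
  perTenant.record(tenantKey, now);
  return true;
}
```

Checking both windows before recording either keeps a rejected request from consuming quota. A log of timestamps is exact but stores one entry per request; a counter-based approximation trades accuracy for memory.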

Admin Dashboard

The admin dashboard is a Nuxt 4 single-page application (SPA mode, ssr: false) that communicates with the server’s /api/admin/* endpoints. It provides management interfaces for:

Tenants, users, and RBAC role management
Provider and model configuration
Routing rule management
Guard configuration
Budget creation and monitoring
Usage analytics and audit logs
Semantic cache management
System settings and backups

API Surface

The server exposes two groups of endpoints:

Gateway API (/v1/*) — OpenAI-compatible endpoints consumed by applications:

/v1/chat/completions — Chat completions (streaming and non-streaming)
/v1/completions — Text completions
/v1/embeddings — Vector embeddings
/v1/audio/transcriptions, /v1/audio/translations, /v1/audio/speech — Audio
/v1/images/generations — Image generation
/v1/rerank — Document reranking
/v1/video/generations — Video generation
/v1/models — List available models
/v1/usage, /v1/budget — API key self-service
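
Because the gateway endpoints are OpenAI-compatible, an application can send a standard chat payload to /v1/chat/completions. The sketch below builds and sends such a request with fetch. The base URL, API key, and model name are placeholders, and the Bearer authorization header is an assumption based on the OpenAI convention; check your deployment for the exact authentication scheme.

```typescript
// Hypothetical helper names; only the /v1/chat/completions path comes from
// the endpoint list above.
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Build the URL and fetch options without sending anything.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
) {
  return {
    url: new URL("/v1/chat/completions", baseUrl).toString(),
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumed OpenAI-style auth header
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Send a non-streaming chat completion and return the parsed JSON body.
async function chat(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): Promise<unknown> {
  const { url, init } = buildChatRequest(baseUrl, apiKey, model, messages);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`gateway returned HTTP ${res.status}`);
  }
  return res.json();
}
```

A call might look like chat("https://gateway.example.com", "YOUR_API_KEY", "example-model", [{ role: "user", content: "Hello" }]), where every argument is a placeholder. Separating request construction from sending makes the payload easy to inspect in tests.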

Admin API (/api/admin/*) — Dashboard management endpoints for tenants, users, providers, models, routing, guards, budgets, usage, audit logs, cache, settings, documents, backups, and model intelligence.

Next Steps

Routing Strategies — Deep dive into the ten routing algorithms
Provider Adapters — How adapters translate between the unified API and 38 providers
Contributing — Development setup and testing