AI Gateway Service — Deployment Topology
Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · 17 technology-stack
1. Runtime
| Item | Detail |
|---|---|
| Runtime | Node.js 22 LTS |
| Framework | NestJS 11 (Fastify) |
| Language | TypeScript 5.x strict |
| ORM | Drizzle ORM (Postgres 16) |
| Messaging | NATS JetStream 2.10+ |
| Cache / quota | Redis 7+ (cluster in prod) |
| Gateway | Kong (edge) with /api/v1/ai/* route |
2. Replica & scaling
| Env | Replicas | CPU | Memory | HPA |
|---|---|---|---|---|
| dev | 1 | 500m | 512Mi | off |
| staging | 2 | 1 vCPU | 1Gi | on (CPU 70%) |
| prod | 4–12 | 2 vCPU | 2Gi | on (RPS + queue depth) |
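The prod row scales on two metrics at once. A minimal sketch of how the replica count could be derived, assuming the standard Kubernetes HPA rule (scale each metric proportionally, take the largest proposal, clamp to the replica bounds). The per-pod target values are not given in this document and are supplied by the caller here; HPA tolerance and stabilization windows are left out.

```typescript
// Sketch only: per-pod targets are hypothetical inputs, not values from this doc.
interface MetricSample {
  name: string;    // e.g. "rps" or "queue_depth"
  current: number; // observed average per pod
  target: number;  // desired average per pod (assumed by the caller)
}

const PROD_MIN_REPLICAS = 4;  // lower bound from the prod row
const PROD_MAX_REPLICAS = 12; // upper bound from the prod row

function desiredReplicas(currentReplicas: number, metrics: MetricSample[]): number {
  // Each metric proposes ceil(currentReplicas * current / target).
  const proposals = metrics.map((m) =>
    Math.ceil(currentReplicas * (m.current / m.target)),
  );
  // The largest proposal wins; with no metrics we fall back to the minimum.
  const wanted = proposals.length > 0 ? Math.max(...proposals) : PROD_MIN_REPLICAS;
  // Clamp to the 4-12 range.
  return Math.min(PROD_MAX_REPLICAS, Math.max(PROD_MIN_REPLICAS, wanted));
}
```

Taking the maximum across metrics means a deep queue can trigger scale-out even when RPS looks healthy.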
The HITL reviewer API runs as a separate deployment built from the same image and selected by a role flag; the isolation keeps admin UIs from contending with assist traffic. Provider adapters run in-process; heavy on-prem provider clients may later be extracted into sidecars.
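One way the role flag could work is sketched below. The variable name `SERVICE_ROLE`, its two values, and the module names are all hypothetical; this document does not specify how the flag is passed or what it enables.

```typescript
// Hypothetical role flag: SERVICE_ROLE and the module names are assumptions.
type ServiceRole = "assist" | "hitl-reviewer";

function resolveRole(env: Record<string, string | undefined>): ServiceRole {
  // Default to the assist role when the flag is absent (assumed default).
  const raw = env["SERVICE_ROLE"] ?? "assist";
  if (raw === "assist" || raw === "hitl-reviewer") {
    return raw;
  }
  // Fail fast on typos so a pod never starts with an ambiguous role.
  throw new Error(`Unknown SERVICE_ROLE: ${raw}`);
}

function modulesFor(role: ServiceRole): string[] {
  // Same image, different module set per deployment.
  return role === "hitl-reviewer" ? ["HitlReviewModule"] : ["AssistModule"];
}
```

Failing on an unknown value, instead of silently falling back, keeps a misconfigured reviewer deployment from serving assist traffic.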
3. Regions
- af-kbl-1 (on-prem, Afghanistan): mandatory for AF tenants.
- eu-fra-1, me-uae-1: cloud regions for non-AF tenants.
- Data residency is enforced by the routing rule; AF tenants require an on-prem provider.
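The residency rule above can be sketched as a small lookup. This assumes tenants carry a two-letter country code, which is not stated here, and it leaves the choice between the two cloud regions to the caller because no selection rule is given.

```typescript
// Region names come from this section; the tenantCountry input is an assumption.
type Region = "af-kbl-1" | "eu-fra-1" | "me-uae-1";

interface RoutingDecision {
  allowedRegions: Region[];
  onPremProviderRequired: boolean;
}

function routeForTenant(tenantCountry: string): RoutingDecision {
  if (tenantCountry.toUpperCase() === "AF") {
    // AF tenants are pinned to the on-prem region and an on-prem provider.
    return { allowedRegions: ["af-kbl-1"], onPremProviderRequired: true };
  }
  // Non-AF tenants may use either cloud region.
  return { allowedRegions: ["eu-fra-1", "me-uae-1"], onPremProviderRequired: false };
}
```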
4. Dependencies
| Dependency | Required |
|---|---|
| identity-service (access-policy) | Yes |
| config-service | Yes |
| audit-service | Yes (async; fire-and-forget safe) |
| communication-service | Yes (for HITL notifications) |
| Postgres 16 | Yes |
| NATS JetStream | Yes |
| Redis 7+ | Yes |
| External provider (per routing rule) | At least one live |
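A readiness check that follows this table might look like the sketch below. The snapshot shape and the dependency keys are hypothetical. It also assumes that audit-service, being async and fire-and-forget safe, does not gate readiness; that is an interpretation of the table and not something it states.

```typescript
// Hypothetical health snapshot; keys mirror the dependency table.
interface HealthSnapshot {
  dependencies: Record<string, boolean>; // dependency name -> reachable
  providers: Record<string, boolean>;    // provider name -> live
}

// Assumption: audit-service is excluded because its calls are fire-and-forget.
const BLOCKING_DEPENDENCIES: string[] = [
  "identity-service",
  "config-service",
  "communication-service",
  "postgres",
  "nats-jetstream",
  "redis",
];

function isReady(snapshot: HealthSnapshot): boolean {
  // Every blocking dependency must be reachable.
  const depsUp = BLOCKING_DEPENDENCIES.every(
    (name) => snapshot.dependencies[name] === true,
  );
  // "At least one live": a single healthy provider is enough.
  const anyProviderLive = Object.keys(snapshot.providers).some(
    (name) => snapshot.providers[name] === true,
  );
  return depsUp && anyProviderLive;
}
```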
5. Config
| Env var | Default | Purpose |
|---|---|---|
| PORT | 3040 | HTTP listen |
| DATABASE_URL | (none) | Postgres |
| NATS_URL | (none) | JetStream |
| REDIS_URL | (none) | Quota + cache |
| ACCESS_POLICY_URL | (none) | Access policy internal URL |
| CONFIG_SERVICE_URL | (none) | Config resolver |
| AUDIT_SERVICE_URL | (none) | Audit ingest hint |
| AI_POLICY_TIMEOUT_MS | 3000 | Policy deadline |
| AI_MODERATION_TIMEOUT_MS | 1500 | Per classifier |
| AI_PROVIDER_TIMEOUT_MS | 30000 | Per provider call |
| AI_DEFAULT_QUOTA_PER_MINUTE | 120 | Fallback tenant quota |
| OTEL_EXPORTER_OTLP_ENDPOINT | (none) | Telemetry |
| PROVIDER_*_KEY | (none) | Secrets sourced from KMS/Vault |
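A minimal sketch of loading a subset of these variables with the defaults from the table. The `GatewayConfig` shape and helper names are hypothetical, and treating the three connection URLs as required at startup is an assumption; the table only says they have no default.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape covering a subset of the table.
interface GatewayConfig {
  port: number;
  databaseUrl: string;
  natsUrl: string;
  redisUrl: string;
  policyTimeoutMs: number;
  moderationTimeoutMs: number;
  providerTimeoutMs: number;
  defaultQuotaPerMinute: number;
}

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required env var ${name}`);
  }
  return value;
}

function intVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

function loadConfig(env: Env): GatewayConfig {
  return {
    port: intVar(env, "PORT", 3040),
    databaseUrl: requireVar(env, "DATABASE_URL"), // assumed required
    natsUrl: requireVar(env, "NATS_URL"),         // assumed required
    redisUrl: requireVar(env, "REDIS_URL"),       // assumed required
    policyTimeoutMs: intVar(env, "AI_POLICY_TIMEOUT_MS", 3000),
    moderationTimeoutMs: intVar(env, "AI_MODERATION_TIMEOUT_MS", 1500),
    providerTimeoutMs: intVar(env, "AI_PROVIDER_TIMEOUT_MS", 30000),
    defaultQuotaPerMinute: intVar(env, "AI_DEFAULT_QUOTA_PER_MINUTE", 120),
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`; passing the environment in explicitly keeps the loader easy to test.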
6. Canary & rollback
- Route 5% of traffic to the canary for 30 min; promote when SLOs are green.
- Kong stable/canary routes support shadow traffic for provider comparison.
- Full rollback redeploys the prior image with its config snapshot; patch releases need no schema change.
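The promotion gate can be sketched as a pure decision function. The 30-minute soak comes from the list above; rolling back as soon as SLOs turn red is an assumption, since the list only states the promote condition.

```typescript
type CanaryAction = "hold" | "promote" | "rollback";

const CANARY_SOAK_MINUTES = 30; // from the canary rule above

function canaryAction(elapsedMinutes: number, sloGreen: boolean): CanaryAction {
  if (!sloGreen) {
    // Assumption: a red SLO aborts the canary immediately.
    return "rollback";
  }
  // Green SLOs: promote only once the full soak window has elapsed.
  return elapsedMinutes >= CANARY_SOAK_MINUTES ? "promote" : "hold";
}
```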