
AI Gateway Service — Deployment Topology

Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · 17 technology-stack

1. Runtime

| Item | Detail |
|---|---|
| Runtime | Node.js 22 LTS |
| Framework | NestJS 11 (Fastify) |
| Language | TypeScript 5.x strict |
| ORM | Drizzle ORM (Postgres 16) |
| Messaging | NATS JetStream 2.10+ |
| Cache / quota | Redis 7+ (cluster in prod) |
| Gateway | Kong (edge) with /api/v1/ai/* route |

2. Replica & scaling

| Env | Replicas | CPU | Memory | HPA |
|---|---|---|---|---|
| dev | 1 | 500m | 512Mi | off |
| staging | 2 | 1 vCPU | 1Gi | on (CPU 70%) |
| prod | 4–12 | 2 vCPU | 2Gi | HPA on RPS + queue depth |

The HITL (human-in-the-loop) reviewer API runs as a separate deployment built from the same image and selected by a role flag; it is isolated so that admin UIs do not contend with assist traffic. Provider adapters run in-process; heavy on-prem provider clients may later be extracted into sidecars.
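
To make the prod row concrete, the sketch below applies the standard Kubernetes HPA rule (scale by the ratio of observed to target metric, take the largest proposal across metrics, clamp to the replica bounds) to the 4–12 range. The metric names and target values are illustrative assumptions, not values from this service's manifests, and HPA's tolerance band and stabilization window are ignored.

```typescript
// Sketch only: metric names and targets are assumptions for illustration.
interface HpaMetricSample {
  name: string;    // e.g. "rps" or "queue_depth" (hypothetical labels)
  current: number; // observed average value per pod
  target: number;  // desired average value per pod
}

function desiredReplicas(
  currentReplicas: number,
  metrics: HpaMetricSample[],
  minReplicas = 4,  // prod lower bound from the table
  maxReplicas = 12, // prod upper bound from the table
): number {
  const clamp = (n: number) => Math.min(maxReplicas, Math.max(minReplicas, n));
  if (metrics.length === 0) return clamp(currentReplicas);

  // Each metric proposes ceil(current * observed / target);
  // the largest proposal wins, as in the HPA algorithm.
  const proposals = metrics.map((m) => {
    if (m.target <= 0) throw new Error(`target for ${m.name} must be > 0`);
    return Math.ceil(currentReplicas * (m.current / m.target));
  });
  return clamp(Math.max(...proposals));
}
```

With 4 replicas, an RPS reading of 150 against a target of 100 proposes 6 replicas, while a queue depth of 40 against a target of 50 proposes 4; the larger proposal is used, so the result is 6.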

3. Regions

  • af-kbl-1 (on-prem Afghanistan) — mandatory for AF tenants.
  • eu-fra-1, me-uae-1 — cloud regions for non-AF tenants.
  • Data residency is honoured by the routing rule; AF tenants must be served by an on-prem provider.
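
The residency rule above could be expressed roughly as follows. This is a minimal sketch: the `TenantResidency` shape, the `preferredRegion` field, and the `eu-fra-1` fallback are hypothetical and not taken from the actual routing-rule schema.

```typescript
// Sketch only: types, field names, and the fallback region are assumptions.
type GatewayRegion = "af-kbl-1" | "eu-fra-1" | "me-uae-1";
type CloudRegion = Exclude<GatewayRegion, "af-kbl-1">;

interface TenantResidency {
  tenantId: string;
  countryCode: string;             // ISO 3166-1 alpha-2, e.g. "AF"
  preferredRegion?: GatewayRegion; // hypothetical tenant preference
}

interface RegionDecision {
  region: GatewayRegion;
  requireOnPremProvider: boolean;
}

function resolveRegion(
  tenant: TenantResidency,
  fallback: CloudRegion = "eu-fra-1", // assumed default for non-AF tenants
): RegionDecision {
  // AF tenants are pinned to the on-prem region regardless of preference.
  if (tenant.countryCode.toUpperCase() === "AF") {
    return { region: "af-kbl-1", requireOnPremProvider: true };
  }
  // Non-AF tenants use a cloud region; a preference for the on-prem
  // region is ignored and the fallback is used instead.
  const preferred = tenant.preferredRegion;
  const region: CloudRegion =
    preferred === "eu-fra-1" || preferred === "me-uae-1" ? preferred : fallback;
  return { region, requireOnPremProvider: false };
}
```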

4. Dependencies

| Dependency | Required |
|---|---|
| identity-service (access-policy) | Yes |
| config-service | Yes |
| audit-service | Yes (async; fire-and-forget safe) |
| communication-service | Yes (for HITL notifications) |
| Postgres 16 | Yes |
| NATS JetStream | Yes |
| Redis 7+ | Yes |
| External provider (per routing rule) | At least one live |
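
One way to turn the dependency table into a readiness check is sketched below. It rests on an interpretation that should be confirmed: because audit-service is described as async and fire-and-forget safe, the sketch reports an audit outage as degraded rather than failing readiness. The dependency keys and the `evaluateReadiness` function are hypothetical names.

```typescript
// Sketch only: keys and the blocking/non-blocking split are assumptions.
type DepState = "up" | "down";

// Dependencies marked "Yes" in the table that the sketch treats as blocking.
const BLOCKING_DEPS = [
  "identity-service",
  "config-service",
  "communication-service",
  "postgres",
  "nats-jetstream",
  "redis",
] as const;

// Assumption: async, fire-and-forget emission means an outage degrades
// the service without blocking readiness.
const NON_BLOCKING_DEPS = ["audit-service"] as const;

interface ReadinessReport {
  ready: boolean;
  missing: string[];       // blocking dependencies that are not up
  degraded: string[];      // non-blocking dependencies that are not up
  liveProviders: string[]; // external providers currently up
}

function evaluateReadiness(
  deps: Record<string, DepState>,
  providers: Record<string, DepState>,
): ReadinessReport {
  const missing: string[] = BLOCKING_DEPS.filter((d) => deps[d] !== "up");
  const degraded: string[] = NON_BLOCKING_DEPS.filter((d) => deps[d] !== "up");
  const liveProviders = Object.keys(providers).filter(
    (p) => providers[p] === "up",
  );
  // "At least one live": no live provider is treated as a blocking gap.
  if (liveProviders.length === 0) missing.push("external-provider");
  return { ready: missing.length === 0, missing, degraded, liveProviders };
}
```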

5. Config

| Env var | Default | Purpose |
|---|---|---|
| PORT | 3040 | HTTP listen |
| DATABASE_URL | | Postgres |
| NATS_URL | | JetStream |
| REDIS_URL | | quota + cache |
| ACCESS_POLICY_URL | | Access policy internal URL |
| CONFIG_SERVICE_URL | | Config resolver |
| AUDIT_SERVICE_URL | | Audit ingest hint |
| AI_POLICY_TIMEOUT_MS | 3000 | Policy deadline |
| AI_MODERATION_TIMEOUT_MS | 1500 | Per classifier |
| AI_PROVIDER_TIMEOUT_MS | 30000 | Per provider call |
| AI_DEFAULT_QUOTA_PER_MINUTE | 120 | Fallback tenant quota |
| OTEL_EXPORTER_OTLP_ENDPOINT | | telemetry |
| PROVIDER_*_KEY | | secrets sourced from KMS/Vault |
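
A minimal sketch of how these variables might be loaded and validated at startup follows. The defaults come from the table; which variables are mandatory is an assumption (here, every URL without a default except AUDIT_SERVICE_URL and OTEL_EXPORTER_OTLP_ENDPOINT), and `loadGatewayConfig` and its field names are hypothetical. The environment is passed in as a plain object so the function can be called with `process.env` or with a test fixture.

```typescript
// Sketch only: required/optional split and all names are assumptions.
type EnvSource = Record<string, string | undefined>;

interface GatewayEnvConfig {
  port: number;
  databaseUrl: string;
  natsUrl: string;
  redisUrl: string;
  accessPolicyUrl: string;
  configServiceUrl: string;
  auditServiceUrl: string | undefined; // assumed optional ("hint")
  policyTimeoutMs: number;
  moderationTimeoutMs: number;
  providerTimeoutMs: number;
  defaultQuotaPerMinute: number;
  otlpEndpoint: string | undefined;    // assumed optional
  providerKeys: Record<string, string>;
}

function requireVar(env: EnvSource, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required env var ${name}`);
  return value;
}

function readPositiveInt(env: EnvSource, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

function loadGatewayConfig(env: EnvSource): GatewayEnvConfig {
  // Collect PROVIDER_<ID>_KEY entries, keyed by <ID>.
  const providerKeys: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    const id = /^PROVIDER_(.+)_KEY$/.exec(name)?.[1];
    const value = env[name];
    if (id && value) providerKeys[id] = value;
  }
  return {
    port: readPositiveInt(env, "PORT", 3040),
    databaseUrl: requireVar(env, "DATABASE_URL"),
    natsUrl: requireVar(env, "NATS_URL"),
    redisUrl: requireVar(env, "REDIS_URL"),
    accessPolicyUrl: requireVar(env, "ACCESS_POLICY_URL"),
    configServiceUrl: requireVar(env, "CONFIG_SERVICE_URL"),
    auditServiceUrl: env["AUDIT_SERVICE_URL"],
    policyTimeoutMs: readPositiveInt(env, "AI_POLICY_TIMEOUT_MS", 3000),
    moderationTimeoutMs: readPositiveInt(env, "AI_MODERATION_TIMEOUT_MS", 1500),
    providerTimeoutMs: readPositiveInt(env, "AI_PROVIDER_TIMEOUT_MS", 30000),
    defaultQuotaPerMinute: readPositiveInt(env, "AI_DEFAULT_QUOTA_PER_MINUTE", 120),
    otlpEndpoint: env["OTEL_EXPORTER_OTLP_ENDPOINT"],
    providerKeys,
  };
}
```

Failing fast on a missing or malformed value keeps a misconfigured pod from passing its first health check and then erroring on live traffic.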

6. Canary & rollback

  • 5% canary for 30 min; promote on SLO green.
  • Kong stable/canary routes support shadow traffic for provider comparison.
  • Full rollback uses prior image + config snapshot; no schema change needed for patch releases.
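
The promotion gate described above can be sketched as a small decision function. The 5% weight and 30-minute window come from this section; rolling back as soon as the SLO turns red is an assumption, as are the type and function names.

```typescript
// Sketch only: names and the rollback-on-red behaviour are assumptions.
const CANARY_TRAFFIC_PERCENT = 5;
const CANARY_SOAK_MINUTES = 30;

type CanaryAction = "hold" | "promote" | "rollback";

interface CanaryObservation {
  elapsedMinutes: number; // time since the canary started taking traffic
  sloGreen: boolean;      // true when all tracked SLOs are within budget
}

interface RollbackTarget {
  image: string;          // prior image reference
  configSnapshot: string; // identifier of the prior config snapshot
}

function decideCanary(
  obs: CanaryObservation,
  soakMinutes: number = CANARY_SOAK_MINUTES,
): CanaryAction {
  // Assumption: a red SLO at any point aborts the canary.
  if (!obs.sloGreen) return "rollback";
  // Promote only after the full soak window has passed with SLOs green.
  return obs.elapsedMinutes >= soakMinutes ? "promote" : "hold";
}

// A full rollback restores both the image and the config snapshot;
// patch releases need no schema change, so no migration step appears here.
function describeRollback(target: RollbackTarget): string {
  return `deploy ${target.image} with config ${target.configSnapshot}`;
}
```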