api-gateway (Kong) — Deployment Topology

Status: populated
Owner: TBD (Platform / SRE)
Last updated: 2026-04-17
Companion: SERVICE_OVERVIEW · DATA_MODEL · Service Template

1. Runtime

Property	Value
Product	Kong Gateway (edition: OSS or Enterprise — see SERVICE_OVERVIEW open questions)
Version	Pinned; upgrades follow the Kong LTS track
Mode	DB-less (preferred); DB mode is a fallback
Container image	Official `kong:<version>-alpine` (SBOM scanned in CI)
Platform	Kubernetes 1.29+
Workload kind	`Deployment`
Replicas	2 (staging) / 2–6 (prod with HPA)
CPU / mem request	500m / 1 GiB per pod
CPU / mem limit	2000m / 2 GiB per pod
HPA	CPU 70 % target; scale 2–6
Pod disruption budget	minAvailable = 2
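
The interaction between the HPA bounds and the disruption budget in the table can be sketched with the standard Kubernetes rules. This is a rough illustration rather than the cluster's actual controller logic: the 0.1 tolerance is the Kubernetes default and is assumed to be unchanged here, and CPU utilisation is taken relative to the 500m request.

```python
import math

MIN_REPLICAS, MAX_REPLICAS = 2, 6   # HPA bounds from the table
TARGET_CPU_PCT = 70                 # HPA CPU target from the table
PDB_MIN_AVAILABLE = 2               # pod disruption budget from the table
HPA_TOLERANCE = 0.1                 # Kubernetes default, assumed unchanged


def desired_replicas(current: int, cpu_pct: float) -> int:
    """Standard HPA rule: ceil(current * metric / target), clamped to the bounds."""
    ratio = cpu_pct / TARGET_CPU_PCT
    if abs(ratio - 1.0) <= HPA_TOLERANCE:
        return current  # within tolerance: no scaling action
    wanted = math.ceil(current * ratio)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, wanted))


def allowed_disruptions(healthy: int) -> int:
    """Voluntary evictions a minAvailable budget permits at this moment."""
    return max(0, healthy - PDB_MIN_AVAILABLE)
```

Under these rules, `allowed_disruptions(2)` is 0: while the workload sits at its two-replica floor, a node drain cannot evict a Kong pod until the replica count rises.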

DaemonSet layout is an alternative for dedicated ingress nodes; not chosen by default.

2. Topology diagram

3. Configuration delivery (GitOps)

An engineer opens a PR modifying `ops/kong/<env>.kong.yaml` in the application monorepo.
CI runs the contract-test matrix (see TESTING_STRATEGY §2).
On merge to main, CI runs `deck gateway sync` against staging.
On release tag, CI runs `deck gateway sync` against production, behind an approval gate.
A nightly `deck diff` job detects drift between the repository and the running configuration.
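
The steps above can be sketched as a mapping from CI event to deck invocation. This is an illustration, not the actual pipeline definition: the event names are hypothetical, the environment names `staging` and `prod` are taken from the Environments table, the nightly job is assumed to check both of them, and connection flags such as the Admin API address are omitted.

```python
from typing import List

CONFIG_PATH = "ops/kong/{env}.kong.yaml"  # path pattern from the first step


def deck_commands(event: str) -> List[List[str]]:
    """Map a CI event (hypothetical names) to the deck commands to run."""

    def cmd(action: str, env: str) -> List[str]:
        return ["deck", "gateway", action, CONFIG_PATH.format(env=env)]

    if event == "merge_to_main":
        return [cmd("sync", "staging")]
    if event == "release_tag":
        # The approval gate is assumed to have passed before this runs.
        return [cmd("sync", "prod")]
    if event == "nightly":
        # Assumption: drift is checked against both long-lived environments.
        return [cmd("diff", "staging"), cmd("diff", "prod")]
    raise ValueError(f"unknown CI event: {event!r}")
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without quoting concerns.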

In DB-less mode, Kong pods read their configuration from a ConfigMap mounted as the `declarative_config` file. The ConfigMap is written by deck; a config-watcher sidecar then sends SIGHUP so that Kong reloads. If SIGHUP reload is not configured, the change is applied with a rolling restart instead.
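
One way such a watcher sidecar could detect a change is by hashing the mounted file on each poll, which also copes with the symlink swap Kubernetes uses when it updates a ConfigMap volume. The following is a minimal sketch under assumptions: the function names are hypothetical, the sidecar is assumed to share a process namespace with the Kong container so that it can signal it, and discovery of the Kong master PID is left out.

```python
import hashlib
import os
import signal
from pathlib import Path
from typing import Callable, Optional


def file_digest(path: Path) -> str:
    """SHA-256 of the file content; follows the ConfigMap symlink."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_and_reload(path: Path, last_digest: Optional[str],
                     reload: Callable[[], None]) -> str:
    """Call reload() if the content changed since last_digest; return the new digest."""
    digest = file_digest(path)
    if last_digest is not None and digest != last_digest:
        reload()
    return digest


def sighup(pid: int) -> Callable[[], None]:
    """Build a reload action that sends SIGHUP to the given process."""
    return lambda: os.kill(pid, signal.SIGHUP)
```

A polling loop would call `check_and_reload` at a fixed interval and carry the returned digest into the next iteration; the first call only records a baseline and never triggers a reload.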

4. Cloudflare upstream

Cloudflare sits in front as WAF + DDoS + CDN.
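Request path: Cloudflare → Kubernetes `LoadBalancer` Service → Kong pods.
TLS: Cloudflare terminates public TLS with the public cert. The Cloudflare-to-origin connection terminates at Kong using the origin cert, with Cloudflare Authenticated Origin Pulls enabled so that the origin can verify that requests come from Cloudflare.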
Cloudflare applies bot management, IP reputation, basic rate limiting (L7) on top of Kong's finer-grained limits.

5. Environments

Env	Domain	Replicas	Redis	DB mode
`dev`	n/a (docker compose)	1	local	DB-less
`staging`	`api.staging.ghasi.io`	2	shared staging Redis	DB-less
`prod`	`api.ghasi.io`	2–6 (HPA)	dedicated cluster	DB-less (see open Qs)
`dr`	`api.dr.ghasi.io`	2 hot-standby	regional replica	DB-less

6. Regions

Primary: single region (e.g. eu-west-1) — same region as sms-orchestrator and Redis.
DR: warm-standby region, Kong config replicated via Git. RTO 4 h, RPO 1 h (matches platform baseline, see 01 Enterprise Architecture §10).

7. Networking

Concern	Setting
Public ingress	Cloudflare → cloud LB (L4 or L7) → Kong Service
Kong Service type	`LoadBalancer` (public subnet) or `NodePort` behind an external LB
Admin API	Internal-only Service; NetworkPolicy restricts to SRE/CI pods
Upstream calls	Cluster DNS; mTLS preferred (via service mesh or direct)
Egress	Allow-list: `auth-service`, Redis, OTel collector, Loki

NetworkPolicy excerpts:

Deny all ingress to the Kong admin port except from pods labelled `role=sre-tooling`.
Allow egress from Kong pods only to cluster DNS, the named services, and the OTel/Loki endpoints.
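
The ingress rule could look roughly like the manifest built below, emitted as JSON (which `kubectl apply` accepts). This is a sketch under assumptions: the `app: kong` pod label, the policy name, and the `api-gateway` namespace are hypothetical, and the ports are Kong's defaults (8000/8443 for the proxy, 8001 for the Admin API), which the real deployment may override. The egress allow-list is not covered here.

```python
import json

KONG_LABELS = {"app": "kong"}   # hypothetical pod label
ADMIN_PORT = 8001               # Kong default Admin API port
PROXY_PORTS = [8000, 8443]      # Kong default proxy ports


def kong_ingress_policy(namespace: str = "api-gateway") -> dict:
    """Build a NetworkPolicy that leaves the proxy open and locks down the admin port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "kong-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": KONG_LABELS},
            "policyTypes": ["Ingress"],
            "ingress": [
                # Proxy ports: no "from" clause, so any source may connect.
                {"ports": [{"protocol": "TCP", "port": p} for p in PROXY_PORTS]},
                # Admin port: only role=sre-tooling pods in the same namespace.
                {
                    "from": [{"podSelector": {"matchLabels": {"role": "sre-tooling"}}}],
                    "ports": [{"protocol": "TCP", "port": ADMIN_PORT}],
                },
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(kong_ingress_policy(), indent=2))
```

Because a NetworkPolicy is an allow-list, selecting the Kong pods with an Ingress policy denies everything not listed, so the proxy ports have to be allowed explicitly alongside the admin rule.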

8. Dependencies at runtime

Dependency	Failure mode
`auth-service` (JWKS + key resolution)	Degrades JWT validation after cache TTL; custom plugin rejects new API keys
Redis	Rate-limit counters unavailable; behaviour follows the route policy (fail closed on writes, fail open on reads)
OTel collector	Traces dropped; alerts on sustained export failure
Loki	Log buffer fills then drops; not request-path critical
Prometheus	No impact on request path
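
The Redis row describes a split fail-open / fail-closed policy. A minimal sketch of that decision follows, assuming that "writes" means requests with a mutating HTTP method and that a Redis outage surfaces as a `ConnectionError`; both are illustrative assumptions and not confirmed by the route policy itself.

```python
from typing import Callable

# ASSUMPTION: "writes" is interpreted as mutating HTTP methods.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def allow_request(method: str, under_limit: Callable[[], bool]) -> bool:
    """Return True if the request may proceed.

    under_limit() performs the counter check against Redis and may raise
    ConnectionError when Redis is unreachable.
    """
    try:
        return under_limit()
    except ConnectionError:
        # Redis is down: fail closed for writes, fail open for reads.
        return method.upper() not in WRITE_METHODS
```

With Redis unreachable, a `GET` is let through unmetered while a `POST` is rejected, which trades some read-side protection for availability.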

9. Upgrade strategy

Config changes: live reload via SIGHUP where configured; otherwise a rolling pod restart.
Kong version bumps: blue/green. Stand up a new ReplicaSet with the new image, flip Service selector after smoke test, keep old ReplicaSet for 1 h.
Custom plugin releases: ship as part of the Kong image (build-time install). Roll with blue/green.
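
The blue/green cutover can be sketched as a pure function over the Service selector. The `track` label and its `blue`/`green` values are hypothetical names used only for illustration; the one-hour retention comes from the text above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

KEEP_OLD_FOR = timedelta(hours=1)  # retention window for the old ReplicaSet


def cutover(selector: dict, new_track: str, smoke_ok: bool,
            now: datetime) -> Tuple[dict, Optional[datetime]]:
    """Return the selector to apply and the time the old track may be removed."""
    if not smoke_ok:
        return selector, None  # smoke test failed: keep serving the old track
    flipped = {**selector, "track": new_track}
    return flipped, now + KEEP_OLD_FOR


if __name__ == "__main__":
    start = datetime.now(timezone.utc)
    print(cutover({"app": "kong", "track": "blue"}, "green", True, start))
```

Rolling back within the retention window is the same operation in reverse: apply the previous selector again while the old ReplicaSet still exists.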

10. Resource sizing (reference)

For an SMS peak of ~5 000 req/s at the edge with p95 Kong latency < 150 ms:

4 Kong pods @ 1 vCPU each, ~60 % CPU headroom.
Redis sustained throughput ~30 k ops/s (rate limit INCR + EXPIRE). 3-node cluster, 4 GiB each.

Final numbers ratified at load test (see TESTING_STRATEGY §7).
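
The reference figures can be cross-checked with simple arithmetic. The number of rate-limit counters touched per request is not stated in this document; the value of 3 below is an assumption chosen only because it reconciles the 5 000 req/s peak with the ~30 k ops/s Redis figure.

```python
PEAK_RPS = 5_000          # edge peak from the text
KONG_PODS = 4             # reference pod count from the text
OPS_PER_COUNTER = 2       # INCR + EXPIRE, as listed in the text
COUNTERS_PER_REQUEST = 3  # ASSUMPTION: not stated in the document

rps_per_pod = PEAK_RPS / KONG_PODS
redis_ops = PEAK_RPS * OPS_PER_COUNTER * COUNTERS_PER_REQUEST

print(f"per-pod load: {rps_per_pod:.0f} req/s")  # 1250 req/s
print(f"Redis load:   {redis_ops} ops/s")        # 30000 ops/s
```

If the load test shows a different number of counters per request, the Redis estimate scales linearly with it.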

11. Open questions

DaemonSet vs Deployment on dedicated ingress nodes — revisit if ingress traffic grows significantly.
Service mesh (Linkerd / Istio) for Kong ↔ upstream mTLS vs direct TLS.
HPA metric — CPU only or custom metric (e.g. kong_http_requests_total rate).
