
api-gateway (Kong) — Service Overview

Status: populated
Owner: TBD (Platform / SRE)
Last updated: 2026-04-17
Companion: ADR-0001 Kong edge gateway · 01 Enterprise Architecture · Service Template

1. Purpose

This folder documents the Kong Gateway deployment that fronts the Ghasi-SMS-Gateway platform. Per ADR-0001, the previously-planned custom NestJS api-gateway service is RETIRED. Its edge responsibilities — TLS termination, authentication, rate limiting, correlation, logging — are now performed by Kong; its pre-edge application concerns (payload validation, idempotency, NATS publish) moved to sms-orchestrator.

The services/api-gateway/ directory is therefore not a deployable NestJS service. It is the source-of-truth documentation for:

  • Kong Services and Routes (the public API surface).
  • Kong plugins enabled per route (auth, rate limiting, correlation, logging, OTel).
  • Any custom Kong plugins Ghasi authors (implementation lives in the application monorepo under ops/kong/plugins/; design and contract live here).
  • Operational posture for Kong in staging and production (topology, observability, runbook entry points).

2. Why Kong replaces the custom api-gateway

| Concern | Custom NestJS gateway (retired) | Kong (adopted) |
|---|---|---|
| TLS termination | NestJS + cert-manager | Cloudflare + Kong, battle-tested |
| JWT validation | Hand-rolled Firebase verify | jwt plugin + JWKS from auth-service |
| API key auth | Hand-rolled hash + DB lookup | key-auth plugin (+ custom plugin to resolve consumer from auth-service) |
| Rate limiting | ioredis + Lua-ish counters | rate-limiting / rate-limiting-advanced (Redis cluster backend) |
| Request size limits | NestJS body parser config | request-size-limiting plugin |
| IP allow/deny | Hand-rolled guard | ip-restriction plugin |
| Correlation IDs, OTel spans | Custom interceptor | correlation-id + opentelemetry plugins |
| Observability | Custom Prom registry | Kong Prometheus plugin + http-log to Loki |
| Operational cost | Maintain a whole NestJS service | One declarative decK YAML, Helm values |

Rationale: telecom-grade SMS traffic needs first-class per-key / per-account / per-operator rate limits and burst protection. Kong's mature plugin set, declarative config, and operational tooling beat a bespoke NestJS edge for correctness, security posture, and maintenance cost.
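To make the layered limits in the rationale concrete, here is a minimal Python sketch of a fixed-window counter that is checked at several scopes at once (for example API key and account). It is an illustration only and not how Kong's rate-limiting plugins are implemented: the in-memory dictionary stands in for the Redis backend, and the class name, the limits, and the key layout after the `kong:rl:` prefix are all assumptions made for this example.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Fixed-window counter; a dict stands in for the Redis backend."""

    def __init__(self, window_seconds: int = 60,
                 clock: Callable[[], float] = time.time) -> None:
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters: Dict[str, int] = defaultdict(int)

    def _key(self, scope: str, identifier: str) -> str:
        # Hypothetical key layout: only the "kong:rl:" prefix comes from this document.
        window = int(self.clock() // self.window_seconds)
        return f"kong:rl:{scope}:{identifier}:{window}"

    def allow(self, limits: Dict[str, Tuple[str, int]]) -> bool:
        """limits maps scope -> (identifier, max requests per window)."""
        keys = {scope: self._key(scope, ident)
                for scope, (ident, _) in limits.items()}
        # Reject if any scope is already at its limit; nothing is counted on rejection.
        for scope, (_, maximum) in limits.items():
            if self.counters[keys[scope]] >= maximum:
                return False
        # Every scope has headroom, so count the request against each of them.
        for key in keys.values():
            self.counters[key] += 1
        return True


limiter = FixedWindowLimiter(clock=lambda: 0.0)
request_limits = {"api_key": ("key-123", 2), "account": ("acct-9", 3)}
results = [limiter.allow(request_limits) for _ in range(3)]
# results == [True, True, False]: the per-key limit of 2 trips before the account limit of 3.
```

A fixed window is the simplest scheme to reason about; it allows short bursts at window boundaries, which is one reason burst protection is usually configured as a separate, tighter limit.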

3. Route prefixes exposed by Kong

The table below is the authoritative route layout, copied from ADR-0001 §3. It is expanded per endpoint in API_CONTRACTS.

| Path prefix | Upstream service | Auth at edge |
|---|---|---|
| /v1/sms/send, /v1/sms/{id}, /v1/sms/bulk | sms-orchestrator | JWT or API key |
| /v1/dlr/* | dlr-processor | JWT |
| /v1/accounts/*, /v1/auth/*, /v1/api-keys/* | auth-service | JWT (most); /v1/auth/login public |
| /v1/billing/*, /v1/invoices/* | billing-service | JWT |
| /v1/analytics/*, /v1/reports/* | analytics-service | JWT |
| /v1/operators/* | operator-management-service | JWT + admin scope |
| /v1/webhooks/* | webhook-dispatcher | JWT |
| /admin/* | admin-dashboard BFF | JWT + admin scope + IP allow |
| /portal/* | customer-portal BFF | JWT |

Internal east-west traffic (service-to-service HTTP/gRPC, NATS) bypasses Kong. SMPP ingress terminates at smpp-connector, not Kong (ADR §7).
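The route layout above can be read as a prefix-to-upstream lookup. The following Python sketch models it with plain longest-prefix matching so the table can be sanity-checked in a test. It is not Kong's router, which has its own matching and priority rules; the `Route` type and `resolve` function are hypothetical names, and collapsing the three SMS paths into a single `/v1/sms/` prefix is a simplification of this sketch.

```python
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    prefix: str
    upstream: str
    auth: str


# Prefixes, upstreams, and auth labels are copied from the route table.
# The table marks auth-service routes as "JWT (most)"; only /v1/auth/login
# is named as public, so it gets its own, longer prefix here.
ROUTES: List[Route] = [
    Route("/v1/sms/", "sms-orchestrator", "JWT or API key"),
    Route("/v1/dlr/", "dlr-processor", "JWT"),
    Route("/v1/accounts/", "auth-service", "JWT"),
    Route("/v1/auth/", "auth-service", "JWT"),
    Route("/v1/auth/login", "auth-service", "public"),
    Route("/v1/api-keys/", "auth-service", "JWT"),
    Route("/v1/billing/", "billing-service", "JWT"),
    Route("/v1/invoices/", "billing-service", "JWT"),
    Route("/v1/analytics/", "analytics-service", "JWT"),
    Route("/v1/reports/", "analytics-service", "JWT"),
    Route("/v1/operators/", "operator-management-service", "JWT + admin scope"),
    Route("/v1/webhooks/", "webhook-dispatcher", "JWT"),
    Route("/admin/", "admin-dashboard BFF", "JWT + admin scope + IP allow"),
    Route("/portal/", "customer-portal BFF", "JWT"),
]


def resolve(path: str) -> Optional[Route]:
    """Return the route with the longest matching prefix, or None if nothing matches."""
    matches = [route for route in ROUTES if path.startswith(route.prefix)]
    return max(matches, key=lambda route: len(route.prefix), default=None)
```

With this model, `resolve("/v1/auth/login")` selects the public login route because its prefix is longer than `/v1/auth/`, and a path outside the table resolves to `None`, which mirrors the statement that internal traffic is not routed through Kong.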

4. Responsibilities owned by Kong

  • TLS 1.2+ termination downstream from Cloudflare (Cloudflare remains the WAF/DDoS layer).
  • Edge authentication — JWT validation against auth-service JWKS; API key validation via key-auth plugin; optional custom plugin to map API key → consumer using a cached auth-service lookup.
  • Rate limiting — per API key, per account, per operator, and global. Redis-backed (kong:rl:* namespace).
  • Request size limiting — default 64 KB for /v1/sms/send; larger per-route overrides for bulk endpoints.
  • IP allow/deny — partner integration allowlists, admin-path IP restrictions.
  • Correlation and tracing — inject X-Request-Id if missing, propagate traceparent, emit OTel spans.
  • Header forwarding: Authorization, X-Tenant-Id, X-Api-Key-Id, Idempotency-Key, Accept-Language, X-Forwarded-For.
  • Access logging — headers and metadata only; never SMS message bodies (PII/telecom data).
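Three of the responsibilities above (request size limiting, X-Request-Id injection, and body-free access logging) are simple enough to sketch in a few lines of Python. This is a behavioural illustration, not Kong plugin code: the function names are hypothetical, "64 KB" is assumed to mean 64 × 1024 bytes, and redacting the Authorization header in the log record is an assumption of this sketch rather than something the list states.

```python
import uuid
from typing import Any, Dict

DEFAULT_MAX_BODY_BYTES = 64 * 1024  # assumption: "64 KB" read as 64 * 1024 bytes


def within_size_limit(body: bytes, max_bytes: int = DEFAULT_MAX_BODY_BYTES) -> bool:
    """True if the request body fits the per-route size limit."""
    return len(body) <= max_bytes


def ensure_request_id(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of headers with X-Request-Id added only if it was missing."""
    result = dict(headers)
    if not any(name.lower() == "x-request-id" for name in result):
        result["X-Request-Id"] = str(uuid.uuid4())
    return result


def access_log_record(method: str, path: str, status: int,
                      headers: Dict[str, str], body: bytes) -> Dict[str, Any]:
    """Build a log record from headers and metadata; the body itself is never copied."""
    redacted = {"authorization"}  # assumption: credentials are masked in logs
    return {
        "method": method,
        "path": path,
        "status": status,
        "request_bytes": len(body),  # size only, never the content
        "headers": {
            name: ("[redacted]" if name.lower() in redacted else value)
            for name, value in headers.items()
        },
    }
```

Only the body length reaches the log record, so SMS text cannot leak through this path even if a caller passes the full payload.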

5. Responsibilities Kong does not own

| Concern | Owner | Why |
|---|---|---|
| Business authorization (account scope, per-resource RBAC) | auth-service + in-service guards | Kong does coarse gating only |
| Tenant isolation (account_id scoping, RLS) | Every service | Authoritative boundary is the DB |
| Idempotency storage and replay | sms-orchestrator | Requires business context (dedupe key scope = API key + endpoint) |
| Zod payload validation (phone E.164, content-type detection, operator lookup) | sms-orchestrator | Business rule correctness |
| Problem+json error shaping | Every upstream service | Consistent with OpenAPI |
| HMAC signing of outbound webhooks | webhook-dispatcher | Not a north-south concern |
| Domain events | Upstream services → NATS | Kong never emits domain events |
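The idempotency row is the clearest example of why some concerns stay behind the edge: the dedupe key is scoped to the API key and endpoint, which Kong forwards but does not interpret. A minimal Python sketch of that scoping follows. The class, its method names, and the in-memory dictionary are hypothetical stand-ins and do not describe how sms-orchestrator actually stores or replays responses.

```python
import hashlib
from typing import Any, Callable, Dict, Tuple


class IdempotencyStore:
    """In-memory stand-in for an idempotency store (single process, not thread-safe)."""

    def __init__(self) -> None:
        self._responses: Dict[str, Dict[str, Any]] = {}

    @staticmethod
    def scoped_key(api_key_id: str, endpoint: str, idempotency_key: str) -> str:
        # Scope = API key + endpoint, as stated in the table; the unit-separator
        # join avoids collisions between differently split inputs.
        raw = "\x1f".join([api_key_id, endpoint, idempotency_key])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def run_once(self, api_key_id: str, endpoint: str, idempotency_key: str,
                 handler: Callable[[], Dict[str, Any]]) -> Tuple[Dict[str, Any], bool]:
        """Return (response, replayed). The handler runs only on the first call per scope."""
        key = self.scoped_key(api_key_id, endpoint, idempotency_key)
        if key in self._responses:
            return self._responses[key], True
        response = handler()
        self._responses[key] = response
        return response, False
```

Because the API key identifier is part of the scope, two customers that happen to send the same Idempotency-Key value never see each other's stored response.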

6. Upstream dependencies

| Dependency | Pattern | Purpose |
|---|---|---|
| auth-service (JWKS endpoint + api_keys lookup) | Pull JWKS at startup + refresh; optional sidecar plugin HTTP call | Validate JWTs; resolve API keys to consumers |
| Redis cluster | kong:rl:* namespace | Rate-limiter counters |
| PostgreSQL (if DB mode) or decK YAML in Git (DB-less) | Config store | Route/plugin/consumer definitions |
| Cloudflare | Upstream CDN/WAF | Sends already-TLS-offloaded or TLS-passthrough traffic |
| OTel collector | Trace export | Spans to platform trace backend |
| Loki | http-log target | Access logs (headers only) |
| Prometheus | Metrics scrape | Kong built-in Prometheus plugin |
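The "pull JWKS at startup + refresh" pattern in the first row can be sketched as a small time-based cache. The Python below is an assumption-laden illustration, not Kong's or auth-service's implementation: `JwksCache`, the injected `fetch` callable, and the 300-second default TTL are hypothetical, and the only JWK field it relies on is the standard `kid` key identifier.

```python
import time
from typing import Any, Callable, Dict, List, Optional


class JwksCache:
    """Caches signing keys by `kid`; loads once at startup, then refreshes after a TTL."""

    def __init__(self, fetch: Callable[[], List[Dict[str, Any]]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # e.g. an HTTP GET against the JWKS endpoint
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Dict[str, Dict[str, Any]] = {}
        self._loaded_at = 0.0
        self.refresh()               # pull at startup

    def refresh(self) -> None:
        """Replace the cached key set with a fresh copy from the fetcher."""
        self._keys = {jwk["kid"]: jwk for jwk in self._fetch()}
        self._loaded_at = self._clock()

    def get(self, kid: str) -> Optional[Dict[str, Any]]:
        """Return the key for `kid`, refreshing first if the cache has expired."""
        if self._clock() - self._loaded_at >= self._ttl:
            self.refresh()
        return self._keys.get(kid)
```

The sketch refreshes only on expiry. Refreshing on an unknown `kid` would pick up rotated keys sooner, but it also lets a caller with forged tokens trigger a fetch on every request, so that variant usually needs its own rate limit.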

7. Architecture diagram

8. Key decisions

  1. Adopt Kong as the only north-south HTTP gateway. See ADR-0001.
  2. DB-less mode preferred for production; configuration is declarative YAML under ops/kong/ in the application monorepo, applied via decK in CI. DB mode is a fallback if Kong Enterprise features requiring the DB are adopted later.
  3. Authoritative auth stays in services. Kong performs coarse gating (JWT shape, API key existence, rate limit). Business authorization (account scope, resource RBAC, tenant isolation) remains in upstream services.
  4. SMPP is out of scope. MNO bindings terminate at smpp-connector; Kong does not proxy SMPP.
  5. No domain events. Kong emits access logs, metrics, and traces — no NATS events (ADR §4, EVENT_SCHEMAS).
  6. Custom plugins avoided where possible. Prefer configuration of built-in plugins; a custom plugin is only justified for API-key → consumer resolution against auth-service when the built-in key-auth + consumer bootstrap is insufficient.

9. Service readiness level

| Level | Description | Target |
|---|---|---|
| L1 | Kong deployed in staging, DB-less, one smoke route green | Slice 0 |
| L2 | All route prefixes wired, JWT + key-auth + rate limit + OTel enabled; Grafana dashboards live | Slice 1 |
| L3 | HA (2–6 replicas), decK pipeline with lint against upstream OpenAPI, alerts wired, runbooks signed | Slice 2 |
| L4 | Production cutover complete, custom NestJS api-gateway decommissioned, chaos-tested | Slice 3 |

10. Open questions

  • Kong edition: OSS vs Enterprise vs Konnect (SRE decision; see SERVICE_RISK_REGISTER).
  • Rate limiter backend: shared platform Redis vs Kong-dedicated Redis cluster.
  • DB-less vs DB: final call before production cutover.
  • Custom plugin for API-key → consumer resolution: built-in key-auth with consumer-sync job or a purpose-built plugin querying a cached auth-service endpoint.
  • Certificate rotation cadence and automation (cert-manager vs Cloudflare origin cert).