ADR 0001: Kong as north-south API gateway (Ghasi-SMS-Gateway)

Status

Accepted — 2026-04-17.

Context

Ghasi-SMS-Gateway exposes NestJS microservices behind REST (SMS submit, account management, analytics), HMAC-signed outbound webhooks, and internal management consoles (admin-dashboard, customer-portal). External consumers — customer applications sending SMS over our public API, partner integrations, and our own frontends — require a single, consistent entry point for TLS termination, traffic management, rate limiting, authentication at the edge, and observability.

Earlier platform drafts (see 01 — Enterprise Architecture revision 1.0) described a NestJS-hosted api-gateway service that performed authentication, rate limiting, idempotency checks, input validation, and publishing to NATS. Operating experience, security requirements, and the maturity of dedicated gateway products make a purpose-built gateway in front of services a better fit than a custom NestJS edge proxy:

  • Telecom-grade SMS traffic has strict TPS limits per account/operator and burst protection needs that are first-class in Kong's rate-limiting plugins but require custom code in NestJS.
  • JWT/OAuth2 validation, API key authentication, IP allow/deny, bot detection, mTLS for partner routes, and request size limits are all available as battle-tested Kong plugins.
  • OpenAPI-driven routing can be declaratively maintained via decK or Helm, versioned with the service registry, and linted in CI against upstream OpenAPI contracts.
  • Kong's admin API, plugin SDK, and operational tooling (metrics, health, traces) are production-ready; a hand-rolled NestJS gateway would reinvent all of this.

This ADR records the decision to standardize on Kong Gateway as the only supported north-south HTTP ingress for customer-facing and integration traffic on the Ghasi-SMS-Gateway platform.

Decision

  1. Deploy Kong Gateway as the front door for all external HTTP(S) access to platform APIs. No production client (customer application, partner, admin-dashboard frontend, customer-portal frontend) shall call individual service load balancers directly; they use the Kong base URL and documented route prefixes that map to the correct upstream service.

  2. Retire the custom NestJS api-gateway service as described in its pre-Kong design. Its responsibilities are redistributed as follows:

    Former api-gateway responsibility | New owner
    TLS termination | Cloudflare + Kong
    Platform JWT / API key authentication | Kong (JWT + key-auth plugins; JWKS pulled from auth-service). The platform JWT is issued by auth-service regardless of which IdP (Keycloak default, tenant-federated OIDC/SAML, or legacy Firebase) authenticated the session.
    Rate limiting (per account, per API key, global) | Kong (rate-limiting + rate-limiting-advanced plugins, backed by Redis)
    Request size limits, IP allow/deny | Kong (request-size-limiting, ip-restriction plugins)
    Request/response logging (headers only, no bodies) | Kong (http-log plugin to Loki)
    Header propagation (X-Request-Id, traceparent, X-Tenant-Id) | Kong (correlation-id, opentelemetry plugins)
    Zod payload validation for POST /v1/sms/send | sms-orchestrator (authoritative; Kong may enforce JSON schema coarsely)
    Idempotency-Key check/store | sms-orchestrator (Redis-backed in its own namespace)
    Publishing sms.outbound.request to NATS | sms-orchestrator (now receives the HTTP request directly through Kong)
    Admin/portal BFF aggregation | admin-dashboard BFF / customer-portal BFF (thin NestJS layers)

    The remaining api-gateway/ folder in the documentation repo is re-scoped to "Kong routing, plugins, and custom plugin layer" — it documents route config, plugin policies, custom Kong plugins (if any), and edge observability. It is no longer a deployable NestJS service.

  3. Routing: Kong Services and Routes mirror the public API layout in 05 — API Design:

    Path prefix | Upstream
    /v1/sms/send, /v1/sms/{id}, /v1/sms/bulk | sms-orchestrator
    /v1/dlr/* | dlr-processor
    /v1/accounts/*, /v1/auth/*, /v1/api-keys/* | auth-service
    /v1/billing/*, /v1/invoices/* | billing-service
    /v1/analytics/*, /v1/reports/* | analytics-service
    /v1/operators/* (admin only) | operator-management-service
    /v1/webhooks/* | webhook-dispatcher
    /admin/* | admin-dashboard BFF
    /portal/* | customer-portal BFF

    Route definitions are maintained as infrastructure-as-code (decK YAML or Helm chart) under ops/kong/ in the application monorepo, reviewed alongside any change to a service's OpenAPI.
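
    To make the prefix-to-upstream mapping concrete, the following is a minimal TypeScript sketch of longest-prefix resolution over the table above. It is an illustration only: Kong applies its own route-matching rules, and the names RouteRule, ROUTES, and resolveUpstream are hypothetical. The sketch also simplifies the three /v1/sms paths into a single /v1/sms prefix.

```typescript
// Hypothetical in-memory mirror of the routing table in this ADR.
// The real source of truth would be the decK/Helm definitions under ops/kong/.
interface RouteRule {
  prefix: string;   // path prefix without the trailing "/*"
  upstream: string; // upstream name as written in the table
}

const ROUTES: RouteRule[] = [
  { prefix: "/v1/sms", upstream: "sms-orchestrator" },
  { prefix: "/v1/dlr", upstream: "dlr-processor" },
  { prefix: "/v1/accounts", upstream: "auth-service" },
  { prefix: "/v1/auth", upstream: "auth-service" },
  { prefix: "/v1/api-keys", upstream: "auth-service" },
  { prefix: "/v1/billing", upstream: "billing-service" },
  { prefix: "/v1/invoices", upstream: "billing-service" },
  { prefix: "/v1/analytics", upstream: "analytics-service" },
  { prefix: "/v1/reports", upstream: "analytics-service" },
  { prefix: "/v1/operators", upstream: "operator-management-service" },
  { prefix: "/v1/webhooks", upstream: "webhook-dispatcher" },
  { prefix: "/admin", upstream: "admin-dashboard BFF" },
  { prefix: "/portal", upstream: "customer-portal BFF" },
];

// Returns the upstream whose prefix is the longest match for the path,
// or undefined when no rule applies. A prefix matches only on a whole
// path segment, so "/v1/smsx" does not match "/v1/sms".
function resolveUpstream(path: string): string | undefined {
  let best: RouteRule | undefined;
  for (const rule of ROUTES) {
    const matches = path === rule.prefix || path.startsWith(rule.prefix + "/");
    if (matches && (best === undefined || rule.prefix.length > best.prefix.length)) {
      best = rule;
    }
  }
  return best?.upstream;
}
```

    A table like this could also serve as a review aid when a service's OpenAPI changes, since any new path that resolves to undefined has no route at the edge.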

  4. Responsibilities at Kong (non-exhaustive):

    • TLS termination (downstream from Cloudflare, which remains the WAF + DDoS layer).
    • Edge authentication: JWT validation (JWKS from auth-service), API key validation (key-auth plugin keyed against auth-service's api_keys table via a Kong plugin that queries a cached endpoint).
    • Rate limiting: per-API-key, per-account, per-operator, and global — backed by a shared Redis cluster (dedicated namespace kong:rl:*, distinct from service-owned Redis keys).
    • Request size limits (default 64 KB for /v1/sms/send, larger for bulk endpoints; configurable per route).
    • IP allow/deny lists for partner integrations and admin paths.
    • Correlation IDs — inject X-Request-Id if missing, propagate traceparent.
    • Header forwarding: Authorization, X-Tenant-Id, X-Api-Key-Id, Idempotency-Key, Accept-Language, X-Forwarded-For.
    • Request/response logging (headers + metadata only; never SMS message bodies — PII).
    • OpenTelemetry export of Kong spans into the platform trace collector so upstream service spans chain to edge spans.
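
    The rate-limiting bullet above can be illustrated with a small fixed-window counter. This is a sketch under stated assumptions, not how Kong's plugins are implemented: the key layout below is hypothetical (the ADR only fixes the kong:rl:* namespace), and an in-memory Map stands in for the shared Redis cluster.

```typescript
type Scope = "api-key" | "account" | "operator" | "global";

// Hypothetical key layout inside the kong:rl:* namespace named in this ADR.
function rateLimitKey(scope: Scope, id: string, windowStartSec: number): string {
  return `kong:rl:${scope}:${id}:${windowStartSec}`;
}

// Fixed-window limiter: at most `limit` requests per `windowSec` seconds
// for each (scope, id) pair.
class FixedWindowLimiter {
  // Stand-in for Redis counters. A Redis-backed version would let old
  // windows expire via a TTL; this sketch never evicts them.
  private readonly counters = new Map<string, number>();
  private readonly limit: number;
  private readonly windowSec: number;

  constructor(limit: number, windowSec: number) {
    this.limit = limit;
    this.windowSec = windowSec;
  }

  // Counts the request and returns true while the window's count is
  // within the limit. The clock is passed in to keep the sketch testable.
  allow(scope: Scope, id: string, nowMs: number): boolean {
    const nowSec = Math.floor(nowMs / 1000);
    const windowStart = Math.floor(nowSec / this.windowSec) * this.windowSec;
    const key = rateLimitKey(scope, id, windowStart);
    const count = (this.counters.get(key) ?? 0) + 1;
    this.counters.set(key, count);
    return count <= this.limit;
  }
}
```

    Keeping the scope in the key lets per-API-key, per-account, per-operator, and global limits coexist in one namespace without colliding with service-owned Redis keys.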

  5. Responsibilities that remain in microservices:

    • Business authorization (per-account scope checks, RBAC roles beyond coarse "authenticated") — auth-service + in-service guards.
    • Tenant isolation (account_id scoping in every query; RLS policies on PostgreSQL tables) — every service.
    • Idempotency storage and replay (Idempotency-Key TTL in Redis, response replay on duplicate key) — sms-orchestrator and any other write endpoint that accepts the header.
    • Business validation (Zod schemas for SMS payloads, phone-number E.164 normalization, content-type detection, destination operator lookup) — sms-orchestrator.
    • Problem+json error shaping — every service.

    Kong may enforce coarse presence of required JWT claims (tenant, scope) where the policy is stable; services remain the authority for correctness.
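
    As an illustration of the idempotency responsibility that stays in sms-orchestrator, here is a minimal sketch of Idempotency-Key storage with TTL and response replay. The names IdempotencyStore and handleWithIdempotency are hypothetical, and a Map stands in for Redis. A production version would also need an atomic set-if-absent step (for example Redis SET with NX) so that two concurrent requests with the same key cannot both run.

```typescript
interface StoredResponse {
  status: number;
  body: string;
}

interface Entry {
  response: StoredResponse;
  expiresAtMs: number;
}

class IdempotencyStore {
  private readonly entries = new Map<string, Entry>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Keys are scoped by account so one tenant can never replay
  // another tenant's stored response.
  private key(accountId: string, idempotencyKey: string): string {
    return `${accountId}:${idempotencyKey}`;
  }

  get(accountId: string, idempotencyKey: string, nowMs: number): StoredResponse | undefined {
    const k = this.key(accountId, idempotencyKey);
    const entry = this.entries.get(k);
    if (entry === undefined) return undefined;
    if (entry.expiresAtMs <= nowMs) {
      this.entries.delete(k); // expired: treat as never seen
      return undefined;
    }
    return entry.response;
  }

  put(accountId: string, idempotencyKey: string, response: StoredResponse, nowMs: number): void {
    this.entries.set(this.key(accountId, idempotencyKey), {
      response,
      expiresAtMs: nowMs + this.ttlMs,
    });
  }
}

// Runs `handler` once per (account, key) within the TTL; duplicates get
// the stored response back with replayed = true.
function handleWithIdempotency(
  store: IdempotencyStore,
  accountId: string,
  idempotencyKey: string,
  nowMs: number,
  handler: () => StoredResponse,
): { response: StoredResponse; replayed: boolean } {
  const cached = store.get(accountId, idempotencyKey, nowMs);
  if (cached !== undefined) return { response: cached, replayed: true };
  const response = handler();
  store.put(accountId, idempotencyKey, response, nowMs);
  return { response, replayed: false };
}
```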

  6. Internal east-west traffic (service-to-service NestJS/NATS/gRPC) bypasses Kong. Only north-south and explicit partner ingress flow through Kong unless a future ADR documents an exception.

  7. SMPP ingress (carrier → smpp-connector) is out of scope for Kong. SMPP is not HTTP; MNO bindings terminate at dedicated smpp-connector listeners with their own TLS/authentication via SMPP bind credentials. Kong does not proxy SMPP.

Consequences

  • Positive: One place to enforce TLS, rate limiting, JWT validation, and API-key gating. Onboarding integrators becomes "one base URL, documented paths, versioned OpenAPI." Aligns with common SaaS operating models.
  • Positive: Application teams stop maintaining a bespoke NestJS edge proxy, which removes roughly one service's worth of maintenance burden.
  • Positive: Rate limiting, the hardest correctness concern for an SMS platform (operator-level TPS, per-account quotas, abuse control), uses Kong's battle-tested plugins instead of hand-rolled Redis counters.
  • Negative: One additional moving part (Kong control plane + data plane + configuration store or DB-less mode + upgrades + DR). Route config must stay in sync with service OpenAPI; CI must lint Kong route definitions against upstream OpenAPI contracts.
  • Negative: Debugging requires correlation IDs across Kong and upstreams; observability must include Kong spans, logs, and metrics in Grafana dashboards alongside upstream services.
  • Migration: Environments that currently expose api-gateway behind a Cloudflare → NestJS ingress must plan a cutover: dual-running Kong and api-gateway behind Cloudflare during a bounded migration window, switching client base URLs (or DNS) at cutover, then decommissioning api-gateway. See the migration plan in services/api-gateway/MIGRATION_PLAN.md.
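
The CI lint mentioned in the consequences above could take roughly the following shape. This is a sketch only: KongRoute and lintRoutes are hypothetical names, the inputs are assumed to have been extracted from the decK/Helm definitions and the upstream OpenAPI documents beforehand, and matching is reduced to whole-segment prefix comparison.

```typescript
interface KongRoute {
  name: string;       // hypothetical route name
  pathPrefix: string; // prefix without the trailing "/*"
}

interface LintResult {
  uncoveredPaths: string[]; // OpenAPI paths that no Kong route exposes
  unusedRoutes: string[];   // Kong routes that match no OpenAPI path
}

// True when the path equals the prefix or sits beneath it on a
// whole-segment boundary.
function covers(prefix: string, path: string): boolean {
  return path === prefix || path.startsWith(prefix + "/");
}

// Compares route prefixes with OpenAPI paths in both directions so that
// drift on either side fails the check.
function lintRoutes(routes: KongRoute[], openApiPaths: string[]): LintResult {
  const uncoveredPaths = openApiPaths.filter(
    (path) => !routes.some((route) => covers(route.pathPrefix, path)),
  );
  const unusedRoutes = routes
    .filter((route) => !openApiPaths.some((path) => covers(route.pathPrefix, path)))
    .map((route) => route.name);
  return { uncoveredPaths, unusedRoutes };
}
```

A CI job could fail the build whenever either list is non-empty, which keeps route config and service contracts from drifting apart unnoticed.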

Non-goals

  • Replacing service-level authorization, RLS, or tenant isolation. Kong does coarse gating only; authoritative checks remain in services.
  • Proxying SMPP carrier bindings through Kong. SMPP ingress terminates at smpp-connector.
  • Prescribing Kong Enterprise vs OSS vs Konnect edition. That is an implementation detail left to SRE during deployment planning; this ADR commits only to Kong as the gateway product.

References