
SMS Orchestrator — Migration Plan

Status: populated
Owner: Platform Engineering + SRE
Last updated: 2026-04-18
Companion: ADR-0001 Kong · api-gateway MIGRATION_PLAN

1. Context

Per ADR-0001, the custom NestJS api-gateway service is retired. Its responsibilities are split as follows:

  • Edge concerns (TLS, JWT, key-auth, rate limiting, correlation) → Kong.
  • Application concerns (Zod validation, Idempotency-Key storage, sms.outbound.request publish) → this service.

This document covers the orchestrator's side of the cutover.

2. Scope of change in sms-orchestrator

  1. Add HTTP submit API (POST /v1/sms/send, /v1/sms/bulk, GET /v1/sms/{id}).
  2. Introduce orch.idempotency_keys table + Redis key orch:submit-idem:*.
  3. Publish sms.outbound.request directly to NATS (previously api-gateway published).
  4. Re-validate X-Tenant-Id against the JWT claim (defense in depth; Kong also validates it).
  5. Internal metric rename: some gateway_* metrics are now emitted here (compatibility alias kept for 30 days).
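
As an illustration of item 4, the tenant re-validation could look roughly like the sketch below. It assumes the JWT signature has already been verified upstream and that the decoded claims are available as a plain object. The claim name `tenant_id`, the function name `checkTenant`, and the chosen status codes are hypothetical; the plan does not specify them.

```typescript
// Hypothetical claim name; the plan does not say which JWT claim carries the tenant.
const TENANT_CLAIM = "tenant_id";

type TenantCheck =
  | { ok: true; tenantId: string }
  | { ok: false; status: 400 | 403; reason: string };

// Compare the X-Tenant-Id header with the tenant claim of an already-verified JWT.
// Status codes (400 for a missing header, 403 for a mismatch) are assumptions.
function checkTenant(
  headerValue: string | undefined,
  claims: Readonly<Record<string, unknown>>,
): TenantCheck {
  const header = headerValue?.trim();
  if (!header) {
    return { ok: false, status: 400, reason: "missing X-Tenant-Id header" };
  }
  const claim = claims[TENANT_CLAIM];
  if (typeof claim !== "string" || claim.length === 0) {
    return { ok: false, status: 403, reason: "token carries no tenant claim" };
  }
  if (claim !== header) {
    return { ok: false, status: 403, reason: "X-Tenant-Id does not match token" };
  }
  return { ok: true, tenantId: claim };
}
```

Returning a result object instead of throwing keeps the check easy to call from a request guard or middleware, whichever the orchestrator ends up using.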

3. Phased rollout

| Phase | Duration | State |
|---|---|---|
| 0. Preparation | 1 week | Deploy orchestrator with HTTP endpoints disabled (feature-flag off). Kong route config prepared but not cut over. |
| 1. Shadow | 3 days | Kong dual-writes: real traffic to api-gateway, mirror to orchestrator (feature flag on, request-mirror plugin). Compare responses with replay compare tool. |
| 2. Canary | 2 days | Kong routes 5% of /v1/sms/* to orchestrator directly. Watch SLIs. |
| 3. Ramp | 2 days | 25% → 50% → 100% if SLIs hold. |
| 4. Decommission | 1 week | api-gateway traffic drains. Remove Deployment. Remove associated env, secrets, Terraform module. |
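
The canary and ramp phases amount to a small decision rule: advance to the next traffic weight while SLIs hold, otherwise send everything back to api-gateway. A minimal sketch of that rule follows. The weights come from the phase table; the `SliSnapshot` shape and both thresholds are hypothetical placeholders, since the plan does not define which SLIs are watched or their limits.

```typescript
// Traffic weights from the phase table: canary at 5%, then 25, 50, 100.
const RAMP_STEPS = [5, 25, 50, 100] as const;

// Hypothetical SLI snapshot and thresholds; the plan does not define them.
interface SliSnapshot {
  errorRate: number;    // fraction of failed submits, 0..1
  p99LatencyMs: number; // 99th percentile submit latency
}
const MAX_ERROR_RATE = 0.01;
const MAX_P99_LATENCY_MS = 500;

function slisHold(s: SliSnapshot): boolean {
  return s.errorRate <= MAX_ERROR_RATE && s.p99LatencyMs <= MAX_P99_LATENCY_MS;
}

// Returns the next orchestrator traffic weight in percent.
// A breach returns 0, meaning all traffic goes back to api-gateway.
function nextWeight(current: number, sli: SliSnapshot): number {
  if (!slisHold(sli)) return 0;
  for (const step of RAMP_STEPS) {
    if (step > current) return step;
  }
  return 100; // already fully ramped
}
```

For example, `nextWeight(5, healthy)` moves the canary to 25, while any breached snapshot yields 0 regardless of the current weight.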

4. Data migration

  • No historical data migration needed: api-gateway had no domain data; it was stateless.
  • idempotency_keys table is created fresh; no backfill is needed because the idempotency window is short (48 h).
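
To make the 48-hour window concrete, here is a small in-memory sketch of the lookup-or-record step. It is a stand-in only: the real storage is the orch.idempotency_keys table plus the Redis key orch:submit-idem:*, and the key layout after the prefix (tenant, then key), the class name, and the method name are assumptions.

```typescript
const IDEMPOTENCY_WINDOW_MS = 48 * 60 * 60 * 1000; // 48 h, from the plan

interface StoredResult {
  messageId: string;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for the Redis key orch:submit-idem:*.
// Assumed key layout: prefix + tenant + client-supplied Idempotency-Key.
class IdempotencyStore {
  private readonly entries = new Map<string, StoredResult>();

  private static keyFor(tenantId: string, idempotencyKey: string): string {
    return `orch:submit-idem:${tenantId}:${idempotencyKey}`;
  }

  // Returns the stored message id for a replay inside the window;
  // otherwise records the new message id and starts a fresh window.
  remember(
    tenantId: string,
    idempotencyKey: string,
    messageId: string,
    now: number,
  ): { messageId: string; replayed: boolean } {
    const key = IdempotencyStore.keyFor(tenantId, idempotencyKey);
    const existing = this.entries.get(key);
    if (existing !== undefined && existing.expiresAt > now) {
      return { messageId: existing.messageId, replayed: true };
    }
    this.entries.set(key, { messageId, expiresAt: now + IDEMPOTENCY_WINDOW_MS });
    return { messageId, replayed: false };
  }
}
```

Because entries expire after 48 hours, a store that starts empty at cutover converges to the steady state within one window. A shared store would additionally need an atomic set-if-absent (for example Redis SET with the NX and PX options), which this single-process sketch does not model.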

5. Contract compatibility

  • External API surface unchanged (POST /v1/sms/send same request/response shape).
  • Clients see no breaking change; only the Server response header changes.
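
One way to check this compatibility claim during the shadow phase is to compare paired responses while ignoring headers that are expected to differ. The sketch below is illustrative and not the replay compare tool itself. The `HttpResponseSnapshot` shape and function names are hypothetical, and ignoring `date` in addition to `server` is an added assumption.

```typescript
interface HttpResponseSnapshot {
  status: number;
  headers: Record<string, string>;
  body: unknown; // parsed JSON body
}

// "server" comes from the plan; "date" is an assumption (timestamps rarely match).
const IGNORED_HEADERS = ["server", "date"];

// Lower-case header names, drop ignored ones, and sort for a stable comparison.
function normalizeHeaders(headers: Record<string, string>): string[] {
  const lines: string[] = [];
  for (const name of Object.keys(headers)) {
    const lower = name.toLowerCase();
    const value = headers[name];
    if (value !== undefined && IGNORED_HEADERS.indexOf(lower) === -1) {
      lines.push(`${lower}: ${value}`);
    }
  }
  return lines.sort();
}

// True when status, remaining headers, and body match.
// Note: the body comparison via JSON.stringify is sensitive to key order.
function sameContract(a: HttpResponseSnapshot, b: HttpResponseSnapshot): boolean {
  return (
    a.status === b.status &&
    JSON.stringify(normalizeHeaders(a.headers)) === JSON.stringify(normalizeHeaders(b.headers)) &&
    JSON.stringify(a.body) === JSON.stringify(b.body)
  );
}
```

Fields that each upstream generates independently per request, if any exist in the response, would also need to be excluded before comparing bodies.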

6. Rollback

  • Kong route cutover is reversible (point upstream back to api-gateway Deployment).
  • Orchestrator HTTP endpoints guarded by feature flag SUBMIT_API_ENABLED.
  • If an SLI is breached in phase 2 or 3: flip the flag off and revert the Kong route; target RTO is 10 minutes.
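
The feature-flag guard above could be sketched as follows. Only the flag name SUBMIT_API_ENABLED comes from the plan. The accepted "on" spellings, the function names, and the 503 response when the flag is off are assumptions made for illustration.

```typescript
// Accepted spellings for "on" are an assumption; the plan only names the flag.
const TRUTHY_VALUES = ["1", "true", "on", "yes"];

// Fails closed: an unset or unrecognized value means the submit API stays disabled.
function isSubmitApiEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env["SUBMIT_API_ENABLED"];
  return raw !== undefined && TRUTHY_VALUES.indexOf(raw.trim().toLowerCase()) !== -1;
}

type GuardResult =
  | { allowed: true }
  | { allowed: false; status: 503; reason: string };

// Decide whether a submit request may proceed before any handler logic runs.
// Responding 503 while disabled is an assumption, not taken from the plan.
function guardSubmit(env: Record<string, string | undefined>): GuardResult {
  if (isSubmitApiEnabled(env)) return { allowed: true };
  return { allowed: false, status: 503, reason: "submit API disabled by SUBMIT_API_ENABLED" };
}
```

In a Node process this would typically be called as `guardSubmit(process.env)`. Failing closed matches phase 0, where the orchestrator is deployed with its HTTP endpoints disabled.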

7. Exit criteria

  • api-gateway Deployment removed.
  • services/api-gateway/ repo folder re-scoped to Kong config (done; see api-gateway/SERVICE_OVERVIEW).
  • Orchestrator owns submit OpenAPI (committed).
  • Runbooks point to orchestrator for submit-path incidents.

8. Risks

See R-ORCH-07 in SERVICE_RISK_REGISTER.