SMS Orchestrator — Migration Plan
Status: populated
Owner: Platform Engineering + SRE
Last updated: 2026-04-18
Companion: ADR-0001 Kong · api-gateway MIGRATION_PLAN
1. Context
Per ADR-0001, the custom NestJS api-gateway service is retired. Its responsibilities split:
- Edge concerns (TLS, JWT, key-auth, rate limiting, correlation) → Kong.
- Application concerns (Zod validation, Idempotency-Key storage, sms.outbound.request publish) → this service.
This document covers the orchestrator's side of the cutover.
2. Scope of change in sms-orchestrator
- Add HTTP submit API (POST /v1/sms/send, /v1/sms/bulk, GET /v1/sms/{id}).
- Introduce orch.idempotency_keys table + Redis key orch:submit-idem:*.
- Publish sms.outbound.request directly to NATS (previously api-gateway published).
- Re-validate X-Tenant-Id against JWT claim (defense in depth; Kong also validates it).
- Internal metric rename: some gateway_* metrics now emitted here (compat alias for 30 days).
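The tenant re-validation item above can be sketched as a small pure function. This is a minimal illustration, not the service's actual code: the claim name `tenant_id`, the `VerifiedClaims` shape, and the 400/403 status choices are assumptions, and the JWT is assumed to have been verified by Kong before the claims reach this check.

```typescript
// Hypothetical claim shape; the real claim name is not given in this plan.
interface VerifiedClaims {
  tenant_id?: string;
}

type TenantCheck =
  | { ok: true; tenantId: string }
  | { ok: false; status: 400 | 403; reason: string };

// Compare the X-Tenant-Id header with the tenant claim of an already
// verified JWT. Header names are matched case-insensitively.
function checkTenant(
  headers: Record<string, string | undefined>,
  claims: VerifiedClaims,
): TenantCheck {
  let headerTenant: string | undefined;
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase() === "x-tenant-id") {
      headerTenant = headers[name]?.trim();
    }
  }
  if (!headerTenant) {
    // Assumed behavior: a missing header is a client error.
    return { ok: false, status: 400, reason: "missing X-Tenant-Id" };
  }
  if (!claims.tenant_id || claims.tenant_id !== headerTenant) {
    // Assumed behavior: a header/claim mismatch is rejected as forbidden.
    return { ok: false, status: 403, reason: "tenant mismatch" };
  }
  return { ok: true, tenantId: headerTenant };
}
```

Keeping the check free of framework types means it can be unit-tested without an HTTP server and reused by every submit endpoint.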
3. Phased rollout
| Phase | Duration | State |
|---|---|---|
| 0. Preparation | 1 week | Deploy orchestrator with HTTP endpoints disabled (feature-flag off). Kong route config prepared but not cut over. |
| 1. Shadow | 3 days | Kong dual-writes: real traffic to api-gateway, mirror to orchestrator (feature flag on, request-mirror plugin). Compare responses with replay compare tool. |
| 2. Canary | 2 days | Kong routes 5% of /v1/sms/* to orchestrator directly. Watch SLIs. |
| 3. Ramp | 2 days | 25% → 50% → 100% if SLIs hold. |
| 4. Decommission | 1 week | api-gateway traffic drains. Remove Deployment. Remove associated env, secrets, Terraform module. |
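The canary and ramp phases amount to a gated step function over traffic weights. A minimal sketch follows, assuming a hypothetical helper `nextWeight` that an operator script might call before updating the Kong route; treating an SLI breach as "drop to 0%" mirrors the rollback section but is an interpretation, not a stated rule.

```typescript
// Percent of /v1/sms/* traffic routed to the orchestrator in phases 2 and 3.
const RAMP_STEPS: readonly number[] = [5, 25, 50, 100];

// Return the next traffic weight for the orchestrator.
// - If SLIs do not hold, return 0 (all traffic back to api-gateway).
// - Otherwise return the smallest planned step above the current weight,
//   or 100 once the ramp is complete.
function nextWeight(current: number, slisHold: boolean): number {
  if (!slisHold) return 0;
  for (const step of RAMP_STEPS) {
    if (step > current) return step;
  }
  return 100;
}
```

For example, starting from 0 with healthy SLIs the sequence is 5, 25, 50, 100; a breach at any point returns 0.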
4. Data migration
- No historical data migration needed: api-gateway had no domain data; it was stateless.
- idempotency_keys table created fresh; no backfill (the idempotency window is short: 48h).
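Because the window is only 48h, a fresh store converges quickly and no backfill is needed. The sketch below illustrates the windowing logic with an in-memory stand-in for Redis and the idempotency table. The key layout after the `orch:submit-idem:` prefix, the `StoredResponse` shape, and the class name are assumptions made for illustration.

```typescript
// 48h idempotency window, as stated in this plan.
const IDEMPOTENCY_WINDOW_MS = 48 * 60 * 60 * 1000;

// Build a key under the orch:submit-idem: prefix.
// The tenant/key layout after the prefix is an assumption.
function idemKey(tenantId: string, idempotencyKey: string): string {
  return `orch:submit-idem:${tenantId}:${idempotencyKey}`;
}

interface StoredResponse {
  status: number;
  body: string;
}

// In-memory stand-in for the real stores, for illustration only.
class IdempotencyStore {
  private entries = new Map<
    string,
    { expiresAt: number; response: StoredResponse }
  >();

  // Return the stored response if the key was seen within the window.
  get(key: string, now: number): StoredResponse | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return hit.response;
  }

  // Record the response for a key; it expires after the 48h window.
  put(key: string, response: StoredResponse, now: number): void {
    this.entries.set(key, {
      expiresAt: now + IDEMPOTENCY_WINDOW_MS,
      response,
    });
  }
}
```

Passing `now` explicitly keeps the expiry logic deterministic in tests; a production version would more likely rely on the store's own TTL support.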
5. Contract compatibility
- External API surface unchanged (POST /v1/sms/send same request/response shape).
- Clients see no breaking change; only Server header changes.
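One way to check this compatibility claim during the shadow phase is to diff the mirrored responses while ignoring headers that are expected to differ. The sketch below is not the replay compare tool itself: `CapturedResponse`, `diffResponses`, and the ignore list are hypothetical, and only `server` comes from this plan (`date` is an added assumption). Bodies are compared byte for byte, so a real tool may need to normalize generated fields first.

```typescript
interface CapturedResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Headers expected to differ between the two upstreams. "server" comes from
// this plan; "date" is an assumption added for illustration.
const IGNORED_HEADERS = ["server", "date"];

// Lower-case header names and drop the ignored ones.
function normalizeHeaders(
  headers: Record<string, string>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const lower = name.toLowerCase();
    if (IGNORED_HEADERS.indexOf(lower) === -1) out[lower] = headers[name];
  }
  return out;
}

// List the differences between the api-gateway (legacy) response and the
// orchestrator (candidate) response. An empty list means equivalent.
function diffResponses(
  legacy: CapturedResponse,
  candidate: CapturedResponse,
): string[] {
  const diffs: string[] = [];
  if (legacy.status !== candidate.status) {
    diffs.push(`status: ${legacy.status} vs ${candidate.status}`);
  }
  if (legacy.body !== candidate.body) diffs.push("body differs");
  const a = normalizeHeaders(legacy.headers);
  const b = normalizeHeaders(candidate.headers);
  const names = Object.keys(a).concat(
    Object.keys(b).filter((n) => !(n in a)),
  );
  for (const name of names) {
    if (a[name] !== b[name]) diffs.push(`header ${name} differs`);
  }
  return diffs;
}
```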
6. Rollback
- Kong route cutover is reversible (point upstream back to api-gateway Deployment).
- Orchestrator HTTP endpoints guarded by feature flag SUBMIT_API_ENABLED.
- If an SLI is breached in phase 2 or 3: flip the flag, revert the Kong route; 10-min RTO.
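The flag guard can be sketched as follows. Only the flag name SUBMIT_API_ENABLED comes from this plan; the accepted values ("true" or "1"), the helper names, and the 503 response while disabled are assumptions for illustration.

```typescript
// Read SUBMIT_API_ENABLED from an environment-like map. Only "true" and "1"
// enable the API (an assumed convention); anything else leaves it off,
// so a missing or malformed value fails closed.
function submitApiEnabled(env: Record<string, string | undefined>): boolean {
  const raw = (env["SUBMIT_API_ENABLED"] ?? "").trim().toLowerCase();
  return raw === "true" || raw === "1";
}

type GateDecision = { allow: true } | { allow: false; status: 503 };

// Decide whether a submit endpoint may handle a request.
// Returning 503 while disabled is an assumption, not specified in the plan.
function gateSubmit(env: Record<string, string | undefined>): GateDecision {
  return submitApiEnabled(env) ? { allow: true } : { allow: false, status: 503 };
}
```

In a Node.js process the map would typically be `process.env`. Failing closed means that an unset flag during rollback leaves the endpoints off, which keeps the flag flip a single, safe action.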
7. Exit criteria
- api-gateway Deployment removed.
- services/api-gateway/ repo folder re-scoped to Kong config (done; see api-gateway/SERVICE_OVERVIEW).
- Orchestrator owns submit OpenAPI (committed).
- Runbooks point to orchestrator for submit-path incidents.
8. Risks
See R-ORCH-07 in SERVICE_RISK_REGISTER.