SMS Orchestrator — Application Logic
Status: populated | Owner: Platform Engineering | Last updated: 2026-04-18 | Companion: DOMAIN_MODEL · API_CONTRACTS
1. Use Cases
1.1 SubmitSmsUseCase (HTTP entry, moved from retired api-gateway)
Triggered by POST /v1/sms/send from Kong.
1. Extract tenantId/accountId from trusted Kong headers (X-Tenant-Id validated against JWT claim)
2. Parse + Zod-validate body → SubmitSmsCommand
3. Normalize `to` (E.164) and compute segmentCount
4. If Idempotency-Key header present:
   a. Hash key (SHA-256) scoped by accountId
   b. Redis GET idempotency:{hash}
   c. If hit → return stored 202 response
5. Open PG transaction:
   a. INSERT sms_messages (status=QUEUED)
   b. INSERT idempotency_keys (hash, messageId, response body) if applicable
   c. COMMIT
6. Redis SET idempotency:{hash} → {messageId, responseBody} EX 172800 (48 h TTL)
7. Publish sms.outbound.request (NATS JetStream, Nats-Msg-Id = messageId)
8. Return 202 { messageId, status: "QUEUED" }
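The idempotency path in steps 4 and 6 can be sketched as follows. This is a minimal illustration, not the service's actual code: the `Map` stands in for Redis (no TTL is applied), the `accountId:key` concatenation is an assumed way of scoping the hash, and `submit`, `idempotencyHash`, and the `msg-N` ids are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of what is cached per idempotency key (step 6).
interface StoredResponse {
  messageId: string;
  responseBody: { messageId: string; status: "QUEUED" };
}

// Step 4a: hash the client-supplied key together with the accountId so two
// accounts sending the same Idempotency-Key never collide.
// Assumption: scoping is done by prefixing the accountId before hashing.
function idempotencyHash(accountId: string, key: string): string {
  return createHash("sha256").update(`${accountId}:${key}`).digest("hex");
}

// In-memory stand-in for Redis; the real store would apply the 172800 s TTL.
const cache = new Map<string, StoredResponse>();
let nextId = 0;

function submit(accountId: string, idempotencyKey?: string): StoredResponse["responseBody"] {
  const hash = idempotencyKey ? idempotencyHash(accountId, idempotencyKey) : undefined;
  if (hash !== undefined) {
    const hit = cache.get(`idempotency:${hash}`); // step 4b
    if (hit) return hit.responseBody; // step 4c: replay the stored 202 body
  }
  // Steps 5 and 7 (Postgres transaction, NATS publish) are omitted here.
  const messageId = `msg-${++nextId}`;
  const responseBody = { messageId, status: "QUEUED" as const };
  if (hash !== undefined) {
    cache.set(`idempotency:${hash}`, { messageId, responseBody }); // step 6
  }
  return responseBody;
}
```

Calling `submit("acct-1", "key-1")` twice returns the same `messageId`, while the same key under a different account creates a new message.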
1.2 BulkSubmitSmsUseCase
POST /v1/sms/bulk — accepts an array of up to 500 messages. Each element is processed with the SubmitSmsUseCase logic and then published individually (fan-out). Returns an array of results with per-message status.
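One way to get per-message results is to run every element independently and collect the outcomes, so a single bad message does not fail the whole batch. The sketch below assumes that behaviour; `bulkSubmit`, `submitOne`, the `REJECTED` status, and the `BulkResult` shape are hypothetical and not taken from API_CONTRACTS.

```typescript
// Hypothetical per-element result; the real response shape lives in API_CONTRACTS.
type BulkResult =
  | { index: number; status: "QUEUED"; messageId: string }
  | { index: number; status: "REJECTED"; error: string };

const MAX_BULK_SIZE = 500; // limit stated for POST /v1/sms/bulk

async function bulkSubmit<T>(
  items: T[],
  submitOne: (item: T) => Promise<{ messageId: string }>,
): Promise<BulkResult[]> {
  if (items.length > MAX_BULK_SIZE) {
    throw new RangeError(`bulk request exceeds ${MAX_BULK_SIZE} messages`);
  }
  // allSettled never rejects, so one failing element cannot abort the others.
  const settled = await Promise.allSettled(items.map((item) => submitOne(item)));
  return settled.map((outcome, index): BulkResult =>
    outcome.status === "fulfilled"
      ? { index, status: "QUEUED", messageId: outcome.value.messageId }
      : { index, status: "REJECTED", error: String(outcome.reason?.message ?? outcome.reason) },
  );
}
```

Results keep the input order, so a caller can match each entry back to the request element by `index`.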
1.3 GetMessageStatusUseCase
GET /v1/sms/{messageId} — read-only, returns current row from sms_messages after account-scope check.
1.4 ProcessOutboundRequestUseCase (NATS consumer)
Triggered by consuming sms.outbound.request.
1. Idempotency check: SET NX orch:idem:{messageId} EX 172800
   - If existing → ACK and return (duplicate redelivery)
2. Domain validation (invariants not pre-checkable at HTTP stage)
   - On failure → persist FAILED → publish deadletter → ACK
3. UPDATE sms_messages SET status = 'ROUTING'
4. gRPC routing-engine.SelectRoute({messageId, tenantId, to, from, messageType})
   - On NO_ROUTE_FOUND → FAILED + DLQ
   - On transient error → RetryPolicy (increment attempt, NATS NAK with delay)
5. UPDATE sms_messages SET status = 'ROUTED', operator_id, route_id
6. Publish smpp.operator.{operatorId} (the SmppOutboundMessage)
   - On publish failure → RetryPolicy
7. UPDATE sms_messages SET status = 'SENT', processed_at = now()
8. Publish sms.events.status (ROUTED → SENT)
9. NATS ACK
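The control flow of the steps above can be sketched with the ports reduced to plain functions. Everything named here is an assumption made for illustration: the `PipelineDeps` surface, the `code === "NO_ROUTE_FOUND"` error convention, and the `sms.deadletter` subject. Domain validation (step 2) and attempt counting are left out.

```typescript
type Status = "ROUTING" | "ROUTED" | "SENT" | "FAILED";
type Outcome = "ACK" | "NAK";
interface Route { operatorId: string; routeId: string }

// Hypothetical port surface, reduced to what this sketch needs.
interface PipelineDeps {
  setIfAbsent(key: string, ttlSeconds: number): Promise<boolean>; // true if newly set
  updateStatus(messageId: string, status: Status, route?: Route): Promise<void>;
  selectRoute(messageId: string): Promise<Route>;
  publish(subject: string, payload: object): Promise<void>;
}

const IDEMPOTENCY_TTL_SECONDS = 172800;
const DEAD_LETTER_SUBJECT = "sms.deadletter"; // assumed subject name

async function processOutboundRequest(messageId: string, deps: PipelineDeps): Promise<Outcome> {
  // Step 1: duplicate redelivery is acknowledged without further work.
  const firstDelivery = await deps.setIfAbsent(`orch:idem:${messageId}`, IDEMPOTENCY_TTL_SECONDS);
  if (!firstDelivery) return "ACK";

  await deps.updateStatus(messageId, "ROUTING"); // step 3

  // Step 4: ask the routing engine for a route.
  let route: Route | undefined;
  let routeError: unknown;
  try {
    route = await deps.selectRoute(messageId);
  } catch (err) {
    routeError = err;
  }
  if (route === undefined) {
    const code = (routeError as { code?: string } | undefined)?.code;
    if (code === "NO_ROUTE_FOUND") {
      await deps.updateStatus(messageId, "FAILED");
      await deps.publish(DEAD_LETTER_SUBJECT, { messageId, reason: code });
      return "ACK"; // terminal failure, no redelivery wanted
    }
    return "NAK"; // transient error, hand over to the retry policy
  }

  await deps.updateStatus(messageId, "ROUTED", route); // step 5
  try {
    // Step 6: hand the message to the operator-specific subject.
    await deps.publish(`smpp.operator.${route.operatorId}`, { messageId, routeId: route.routeId });
  } catch {
    return "NAK";
  }
  await deps.updateStatus(messageId, "SENT"); // step 7
  await deps.publish("sms.events.status", { messageId, from: "ROUTED", to: "SENT" }); // step 8
  return "ACK"; // step 9
}
```

Returning `ACK` or `NAK` keeps the use case independent of the NATS client, so the adapter decides how to acknowledge. How a NAK'd redelivery gets past the step 1 key is not spelled out in this document, so the sketch does not model it.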
1.5 HandleRetryUseCase
Implements application-level backoff. attempt_count is persisted, so after a service restart the pipeline resumes from the stored state.
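Because the delay is derived from the persisted attempt_count, it can be a pure function of that count. The sketch below assumes an exponential schedule with a cap and a maximum number of attempts; the source does not state the actual schedule, so `RetryPolicy`, `defaultPolicy`, and every number in it are illustrative only.

```typescript
// Hypothetical policy parameters; the real values are not given in this document.
interface RetryPolicy {
  baseDelayMs: number;
  maxDelayMs: number;
  maxAttempts: number;
}

const defaultPolicy: RetryPolicy = { baseDelayMs: 1000, maxDelayMs: 60000, maxAttempts: 5 };

type RetryDecision =
  | { kind: "retry"; attempt: number; delayMs: number }
  | { kind: "dead-letter"; attempt: number };

// attemptCount is the persisted number of failed attempts before this failure.
function nextRetry(attemptCount: number, policy: RetryPolicy): RetryDecision {
  const attempt = attemptCount + 1; // value to persist back to attempt_count
  if (attempt >= policy.maxAttempts) {
    return { kind: "dead-letter", attempt };
  }
  // Doubles on each failure: base, 2x base, 4x base, ... capped at maxDelayMs.
  const delayMs = Math.min(policy.maxDelayMs, policy.baseDelayMs * Math.pow(2, attempt - 1));
  return { kind: "retry", attempt, delayMs };
}
```

The `delayMs` value would be passed to the NATS NAK-with-delay call, and `attempt` written to the row before the NAK so a restart sees the same state.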
2. Ports
| Port | Adapter |
|---|---|
| SmsMessageRepository | Prisma / Postgres adapter (schema orch) |
| IdempotencyStore | Redis adapter (ioredis) |
| OperatorRouter | gRPC client adapter to routing-engine |
| EventPublisher | NATS JetStream adapter |
| Clock | System clock + test override |
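To illustrate the port and adapter split, the sketch below defines two of the ports as interfaces and gives each a test-friendly adapter. The method names (`now`, `setIfAbsent`) and the in-memory classes are assumptions for illustration; the actual port signatures are not given in this document.

```typescript
// Clock port: the system adapter reads wall time, the test adapter is controllable.
interface Clock {
  now(): Date;
}

const systemClock: Clock = { now: () => new Date() };

class FixedClock implements Clock {
  private currentMs: number;
  constructor(start: Date) {
    this.currentMs = start.getTime();
  }
  now(): Date {
    return new Date(this.currentMs);
  }
  advance(ms: number): void {
    this.currentMs += ms;
  }
}

// IdempotencyStore port with SET NX EX semantics (hypothetical method name).
interface IdempotencyStore {
  setIfAbsent(key: string, ttlSeconds: number): Promise<boolean>;
}

// In-memory adapter for tests; the production adapter would wrap Redis.
class InMemoryIdempotencyStore implements IdempotencyStore {
  private readonly expiresAtMs = new Map<string, number>();
  private readonly clock: Clock;
  constructor(clock: Clock) {
    this.clock = clock;
  }
  async setIfAbsent(key: string, ttlSeconds: number): Promise<boolean> {
    const nowMs = this.clock.now().getTime();
    const existing = this.expiresAtMs.get(key);
    if (existing !== undefined && existing > nowMs) return false; // still live
    this.expiresAtMs.set(key, nowMs + ttlSeconds * 1000);
    return true;
  }
}
```

Injecting `Clock` into the store lets a test advance time past the 172800 s TTL without sleeping.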
3. Orchestration Sequence
The main pipeline is described in SERVICE_OVERVIEW §6. Retry and DLQ flows are documented in §7 of the DOMAIN_MODEL state machine.
4. Concurrency & Ordering
- Consumer is push-type with configurable parallelism (default 16 in-flight per pod).
- Per-`messageId` processing is naturally serialized because the idempotency SET NX rejects concurrent redelivery.
- Same-account ordering is NOT guaranteed by NATS; if ordering is required for a feature (it currently isn't), consumers can be partitioned by account.
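If account partitioning were ever needed, one possible approach is to hash the accountId to a fixed partition number and publish to a per-partition subject, with one consumer per partition processing in order. This is a sketch of that idea only; the `.p{n}` subject suffix and both function names are hypothetical, and nothing like this exists in the current design.

```typescript
import { createHash } from "node:crypto";

// Map an account to a stable partition in [0, partitionCount).
function partitionFor(accountId: string, partitionCount: number): number {
  if (!Number.isInteger(partitionCount) || partitionCount <= 0) {
    throw new RangeError("partitionCount must be a positive integer");
  }
  const digest = createHash("sha256").update(accountId).digest();
  // The first 4 bytes as an unsigned integer give a well-spread hash value.
  return digest.readUInt32BE(0) % partitionCount;
}

// Hypothetical subject naming: sms.outbound.request.p0, .p1, ...
function partitionedSubject(accountId: string, partitionCount: number): string {
  return `sms.outbound.request.p${partitionFor(accountId, partitionCount)}`;
}
```

All messages for one account land on the same subject, so ordering would then depend on each partition's consumer handling one message at a time. Changing `partitionCount` remaps accounts, which would need care during a rollout.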
5. Observability Hooks
- OTel span per use-case with attributes `sms.message_id`, `sms.tenant_id`, `sms.operator_id`, `sms.attempt`.
- Pino structured logs with the same fields.
- Prometheus counters: `orch_submit_total{result}`, `orch_pipeline_stage_total{stage,result}`, `orch_retry_total{attempt}`, `orch_dlq_total{reason}`.
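Since spans and logs carry the same fields, one option is to build the field set once and hand it to both. The helper below is a hypothetical sketch of that idea; `PipelineContext` and `telemetryFields` are illustrative names, and `sms.operator_id` is assumed to be absent until a route has been selected.

```typescript
// Hypothetical per-message context available inside a use case.
interface PipelineContext {
  messageId: string;
  tenantId: string;
  operatorId?: string; // unknown until routing has completed
  attempt: number;
}

// Build the shared field set used for both span attributes and log bindings.
function telemetryFields(ctx: PipelineContext): Record<string, string | number> {
  const fields: Record<string, string | number> = {
    "sms.message_id": ctx.messageId,
    "sms.tenant_id": ctx.tenantId,
    "sms.attempt": ctx.attempt,
  };
  if (ctx.operatorId !== undefined) {
    fields["sms.operator_id"] = ctx.operatorId;
  }
  return fields;
}
```

The returned object has the shape accepted by OpenTelemetry's `span.setAttributes(...)` and can be passed as the first argument of a Pino log call, which keeps the two outputs aligned.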