
SMS Orchestrator — Application Logic

Status: populated | Owner: Platform Engineering | Last updated: 2026-04-18 | Companion: DOMAIN_MODEL · API_CONTRACTS

1. Use Cases

1.1 SubmitSmsUseCase (HTTP entry, moved from retired api-gateway)

Triggered by POST /v1/sms/send from Kong.

1. Extract tenantId/accountId from trusted Kong headers (X-Tenant-Id validated against JWT claim)
2. Parse + Zod-validate body → SubmitSmsCommand
3. Normalize `to` (E.164) and compute segmentCount
4. If Idempotency-Key header present:
   a. Hash key (SHA-256) scoped by accountId
   b. Redis GET idempotency:{hash}
   c. If hit → return stored 202 response
5. Open PG transaction:
   a. INSERT sms_messages (status=QUEUED)
   b. INSERT idempotency_keys (hash, messageId, response body) if applicable
   c. COMMIT
6. Redis SET idempotency:{hash} → {messageId, responseBody} EX 172800 (48 h TTL)
7. Publish sms.outbound.request (NATS JetStream, Nats-Msg-Id = messageId)
8. Return 202 { messageId, status: "QUEUED" }
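
Step 4a can be illustrated with a minimal sketch. The function name, the NUL separator between accountId and the key, and the hex encoding are assumptions made for illustration; the source only states that the key is SHA-256 hashed and scoped by accountId.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive the Redis key for an Idempotency-Key header.
// Scoping by accountId keeps identical client keys from colliding across accounts.
function idempotencyRedisKey(accountId: string, idempotencyKey: string): string {
  const hash = createHash("sha256")
    // Assumption: a NUL separator, so ("ab", "c") and ("a", "bc")
    // never produce the same digest input.
    .update(`${accountId}\u0000${idempotencyKey}`, "utf8")
    .digest("hex");
  return `idempotency:${hash}`;
}

// 48 hours in seconds, matching the EX 172800 used in step 6.
const IDEMPOTENCY_TTL_SECONDS = 48 * 60 * 60;
```

The same derived key would be used for the GET in step 4b and the write in step 6, so both paths agree on the scope.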

1.2 BulkSubmitSmsUseCase

POST /v1/sms/bulk accepts an array of up to 500 messages. Each element is processed through the SubmitSmsUseCase logic, followed by a fan-out publish. The response is an array of results with a per-message status.
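
One possible shape for the per-message result handling is sketched below. The `submitOne` callback, the `REJECTED` status, and the result type are hypothetical; the source only specifies the 500-message limit and that each element gets its own status.

```typescript
// Hypothetical result shape: one entry per input element, in input order.
type SubmitResult =
  | { index: number; status: "QUEUED"; messageId: string }
  | { index: number; status: "REJECTED"; error: string };

const MAX_BULK_SIZE = 500;

async function bulkSubmit<T>(
  items: T[],
  // Assumption: stands in for the SubmitSmsUseCase logic and resolves to a messageId.
  submitOne: (item: T) => Promise<string>,
): Promise<SubmitResult[]> {
  if (items.length > MAX_BULK_SIZE) {
    throw new RangeError(`bulk request exceeds ${MAX_BULK_SIZE} messages`);
  }
  // allSettled lets one failing element be reported without failing the batch.
  const settled = await Promise.allSettled(items.map((item) => submitOne(item)));
  return settled.map((outcome, index): SubmitResult =>
    outcome.status === "fulfilled"
      ? { index, status: "QUEUED", messageId: outcome.value }
      : { index, status: "REJECTED", error: String(outcome.reason?.message ?? outcome.reason) },
  );
}
```

Whether a single invalid element should reject the whole request instead is a contract decision that this sketch does not settle.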

1.3 GetMessageStatusUseCase

GET /v1/sms/{messageId} — read-only, returns current row from sms_messages after account-scope check.

1.4 ProcessOutboundRequestUseCase (NATS consumer)

Triggered by consuming sms.outbound.request.

1. Idempotency check: SET NX orch:idem:{messageId} EX 172800
   - If existing → ACK and return (duplicate redelivery)
2. Domain validation (invariants not pre-checkable at HTTP stage)
   - On failure → persist FAILED → publish deadletter → ACK
3. UPDATE sms_messages SET status = 'ROUTING'
4. gRPC routing-engine.SelectRoute({messageId, tenantId, to, from, messageType})
   - On NO_ROUTE_FOUND → FAILED + DLQ
   - On transient error → RetryPolicy (increment attempt, NATS NAK with delay)
5. UPDATE sms_messages SET status = 'ROUTED', operator_id, route_id
6. Publish smpp.operator.{operatorId} (the SmppOutboundMessage)
   - On publish failure → RetryPolicy
7. UPDATE sms_messages SET status = 'SENT', processed_at = now()
8. Publish sms.events.status (ROUTED → SENT)
9. NATS ACK
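
The branching in step 4 can be expressed as a small pure function, which keeps the decision testable without NATS or gRPC. The type and action names below are illustrative assumptions, not the service's actual types.

```typescript
// Hypothetical, simplified view of what routing-engine.SelectRoute can yield.
type RouteOutcome =
  | { kind: "ROUTED"; operatorId: string; routeId: string }
  | { kind: "NO_ROUTE_FOUND" }
  | { kind: "TRANSIENT_ERROR"; cause: string };

// What the consumer should do next, mirroring steps 4 to 6.
type ConsumerAction =
  | { action: "PUBLISH_TO_OPERATOR"; subject: string }
  | { action: "FAIL_AND_DEADLETTER"; reason: string }
  | { action: "NAK_WITH_DELAY"; nextAttempt: number };

function decideAfterRouting(outcome: RouteOutcome, attempt: number): ConsumerAction {
  switch (outcome.kind) {
    case "ROUTED":
      // Step 6: the outbound message goes to the operator-specific subject.
      return { action: "PUBLISH_TO_OPERATOR", subject: `smpp.operator.${outcome.operatorId}` };
    case "NO_ROUTE_FOUND":
      // Permanent failure: persist FAILED and publish to the dead-letter queue.
      return { action: "FAIL_AND_DEADLETTER", reason: "NO_ROUTE_FOUND" };
    case "TRANSIENT_ERROR":
      // RetryPolicy: increment the attempt and NAK with a delay.
      return { action: "NAK_WITH_DELAY", nextAttempt: attempt + 1 };
  }
}
```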

1.5 HandleRetryUseCase

Implements application-level backoff. attempt_count is persisted, so after a service restart the pipeline resumes from the stored state.
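
The source does not specify the backoff curve. As a minimal sketch, assuming capped exponential backoff and illustrative policy values, the delay for the next attempt could be derived from the persisted attempt_count like this:

```typescript
// All values here are assumptions for illustration, not the service's configuration.
interface BackoffPolicy {
  baseDelayMs: number;
  maxDelayMs: number;
  maxAttempts: number;
}

const EXAMPLE_POLICY: BackoffPolicy = { baseDelayMs: 1000, maxDelayMs: 60000, maxAttempts: 5 };

// attemptCount is the number of attempts already made (the persisted attempt_count).
// Returns the delay before the next attempt, or null once attempts are exhausted.
function nextRetryDelayMs(attemptCount: number, policy: BackoffPolicy): number | null {
  if (attemptCount >= policy.maxAttempts) {
    return null; // caller would mark the message FAILED and dead-letter it
  }
  const delay = policy.baseDelayMs * 2 ** attemptCount;
  return Math.min(delay, policy.maxDelayMs);
}
```

Because the function depends only on the stored count, a restarted pod computes the same delay as the pod that failed.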

2. Ports

| Port | Adapter |
|---|---|
| SmsMessageRepository | Prisma / Postgres adapter (schema orch) |
| IdempotencyStore | Redis adapter (ioredis) |
| OperatorRouter | gRPC client adapter to routing-engine |
| EventPublisher | NATS JetStream adapter |
| Clock | System clock + test override |
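
To illustrate how a port separates the use cases from its adapter, here is a sketch of the Clock port with a test override, plus an in-memory stand-in for IdempotencyStore. The method names (`now`, `advance`, `claim`) and the in-memory class are assumptions; the real adapters are the ones listed above.

```typescript
interface Clock {
  now(): Date;
}

class SystemClock implements Clock {
  now(): Date {
    return new Date();
  }
}

// Test override: time only moves when the test advances it.
class FixedClock implements Clock {
  private current: Date;
  constructor(start: Date) {
    this.current = start;
  }
  now(): Date {
    return new Date(this.current.getTime());
  }
  advance(ms: number): void {
    this.current = new Date(this.current.getTime() + ms);
  }
}

interface IdempotencyStore {
  // Resolves true if the key was newly claimed, false if it is already held
  // (the same contract as Redis SET NX with an expiry).
  claim(key: string, ttlSeconds: number): Promise<boolean>;
}

// Hypothetical in-memory stand-in for tests; production would use the Redis adapter.
class InMemoryIdempotencyStore implements IdempotencyStore {
  private readonly expiresAt = new Map<string, number>();
  private readonly clock: Clock;
  constructor(clock: Clock) {
    this.clock = clock;
  }
  async claim(key: string, ttlSeconds: number): Promise<boolean> {
    const nowMs = this.clock.now().getTime();
    const existing = this.expiresAt.get(key);
    if (existing !== undefined && existing > nowMs) {
      return false; // still held by an earlier delivery
    }
    this.expiresAt.set(key, nowMs + ttlSeconds * 1000);
    return true;
  }
}
```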

3. Orchestration Sequence

The main pipeline is described in SERVICE_OVERVIEW §6. Retry and DLQ flows are documented in §7 of the DOMAIN_MODEL state machine.

4. Concurrency & Ordering

  • Consumer is push-type with configurable parallelism (default 16 in-flight per pod).
  • Per-messageId processing is naturally serialized because idempotency SET NX rejects concurrent redelivery.
  • Same-account ordering is NOT guaranteed by NATS; if ordering is required for a feature (it currently isn't) consumers can be partitioned by account.
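
If per-account ordering were ever needed, one way to partition is to hash the accountId to a stable partition index and give each partition a single ordered consumer. This is a sketch under that assumption; the function name, the use of SHA-256, and the partition count are illustrative and not part of the current design.

```typescript
import { createHash } from "node:crypto";

// Maps an account to a stable partition in [0, partitionCount).
function partitionForAccount(accountId: string, partitionCount: number): number {
  if (!Number.isInteger(partitionCount) || partitionCount <= 0) {
    throw new RangeError("partitionCount must be a positive integer");
  }
  const digest = createHash("sha256").update(accountId, "utf8").digest();
  // The first 4 bytes as an unsigned integer are enough for an even spread.
  return digest.readUInt32BE(0) % partitionCount;
}
```

All messages for one account then land on the same partition, so ordering within that account depends only on that partition's consumer processing one message at a time.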

5. Observability Hooks

  • OTel span per use-case with attributes sms.message_id, sms.tenant_id, sms.operator_id, sms.attempt.
  • Pino structured logs with the same fields.
  • Prometheus counters: orch_submit_total{result}, orch_pipeline_stage_total{stage,result}, orch_retry_total{attempt}, orch_dlq_total{reason}.
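
Since spans and logs carry the same fields, a single helper can build them once per pipeline step. The sketch below is library-free; the `PipelineContext` type and function name are hypothetical, while the attribute names come from the list above.

```typescript
// Hypothetical context carried through a use case.
interface PipelineContext {
  messageId: string;
  tenantId: string;
  operatorId?: string; // unknown until routing has completed
  attempt: number;
}

// Builds the shared field set used for both span attributes and log bindings.
function telemetryFields(ctx: PipelineContext): Record<string, string | number> {
  const fields: Record<string, string | number> = {
    "sms.message_id": ctx.messageId,
    "sms.tenant_id": ctx.tenantId,
    "sms.attempt": ctx.attempt,
  };
  if (ctx.operatorId !== undefined) {
    fields["sms.operator_id"] = ctx.operatorId;
  }
  return fields;
}
```

The returned object could be passed to an OTel span via `span.setAttributes(fields)` and to Pino via `logger.child(fields)`, which keeps the two outputs consistent.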