smpp-connector — Service Overview
Status: populated | Owner: Platform Engineering | Last updated: 2026-04-18 | Companion: DOMAIN_MODEL · API_CONTRACTS · EVENT_SCHEMAS
1. Purpose
smpp-connector is the SMPP 3.4 client service responsible for all physical communication with Mobile Network Operators (MNOs). It consumes outbound SMS dispatch commands from NATS, transmits submit_sm PDUs over persistent SMPP sessions, receives deliver_sm delivery receipt (DLR) PDUs, and publishes DLR events and operator health state changes back to the platform.
It is the only service in the Ghasi SMS Gateway that holds open TCP connections to external MNO SMPP servers.
2. Bounded Context
| Dimension | Value |
|---|---|
| Domain | SMPP Session Management / PDU Transmission / DLR Handling |
| Owner squad | Platform Engineering |
| Deployment unit | Kubernetes Deployment — smpp-connector (one deployment per operator group, or single deployment with per-operator connection pooling) |
| Communication style | Inbound: NATS JetStream (smpp.operator.{operatorId}) · Outbound: SMPP 3.4 TCP to MNO · Outbound events: NATS (sms.dlr.inbound, operator.health) |
| Storage | PostgreSQL schema smpp (message correlation) · Redis (TPS throttling) |
3. Responsibilities
| # | Responsibility |
|---|---|
| R1 | Establish and maintain SMPP 3.4 bind_transceiver sessions with each assigned operator |
| R2 | Consume smpp.operator.{operatorId} NATS messages and transmit submit_sm PDUs |
| R3 | Enforce per-operator TPS limits via Redis sliding-window rate limiting |
| R4 | Handle long messages: UCS-2 (0x08) + GSM-7 (0x00) encoding, CSMS segmentation or message_payload TLV |
| R5 | Receive deliver_sm DLR PDUs from MNOs and publish sms.dlr.inbound NATS events |
| R6 | Publish operator.health NATS events when session state changes (BOUND/UNBOUND/FAILBACK) |
| R7 | Implement exponential backoff reconnection (5 s → 10 s → 20 s … max 60 s) |
| R8 | Support primary/backup operator failover |
| R9 | Send enquire_link heartbeats every 30 s; mark the operator UNBOUND if no enquire_link_resp arrives within 10 s |
| R10 | Fetch SMPP credentials from operator-management-service internal API (Vault-backed) |
| R11 | Persist message correlation records in smpp.message_correlations |
| R12 | Expose /health, /metrics, /ready HTTP endpoints for Kubernetes probes |
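
The reconnection schedule in R7 can be illustrated with a small generator. This is a minimal sketch, not the service's implementation: the function name `backoff_delays` and its parameters are hypothetical, and the doubling factor (and therefore the 40 s step) is inferred from the 5 s → 10 s → 20 s sequence given above.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(
    initial_s: float = 5.0, factor: float = 2.0, cap_s: float = 60.0
) -> Iterator[float]:
    """Yield reconnect delays in seconds, growing by `factor` up to `cap_s`."""
    delay = initial_s
    while True:
        yield min(delay, cap_s)
        # Clamp the stored value too, so it never grows past the cap.
        delay = min(delay * factor, cap_s)


# First six delays with the assumed defaults.
first_six = list(islice(backoff_delays(), 6))
print(first_six)  # [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```

A caller would typically create a fresh generator after each successful bind so that the next outage starts again from the 5 s delay.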
4. Non-Responsibilities
- Does not decide which operator to use (that is routing-engine's responsibility)
- Does not front HTTP traffic via Kong (SMPP is a binary TCP protocol, not HTTP)
- Does not manage operator configurations or credentials (owned by operator-management-service)
- Does not perform billing calculations (owned by billing-service)
- Does not process inbound MO (Mobile Originated) SMS; it handles DLR receipts only
5. Upstream / Downstream Dependencies
| Direction | Service / System | Protocol | Purpose |
|---|---|---|---|
| Inbound event | sms-orchestrator (via NATS) | NATS JetStream smpp.operator.{operatorId} | Dispatch commands for outbound SMS |
| Outbound PDU | MNO SMPP server | SMPP 3.4 over TCP | Transmit submit_sm, receive deliver_sm |
| Outbound event | dlr-processor (via NATS) | NATS JetStream sms.dlr.inbound | Delivery receipt events |
| Outbound event | routing-engine + operator-management-service (via NATS) | NATS JetStream operator.health | Operator health state changes |
| Outbound API call | operator-management-service | HTTP (internal) | Fetch SMPP credentials for operator |
| Outbound cache | Redis | TCP | TPS sliding-window counters |
| Outbound DB | PostgreSQL smpp schema | TCP (pg driver) | Write/read message correlation records |
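
The TPS counters kept in Redis can be pictured with the in-memory sketch below. It is an illustration under stated assumptions, not the service's code: a plain dict stands in for Redis, the class `SlidingWindowTps` and its methods are hypothetical, and the window is assumed to be one second wide with `windowStart` taken as the epoch second. The sketch uses the common sliding-window-counter approximation, which weights the previous window's count by how much of it still overlaps the last second.

```python
import math
from typing import Dict


class SlidingWindowTps:
    """In-memory stand-in for per-operator TPS counters (hypothetical)."""

    def __init__(self, limit_tps: int) -> None:
        self.limit_tps = limit_tps
        self.counters: Dict[str, int] = {}  # plays the role of Redis keys

    @staticmethod
    def _key(operator_id: str, window_start: int) -> str:
        # Key shape follows the tps:{operatorId}:{windowStart} pattern.
        return f"tps:{operator_id}:{window_start}"

    def try_acquire(self, operator_id: str, now_s: float) -> bool:
        """Return True and count the send if it fits the operator's limit."""
        window_start = math.floor(now_s)
        elapsed_fraction = now_s - window_start
        current_key = self._key(operator_id, window_start)
        previous_key = self._key(operator_id, window_start - 1)
        current = self.counters.get(current_key, 0)
        previous = self.counters.get(previous_key, 0)
        # Part of the previous window still lies inside the last second.
        estimated = previous * (1.0 - elapsed_fraction) + current
        if estimated + 1 > self.limit_tps:
            return False
        self.counters[current_key] = current + 1
        return True


limiter = SlidingWindowTps(limit_tps=2)
decisions = [limiter.try_acquire("op-a", t) for t in (100.0, 100.1, 100.2)]
print(decisions)  # [True, True, False]
```

Against a real Redis instance the read-and-increment step would have to be atomic (for example inside a server-side script) and the keys would need an expiry; both are left out here.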
6. High-Level Flow
7. Key Design Decisions
| Decision | Rationale |
|---|---|
| bind_transceiver with fallback to bind_transmitter/bind_receiver | Transceiver is the most efficient single-connection mode; some older MNOs only support separate TX/RX binds |
| NATS subject per operator (smpp.operator.{operatorId}) | Allows independent consumer groups and TPS management per operator; simplifies scaling |
| Redis sliding-window for TPS throttling | Atomic, low-latency; avoids distributed lock complexity; tps:{operatorId}:{windowStart} pattern aligns with standard sliding-window implementations |
| Exponential backoff max 60 s | Prevents thundering-herd reconnect storms against MNO servers; 60 s cap ensures recovery within acceptable SLA |
| Credentials fetched from operator-management-service | Centralises secret rotation; smpp-connector never holds credentials at rest; Vault-backed ensures audit trail |
| message_payload TLV for long messages | Some MNOs reject CSMS; TLV is the preferred method for operators that support SMPP 3.4 optional parameters |
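
The long-message decision (GSM-7 versus UCS-2, then single PDU, message_payload TLV, or CSMS) can be sketched as follows. This is an illustrative sketch only: `plan_message`, `DispatchPlan`, and the `supports_message_payload` flag are hypothetical names, and the per-part limits (153 septets, 67 UCS-2 code units) assume a 6-byte concatenation header with an 8-bit reference, which this document does not specify. The alphabet tables follow the GSM 7-bit default alphabet and its extension table.

```python
from dataclasses import dataclass
from typing import List

GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")  # each costs two septets

DATA_CODING_GSM7 = 0x00
DATA_CODING_UCS2 = 0x08


@dataclass
class DispatchPlan:
    data_coding: int
    method: str  # "single", "message_payload", or "csms"
    parts: List[str]


def _unit_cost(ch: str, data_coding: int) -> int:
    """Septets (GSM-7) or UTF-16 code units (UCS-2) needed for one character."""
    if data_coding == DATA_CODING_GSM7:
        return 2 if ch in GSM7_EXTENDED else 1
    return 2 if ord(ch) > 0xFFFF else 1


def plan_message(text: str, supports_message_payload: bool) -> DispatchPlan:
    is_gsm7 = all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text)
    data_coding = DATA_CODING_GSM7 if is_gsm7 else DATA_CODING_UCS2
    single_limit, part_limit = (160, 153) if is_gsm7 else (70, 67)

    total = sum(_unit_cost(ch, data_coding) for ch in text)
    if total <= single_limit:
        return DispatchPlan(data_coding, "single", [text])
    if supports_message_payload:
        # Whole text travels in one submit_sm via the message_payload TLV.
        return DispatchPlan(data_coding, "message_payload", [text])

    # CSMS: pack greedily per character so that an escape sequence or a
    # surrogate pair is never split across two parts.
    parts: List[str] = []
    current: List[str] = []
    used = 0
    for ch in text:
        cost = _unit_cost(ch, data_coding)
        if used + cost > part_limit:
            parts.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += cost
    parts.append("".join(current))
    return DispatchPlan(data_coding, "csms", parts)


plan = plan_message("a" * 161, supports_message_payload=False)
print(plan.method, [len(p) for p in plan.parts])  # csms [153, 8]
```

Preferring message_payload whenever the operator supports it mirrors the rationale in the table above; encoding the parts into PDUs and attaching the concatenation header are outside the scope of this sketch.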