Channel Router Service — Security Model
Version: 1.0
Status: Draft
Owner: Messaging Core + Platform Security
Last Updated: 2026-04-21
Companion: SERVICE_OVERVIEW · DATA_MODEL · API_CONTRACTS
Related: ADR-0004 §11–§12 (sovereignty, mesh identity), docs/standards/SECURITY_BASELINE.md
The channel-router is a hot-path data-plane service handling OTT credentials, raw MSISDN, message bodies in transit to providers, and tenant HMAC secrets. Compromise implies the ability to (a) send messages under any tenant's sender-ID, (b) exfiltrate national opt-out metadata, or (c) spoof MO traffic to tenant webhooks. Security posture is default-deny, fail-closed on auth, PII-minimising.
1. Authentication
1.1 Inter-service (mTLS via SPIRE)
| Caller | Identity (SPIFFE ID) | Required scope |
|---|---|---|
| sms-orchestrator | spiffe://ghasi.af/ns/np-data/sa/sms-orchestrator | rpc:channel.route |
| admin-dashboard backend | spiffe://ghasi.af/ns/np-ctrl/sa/admin-dashboard | rpc:channel.admin.read, rpc:channel.admin.write |
| tenant-portal backend | spiffe://ghasi.af/ns/np-ctrl/sa/tenant-portal | rpc:channel.tenant.* |
| webhook-dispatcher | spiffe://ghasi.af/ns/np-data/sa/webhook-dispatcher | rpc:channel.mo.internal |
| compliance-engine | spiffe://ghasi.af/ns/np-data/sa/compliance-engine | (consumer only; no gRPC caller path) |
All gRPC ports (:50071, :50072) require client-cert with SPIFFE ID matching the allow-list. SVID rotation every 1 h via SPIRE. Unauthenticated TCP is rejected at the mesh sidecar (Envoy).
1.2 REST (Kong-fronted, JWT)
- JWT signed by auth-service (RS256); public keys refreshed every 60 min.
- Claims required: sub, tenantId (for tenant-scoped endpoints), roles[], exp.
- Kong plugins: jwt, rate-limiting-advanced, ip-restriction (tenant-admin endpoints limited to tenant-registered IP CIDRs).
1.3 OTT provider webhook ingress
| Provider | Auth mechanism |
|---|---|
| WhatsApp Cloud | X-Hub-Signature-256: sha256=<hex(HMAC_SHA256(appSecret, rawBody))>; compare constant-time; rejected on mismatch with 401 SIGNATURE_INVALID |
| Telegram | Secret path component (/v1/webhooks/telegram/{secretPath}) + optional X-Telegram-Bot-Api-Secret-Token header |
| Viber | X-Viber-Content-Signature: HMAC_SHA256(authToken, rawBody) |
Signatures are verified before body parse. The appSecret / authToken is pulled from Vault per-tenant; constant-time comparison prevents timing side-channel. Invalid-signature counter chan_webhook_signature_invalid_total{provider} — alert on rate > 10/min.
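As an illustration of the WhatsApp Cloud row, the check can be sketched in a few lines of Python. This is a minimal sketch rather than the service's implementation: the function names `verify_whatsapp_signature` and `handle_webhook` are hypothetical, and the secret is passed in directly instead of being fetched from Vault.

```python
import hashlib
import hmac


def verify_whatsapp_signature(app_secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Return True only if header_value equals 'sha256=' + hex(HMAC_SHA256(app_secret, raw_body))."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret, raw_body, hashlib.sha256).hexdigest()
    presented = header_value[len(prefix):]
    # compare_digest takes time independent of where the two values differ.
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))


def handle_webhook(app_secret: bytes, raw_body: bytes, header_value: str) -> tuple[int, str]:
    # The signature is checked against the raw bytes; parsing happens only after it passes.
    if not verify_whatsapp_signature(app_secret, raw_body, header_value):
        return 401, "SIGNATURE_INVALID"
    return 200, "OK"
```

The HMAC is computed over the exact bytes received, so the handler needs access to the unparsed request body; re-serialising parsed JSON would generally not reproduce the signed payload.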
2. Authorisation (RBAC)
| Role | Scope | Endpoints |
|---|---|---|
| platform.channel.admin | Platform-wide | All /v1/channel/* endpoints except tenant portal |
| platform.support | Platform-wide read-only | GET-only; body always masked |
| tenant.admin | tenantId scope | Fallback policy CRUD, inbound-route CRUD, webhook secret rotation |
| tenant.support | tenantId scope | Session inspector, profile read (hashed MSISDN only) |
| tenant.developer | tenantId scope | Sandbox DeliverNow, webhook URL update only |
| citizen | Self only (MSISDN-OTP verified) | GET /v1/channel/citizen/profile only |
RBAC enforced in Kong (coarse) + NestJS guard (fine), plus Postgres RLS for defence-in-depth:
CREATE POLICY profile_tenant_scope ON chan.recipient_profiles
USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
Handlers issue SET LOCAL app.current_tenant_id = :jwt.tenantId per request inside a dedicated PG transaction, so the setting is scoped to that transaction. A cross-tenant lookup query returns zero rows, not an error.
3. Data protection
3.1 PII classification
| Field | Class | Protection |
|---|---|---|
| msisdn (raw) | CONFIDENTIAL-PII | Accepted on gRPC; hashed within 10 µs; never stored raw in PG; never logged |
| msisdn_hash | SENSITIVE | Salted SHA-256; salt per-tenant in Vault; rotatable but deterministic |
| body | SENSITIVE | Transits to providers; not stored in analytics; delivery_attempts.raw_provider_payload retains redacted snippet 30 d |
| tenant_inbound_routes.secret_ref | SECRET | Vault path only; plaintext secret never touches PG or logs |
| channel_adapter_configs.secret_ref | SECRET | Same as tenant_inbound_routes.secret_ref |
| conversations.msisdn_hash | SENSITIVE | Same as msisdn_hash |
| audit.before / after | MIXED | Pre-redact MSISDN, OTP codes, secrets before write |
3.2 Encryption
- In transit: all inter-service = mTLS (SPIRE SVID, TLS 1.3); all provider egress = TLS 1.2+; tenant-webhook egress = TLS 1.2+ with cert pinning optional per tenant.
- At rest: PG data volumes encrypted via LUKS (dm-crypt); sensitive JSONB columns (adapter_config_id.secret_ref) reference Vault paths (no envelope encryption needed at DB level).
- Backups: encrypted with per-environment KMS key; cold archive (S3) uses per-tenant DEK wrapped by HSM KEK.
3.3 Secret storage
| Secret | Storage | Rotation |
|---|---|---|
| OTT provider tokens (WhatsApp, Telegram, Viber) | Vault secrets/data/chan/ott/{tenantId}/{provider} | 60 s propagation via chan.ott_account.rotated.v1 |
| Tenant HMAC webhook secrets | Vault secrets/data/chan/webhook/{tenantId}/{inbound} | 24 h grace accepting old+new |
| DB credentials | Vault dynamic secrets; 1 h TTL | Auto-rotated per Vault policy |
| Meta app-secret (for webhook signature) | Vault secrets/data/chan/meta/app_secret | Rotated per-quarter; manual |
| Triton inference client token | Vault | Short-lived; SPIFFE-based |
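The 24 h grace window for tenant webhook secrets can be pictured as a verifier that tries the current secret first and falls back to the previous one only while the window is open. The sketch below is illustrative only: `WebhookSecrets`, `verify_with_grace`, and the hex-encoded HMAC-SHA256 signature format are assumptions, not details taken from this document.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

GRACE = timedelta(hours=24)  # grace period from the rotation column above


@dataclass(frozen=True)
class WebhookSecrets:
    current: bytes
    previous: Optional[bytes]       # None if the secret was never rotated
    rotated_at: Optional[datetime]  # when `current` replaced `previous`


def _matches(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))


def verify_with_grace(secrets: WebhookSecrets, raw_body: bytes,
                      signature_hex: str, now: datetime) -> bool:
    """Accept the current secret always, the previous secret only inside the grace window."""
    if _matches(secrets.current, raw_body, signature_hex):
        return True
    in_grace = (
        secrets.previous is not None
        and secrets.rotated_at is not None
        and now - secrets.rotated_at <= GRACE
    )
    return in_grace and _matches(secrets.previous, raw_body, signature_hex)
```

Passing `now` explicitly keeps the window check testable without patching the clock.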
4. MSISDN hashing
msisdnHash = sha256(lowercase(E.164) || ":" || tenantSalt)
- tenantSalt is per-tenant, fetched from Vault at pod startup and cached 60 min.
- Same MSISDN → different hash across tenants (prevents cross-tenant linkage).
- Hash is one-way; reversing it requires the salt plus a brute-force pass over the MSISDN space, which is feasible for the Afghan numbering plan (~30 M MSISDNs) if the salt is compromised. Mitigation: the salt rotation procedure triggers full hash re-computation via cold-path backfill.
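The formula at the top of this section translates directly into code. In this minimal sketch the UTF-8 encoding, the hex output, and the string-typed salt are assumptions; the sample number and salts are made up.

```python
import hashlib


def msisdn_hash(msisdn_e164: str, tenant_salt: str) -> str:
    """sha256(lowercase(E.164) || ":" || tenantSalt), returned as hex."""
    material = msisdn_e164.lower() + ":" + tenant_salt
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# The same MSISDN yields unrelated hashes under different tenant salts.
hash_tenant_a = msisdn_hash("+93700000000", "salt-tenant-a")
hash_tenant_b = msisdn_hash("+93700000000", "salt-tenant-b")
```

Because the function is deterministic for a given salt, lookups by hash keep working until the salt is rotated, which is why rotation requires the backfill described above.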
5. Consent enforcement (hot-path security property)
RouteWithFallback refuses dispatch when consent-ledger-service.CheckConsent cannot be satisfied:
- Channel missing consent → excluded from ladder with reason recipient_opt_out.
- consent-ledger unreachable past 10 ms deadline and cache entry missing → REFUSED_CONSENT_UNKNOWN (fail-closed).
- Consent-check cache TTL 60 s; invalidated by consent.revoked.v1.
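The three rules above can be sketched as a small gate in front of the ladder. Everything named here is hypothetical (`ConsentGate`, `ConsentUnavailable`, the string return values, and the shape of the `check_consent` callable), and the sketch assumes an expired cache entry counts as missing.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_S = 60.0   # consent-check cache TTL from the list above
DEADLINE_S = 0.010   # 10 ms deadline from the list above


class ConsentUnavailable(Exception):
    """Raised by the (hypothetical) client when consent-ledger misses the deadline."""


class ConsentGate:
    def __init__(self, check_consent: Callable[[str, str, float], bool],
                 clock: Callable[[], float] = time.monotonic):
        self._check = check_consent
        self._clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def decide(self, msisdn_hash: str, channel: str) -> str:
        key = (msisdn_hash, channel)
        cached = self._cache.get(key)
        if cached is not None and self._clock() - cached[1] < CACHE_TTL_S:
            allowed = cached[0]
        else:
            try:
                allowed = self._check(msisdn_hash, channel, DEADLINE_S)
            except ConsentUnavailable:
                # Fail-closed: no fresh answer and no usable cache entry.
                return "REFUSED_CONSENT_UNKNOWN"
            self._cache[key] = (allowed, self._clock())
        return "ALLOW" if allowed else "EXCLUDED:recipient_opt_out"

    def on_consent_revoked(self, msisdn_hash: str) -> None:
        # consent.revoked.v1 handler: drop every cached decision for this recipient.
        for key in [k for k in self._cache if k[0] == msisdn_hash]:
            del self._cache[key]
```

Note that only the unreachable-and-uncached case refuses the whole request; a known opt-out merely removes that channel from the ladder.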
Attack defence: an attacker who bypasses sms-orchestrator and calls channel-router directly (requires compromising mesh SVID) still cannot send to opted-out recipients.
6. Audit (append-only, hash-chained)
All state-changing operations emit channel.audit.v1 and persist a chan.audit row:
record_hash = sha256( canonical_json(payload) || prev_hash )
Daily cron chan.audit.verifier verifies the previous 24 h chain; break → ChannelAuditChainBroken (Critical). Retention 13 m hot + 7 y cold.
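To make the chaining rule concrete, the sketch below appends records and locates the first broken link. It is not the verifier's actual code: the canonical JSON form (sorted keys, compact separators), the empty genesis hash, and the helper names are all assumptions.

```python
import hashlib
import json
from typing import List, Tuple

GENESIS = b""  # assumption: the first record chains from an empty prev_hash


def canonical_json(payload: dict) -> bytes:
    # Assumption: sorted keys and compact separators define the canonical form.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def record_hash(payload: dict, prev_hash: bytes) -> bytes:
    """record_hash = sha256(canonical_json(payload) || prev_hash)."""
    return hashlib.sha256(canonical_json(payload) + prev_hash).digest()


def append(chain: List[Tuple[dict, bytes]], payload: dict) -> None:
    prev = chain[-1][1] if chain else GENESIS
    chain.append((payload, record_hash(payload, prev)))


def first_break(chain: List[Tuple[dict, bytes]], start_prev: bytes = GENESIS) -> int:
    """Return the index of the first record whose stored hash does not verify, or -1."""
    prev = start_prev
    for i, (payload, stored) in enumerate(chain):
        if record_hash(payload, prev) != stored:
            return i
        prev = stored
    return -1
```

A verifier covering only the previous 24 h would pass the last hash before the window as `start_prev` instead of starting from the genesis value.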
Examples of audited actions:
- Fallback-policy edits
- Adapter credential rotation
- Manual circuit-breaker actions
- Manual session close
- Inbound-route changes
- Webhook secret rotation
7. Fail-closed behaviours
| Condition | Behaviour |
|---|---|
| consent-ledger unreachable + cache miss | REFUSED_CONSENT_UNKNOWN |
| compliance-engine unreachable + cache miss | REFUSED_COMPLIANCE_UNKNOWN |
| sender-id-registry unreachable + cache miss | REFUSED_SENDER_UNAUTHORIZED |
| Postgres unavailable (write path) | gRPC UNAVAILABLE; orchestrator redelivers |
| Vault unavailable at pod start | Pod crashes into CrashLoopBackOff (no boot without credentials) |
| CHAN_EXTERNAL_LLM_ENABLED=true on startup | Pod refuses to boot (sovereignty guard) |
| Provider webhook signature mismatch | 401; no body parse; no correlation |
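The sovereignty guard row could look roughly like the following startup check. This is a sketch under assumptions: the exception type and function name are hypothetical, and the flag is assumed to be parsed case-insensitively with an unset value treated as disabled.

```python
import os
from typing import Mapping


class SovereigntyGuardError(RuntimeError):
    """Raised at startup when a setting would allow inference outside hosted infra."""


def enforce_startup_guards(env: Mapping[str, str] = os.environ) -> None:
    # Assumption: only the literal value "true" (any case) enables the flag.
    flag = env.get("CHAN_EXTERNAL_LLM_ENABLED", "false").strip().lower()
    if flag == "true":
        raise SovereigntyGuardError(
            "CHAN_EXTERNAL_LLM_ENABLED=true is not permitted; refusing to boot"
        )
```

Calling this before any listener is opened means a misconfigured pod never serves traffic; the uncaught exception exits the process and the orchestrator surfaces the failure.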
Fail-degraded (not fail-closed) for adapter-level issues:
- OTT provider API down → breaker opens → ladder-step skip (not refuse)
- Redis unavailable → PG direct (higher latency; correctness preserved)
8. NetworkPolicy
Ingress:
- :50071 is reachable only from np-data mesh sidecars with allowed SPIFFE IDs (see §1.1).
- :50072 is reachable only from admin-dashboard and tenant-portal mesh pods.
- :3071 is reachable only from Kong ingress (np-edge).
- :9061 (metrics) is reachable only from the prometheus pod in np-obs.
Egress:
- Postgres chan schema in np-data.
- Redis Sentinel in np-data.
- NATS in np-data.
- Vault in np-ctrl.
- OTT provider HTTPS:
  - graph.facebook.com:443 (WhatsApp Cloud; allow-listed egress IP pool)
  - api.telegram.org:443
  - chatapi.viber.com:443
- Voice OTP gateway (internal gRPC).
- SMTP egress (mail-egress IP pool).
- Tenant webhooks via webhook-dispatcher (channel-router does not egress tenant webhooks directly, for separation of concerns).
All egress traverses a forward proxy with outbound DLP rules (no PII may leave to non-allow-listed domains).
9. Threat model
| Threat | Impact | Mitigation |
|---|---|---|
| Attacker compromises OTT credential in Vault | Send-any-message as tenant | Vault + HSM KEK; per-tenant separation; 60 s rotation propagation |
| Spoofed provider webhook | False delivered status; billing fraud | HMAC signature verification; constant-time compare |
| MSISDN enumeration via profile endpoint | Privacy loss | RLS + tenant-scope JWT; profile endpoint returns 404 on cross-tenant; rate-limited (US-CHAN per-tenant cap) |
| Tenant webhook URL hijack | MO body exfiltration | webhook-dispatcher enforces HTTPS-only + cert-pinning option; DNS-TTL monitoring; regular URL revalidation |
| Malicious MO body reaches tenant unmasked | XSS on tenant dashboard | Tenant responsibility; platform forwards raw body to the tenant's webhook only; no platform-side UI renders MO body unredacted |
| Cross-tenant profile read via SQLi | PII leak | Parameterised queries only; RLS defence-in-depth; Static application security testing gate in CI |
| Unauthorised policy change → cost attack | Tenant billed for unexpected OTT fallback | tenant.admin role required; audit trail; alert on ladder-length increase > 2 steps or costCapPerMessage increase > 3× |
| Insider abuse: operator sends admin messages | Unauthorised outbound | platform.channel.admin actions audited; dual-approval via Jira required for cross-tenant DeliverNow |
| OTT ToS violation (e.g. WhatsApp) | Provider account suspension | Circuit-breaker on 429/403 spikes; fraud-intel channel-abuse signals force breaker open |
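The alert thresholds in the cost-attack row can be expressed as a comparison between the policy before and after an edit. The sketch below is hypothetical throughout: the `FallbackPolicy` shape and alert names are invented for illustration, and "increase > 3×" is read as the new cap exceeding three times the old one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FallbackPolicy:
    ladder: Tuple[str, ...]        # ordered channel names
    cost_cap_per_message: float


def policy_change_alerts(before: FallbackPolicy, after: FallbackPolicy) -> List[str]:
    """Return the alerts a policy edit should raise, per the thresholds in the table."""
    alerts: List[str] = []
    if len(after.ladder) - len(before.ladder) > 2:
        alerts.append("ladder_length_increase")
    # Assumption: a zero or negative previous cap is never used as a ratio baseline.
    if before.cost_cap_per_message > 0 and \
            after.cost_cap_per_message > 3 * before.cost_cap_per_message:
        alerts.append("cost_cap_increase")
    return alerts
```

Running the comparison on the same before/after pair that is written to the audit row keeps the alert and the audit trail consistent.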
10. Compliance alignment
- Data sovereignty (ADR-0004 §11): all data at rest and all inference stays within Afghan-hosted infra; the startup guard requires CHAN_EXTERNAL_LLM_ENABLED=false.
- GDPR-style erasure (tenant-scoped per Afghan data-protection regulation): on consent.erasure.requested.v1, the service (a) deletes recipient_profiles, (b) tokenises msisdn_hash in conversations and delivery_attempts, (c) purges cached entries.
- Audit (regulator-defensibility): append-only hash-chained per-region; 13 m hot + 7 y cold.
- WhatsApp Business Policy: enforced via compliance-engine template-approval state; channel-router refuses to dispatch a WhatsApp step on a rejected template.
- Telegram ToS (user-initiated): channel-router dispatches Telegram only when telegram_chat_id exists and last_seen_at is within 30 d.
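The Telegram eligibility rule reduces to a small predicate. In this sketch the function name and parameter types are assumptions, and the 30 d boundary is treated as inclusive.

```python
from datetime import datetime, timedelta
from typing import Optional

TELEGRAM_ACTIVITY_WINDOW = timedelta(days=30)  # "within 30 d" from the rule above


def telegram_dispatch_allowed(telegram_chat_id: Optional[str],
                              last_seen_at: Optional[datetime],
                              now: datetime) -> bool:
    """True only when a chat id exists and the recipient was seen inside the window."""
    if not telegram_chat_id or last_seen_at is None:
        return False
    return now - last_seen_at <= TELEGRAM_ACTIVITY_WINDOW
```

A recipient who fails this check would simply have the Telegram step skipped, leaving the remaining ladder steps to be evaluated as usual.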