sms-firewall-service — Service Risk Register

Version: 1.0
Status: Draft
Owner: Trust and Safety + Security + SRE
Last Updated: 2026-04-21
References: FAILURE_MODES.md, SECURITY_MODEL.md, ADR-0004

This register lists known service-level risks with owners, mitigations, and residual-risk classification. Likelihood and Impact are each scored 1–5 and multiplied to give the pre-mitigation rating; residual risk must be ≤ Medium for GA.


1. Risk Summary

| ID | Risk | Category | Likelihood | Impact | Pre-mitigation | Residual | Owner |
|---|---|---|---|---|---|---|---|
| FW-RISK-01 | False-positive BLOCKs drop legitimate OTP traffic | Correctness | 4 | 5 | Critical | Medium | Trust & Safety |
| FW-RISK-02 | ML model biased against a specific MNO gateway | ML fairness | 3 | 4 | High | Medium | Trust & Safety + ML Ops |
| FW-RISK-03 | Adversarial homoglyph / encoded-payload bypass | Security | 3 | 4 | High | Low | Security |
| FW-RISK-04 | Blocklist-federation source delivers poisoned entries | Dependency | 2 | 5 | High | Low | Security |
| FW-RISK-05 | Audit hash-chain break loses regulator-defensibility | Correctness | 2 | 5 | High | Low | Trust & Safety |
| FW-RISK-06 | Fail-closed Postgres incident causes national SMS outage | Availability | 2 | 5 | High | Medium | SRE |
| FW-RISK-07 | Emergency-bypass abuse by privileged insider | Insider | 1 | 5 | Medium | Low | Security + Legal |
| FW-RISK-08 | Fingerprint-storm adversary exhausts cache / rate-limit | Adversarial | 3 | 3 | Medium | Low | Security |
| FW-RISK-09 | Cross-region blocklist-state divergence | Correctness | 2 | 3 | Medium | Low | Platform Arch |
| FW-RISK-10 | Rule rollback under pressure breaks enforcement | Process | 2 | 3 | Medium | Low | Trust & Safety |
| FW-RISK-11 | Legitimate bulk-sender blocked as AIT / SIM-box | ML | 3 | 3 | Medium | Low | Trust & Safety |
| FW-RISK-12 | Federation partner disputes Ghasi blocklist entries | Political | 2 | 3 | Medium | Medium | Regulator Liaison |
| FW-RISK-13 | GDPR subject-access request on historical block | Legal | 2 | 3 | Medium | Low | Legal |
| FW-RISK-14 | HSM outage pauses outbound federation export | Dependency | 2 | 2 | Low | Low | Security |
| FW-RISK-15 | ML model drift causes silent detection-rate degradation | ML | 3 | 4 | High | Medium | ML Ops |
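
The scoring rule and the GA gate can be sketched in a few lines of Python. The band cut-offs in `BANDS` are hypothetical: this register does not publish them, and they were chosen only so that every row in the summary table maps to its listed pre-mitigation rating. The function names are likewise illustrative.

```python
def risk_score(likelihood: int, impact: int) -> int:
    """Likelihood x Impact, each on the register's 1-5 scale."""
    for value in (likelihood, impact):
        if not 1 <= value <= 5:
            raise ValueError("likelihood and impact must be in 1..5")
    return likelihood * impact

# ASSUMPTION: upper bounds per band, inferred from the table rows only.
BANDS = [(4, "Low"), (9, "Medium"), (16, "High"), (25, "Critical")]
RANK = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}

def band(score: int) -> str:
    """Map a 1-25 score to a rating label."""
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise ValueError("score out of range")

def ga_ready(residuals) -> bool:
    """GA gate from the register: every residual must be <= Medium."""
    return all(RANK[r] <= RANK["Medium"] for r in residuals)
```

For example, `band(risk_score(4, 5))` gives `"Critical"`, matching FW-RISK-01, and `ga_ready` returns `False` as soon as any residual is High or Critical.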

2. Risk Details

FW-RISK-01 — False-positive BLOCKs drop OTP

Banking OTP traffic gets false-positive AIT classification during a spike.

Mitigation. Shadow-mode required for rule/model updates; automatic rollback on BLOCK rate > baseline + 50% for 10 min; per-tenant whitelist for design-partner banks; trusted-tenant fast-path (EP-CE-13) bypasses firewall for pre-approved templates; tenant escalation dashboard.

Residual. Medium.
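
The automatic-rollback trigger for FW-RISK-01 could look roughly like the sketch below. It assumes that "baseline + 50%" means 1.5 times the baseline BLOCK rate and that the rate is sampled once per minute, so the condition must hold for ten consecutive samples; neither detail is stated in this register, and `BlockRateGuard` is a hypothetical name.

```python
from collections import deque

class BlockRateGuard:
    """Fires when the BLOCK rate stays above threshold for a full window."""

    def __init__(self, baseline: float, margin: float = 0.5, window: int = 10):
        # ASSUMPTION: "+ 50%" is relative to the baseline rate.
        self.threshold = baseline * (1 + margin)
        # ASSUMPTION: one sample per minute, so window=10 covers 10 min.
        self.samples = deque(maxlen=window)

    def observe(self, block_rate: float) -> bool:
        """Record one sample; return True when rollback should trigger."""
        self.samples.append(block_rate)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(r > self.threshold for r in self.samples)
```

A single sample back under the threshold resets the condition, which keeps a brief spike from rolling back a rule or model update.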


FW-RISK-02 — ML bias against specific MNO

The AIT model's training data under-represents one MNO's legitimate bulk traffic.

Mitigation. Fairness audit on model (per-MNO recall/precision; fail CI if disparate recall > 15%); balanced training corpus; per-MNO block-rate monitoring post-launch; human-in-loop on highest-confidence only.

Residual. Medium.
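
A CI fairness gate for FW-RISK-02 might be sketched as follows. It interprets "disparate recall > 15%" as an absolute gap of more than 0.15 between the best and worst per-MNO recall, which is an assumption; the input shape (true positives and false negatives per MNO) and the function names are hypothetical.

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN); 0.0 when there are no positives."""
    return tp / (tp + fn) if tp + fn else 0.0

def max_recall_gap(per_mno: dict) -> float:
    """Gap between the best and worst per-MNO recall."""
    recalls = [recall(tp, fn) for tp, fn in per_mno.values()]
    return max(recalls) - min(recalls)

def fairness_gate(per_mno: dict, limit: float = 0.15) -> bool:
    """True when the build may pass (ASSUMPTION: absolute gap <= limit)."""
    return max_recall_gap(per_mno) <= limit
```

In a CI job, a `False` result from `fairness_gate` would fail the build before the model reaches shadow mode.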


FW-RISK-03 — Adversarial homoglyph / encoded-payload bypass

Attacker obfuscates content to bypass rules.

Mitigation. NFKC + TR39 normalisation at ingest; canonicalisation before match; 500+ homoglyph corpus test in CI; security review on new rule types.

Residual. Low.
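
As an illustration of canonicalisation before match, the sketch below applies NFKC normalisation and case folding from the Python standard library, then a confusables mapping. The three-entry `CONFUSABLES` table is a hypothetical stand-in for the full Unicode TR39 confusables data, which the standard library does not ship; the function names are also illustrative.

```python
import unicodedata

# ASSUMPTION: tiny stand-in for TR39 confusables (Cyrillic a, e, o -> Latin).
CONFUSABLES = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})

def canonicalise(text: str) -> str:
    """NFKC-normalise, case-fold, then map known confusable characters."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return folded.translate(CONFUSABLES)

def matches_rule(text: str, needle: str) -> bool:
    """Substring match performed on canonical forms of both sides."""
    return canonicalise(needle) in canonicalise(text)
```

Canonicalising both the message and the rule pattern means a rule written in plain ASCII still matches fullwidth or mixed-script variants of the same word.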


FW-RISK-04 — Poisoned federation entries

Compromised source pushes malicious entries.

Mitigation. Federation source auth via HSM-signed mutual certs; anomaly detection (a sudden batch of > 1 000 entries triggers review); dual-source corroboration for public-figure / bank-class entries; rollback capability.

Residual. Low.


FW-RISK-05 — Audit hash-chain break

Bug or tamper corrupts the chain.

Mitigation. Daily verifier; canonicalised payload (RFC 8785); two-implementation cross-check; weekly tamper-detection drill.

Residual. Low.
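
A daily verifier for the audit chain in FW-RISK-05 could follow the pattern below. This is a minimal sketch: the entry layout, the all-zero genesis value, and the use of SHA-256 are assumptions, and `canonical` only approximates RFC 8785 (sorted keys, no insignificant whitespace) rather than implementing its full number and key-ordering rules.

```python
import hashlib
import json

GENESIS = "0" * 64  # ASSUMPTION: hash value preceding the first entry

def canonical(payload: dict) -> bytes:
    """Approximate RFC 8785 canonical form; not a full JCS implementation."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")

def link(prev_hash: str, payload: dict) -> str:
    """Hash of the previous link concatenated with the canonical payload."""
    return hashlib.sha256(prev_hash.encode("ascii") + canonical(payload)).hexdigest()

def build(payloads: list) -> list:
    """Build a chain of {'payload', 'hash'} entries."""
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = link(prev, payload)
        chain.append({"payload": payload, "hash": prev})
    return chain

def first_break(chain: list):
    """Return the index of the first invalid entry, or None if intact."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        if link(prev, entry["payload"]) != entry["hash"]:
            return index
        prev = entry["hash"]
    return None
```

Reporting the first broken index, rather than a plain pass/fail, tells the operator where the last regulator-defensible entry sits.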


FW-RISK-06 — Fail-closed outage blocks national SMS

Postgres outage takes down firewall.

Mitigation. Postgres HA with auto-failover ≤ 30 s; Redis cache masks outages of up to 5 min; multi-region fail-over ≤ 15 min; emergency-bypass for P0/P1 only (dual-approval, time-boxed).

Residual. Medium.


FW-RISK-07 — Emergency-bypass insider abuse

Insider engages bypass for personal gain.

Mitigation. Dual-approval (CISO + CTO); time-boxed ≤ 1 h; prominent SIEM audit event; real-time alert to CEO + Board Secretary; quarterly engagement review.

Residual. Low.
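
The dual-approval and time-box controls for FW-RISK-07 reduce to a small check, sketched here. The role names come from the mitigation text; the function signature and the idea of passing the current time explicitly are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_ROLES = {"CISO", "CTO"}    # dual approval, per the mitigation
MAX_DURATION = timedelta(hours=1)   # time-box upper bound

def bypass_active(approver_roles, started_at: datetime,
                  duration: timedelta, now: datetime) -> bool:
    """True only if both roles approved and 'now' is inside the time-box."""
    if not REQUIRED_ROLES <= set(approver_roles):
        return False
    if duration > MAX_DURATION:
        return False
    return started_at <= now < started_at + duration
```

Evaluating the check on every request, instead of once at engagement, lets the bypass lapse on its own when the window closes.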


FW-RISK-08 — Fingerprint storm

Attacker rotates JA3 fingerprints at extreme rate.

Mitigation. Cloudflare + Kong edge absorbs the load; LFU cache; tarpit; scale-out; manual edge-filter runbook.

Residual. Low.


FW-RISK-09 — Cross-region divergence

Blocklist state differs between regions.

Mitigation. Logical replication with last-write-wins (LWW) conflict resolution; hourly reconciliation cron; alert when more than 100 rows remain divergent for 1 h.

Residual. Low.
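
The reconciliation step for FW-RISK-09 can be illustrated with a last-write-wins merge and a divergence count. The sketch assumes each region exposes its blocklist as a mapping from key to `(updated_at, value)`; that shape and the function names are hypothetical, and the 1 h persistence condition on the alert is not modelled here.

```python
def lww_merge(region_a: dict, region_b: dict) -> dict:
    """Merge two regions; for each key the row with the later timestamp wins."""
    merged = dict(region_a)
    for key, row in region_b.items():
        if key not in merged or row[0] > merged[key][0]:
            merged[key] = row
    return merged

def divergent_keys(region_a: dict, region_b: dict) -> set:
    """Keys whose rows differ or exist in only one region."""
    all_keys = region_a.keys() | region_b.keys()
    return {k for k in all_keys if region_a.get(k) != region_b.get(k)}

def should_alert(divergent_count: int, limit: int = 100) -> bool:
    """Alert threshold from the mitigation: more than 100 divergent rows."""
    return divergent_count > limit
```

An hourly job would call `divergent_keys` on the two regions, pass its size to `should_alert`, and apply `lww_merge` to converge the state.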


FW-RISK-10 — Rule rollback breaks enforcement

Protective rule rolled back under pressure; real abuse slips through.

Mitigation. Rule rollback requires reason + ticket; time-boxed (auto-re-enable after 7 d unless replaced); T&S lead sign-off required.

Residual. Low.


FW-RISK-11 — Legitimate bulk-sender flagged

High-volume legitimate use misclassified.

Mitigation. Pre-registered bulk-sender exemption; ML model consumes sender-ID reputation; human-in-loop on high volume; per-tenant tier whitelist.

Residual. Low.


FW-RISK-12 — Federation partner dispute

MNO disputes Ghasi's federation entries.

Mitigation. Per-entry provenance in export; conflict-resolution via Regulator Liaison; MNO attestation in federation agreement.

Residual. Medium.


FW-RISK-13 — GDPR subject-access on historical block

Citizen asks for data held about them.

Mitigation. MSISDN-hash tokenisation in audit; Legal-drafted response format; 30-d SLA.

Residual. Low.
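
MSISDN-hash tokenisation for FW-RISK-13 might be sketched as a keyed hash, as below. The choice of HMAC-SHA-256, the digit-only normalisation, and the function name are assumptions; this register states only that the audit log stores a hash of the MSISDN.

```python
import hashlib
import hmac

def tokenise_msisdn(msisdn: str, key: bytes) -> str:
    """Return a stable, keyed token for an MSISDN (ASSUMPTION: HMAC-SHA-256)."""
    # Strip formatting such as '+', spaces, and dashes before hashing.
    digits = "".join(ch for ch in msisdn if ch in "0123456789")
    return hmac.new(key, digits.encode("ascii"), hashlib.sha256).hexdigest()
```

A keyed hash means a subject-access lookup can recompute the token from the requester's number, while someone holding only the audit log cannot enumerate phone numbers without the key.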


FW-RISK-14 — HSM outage pauses federation export

Export can't be signed.

Mitigation. HSM HA with regional quorum; export queues; backup manual-signing with dual-control.

Residual. Low.


FW-RISK-15 — ML drift

Attacker traffic evolves; model silently loses recall.

Mitigation. Continuous model-accuracy monitoring (held-out test + weekly freshly-labelled corpus); drift alert on F1 drop > 5%; quarterly retraining cadence.

Residual. Medium.
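
The drift alert for FW-RISK-15 can be expressed as a comparison of F1 on the freshly-labelled corpus against a reference F1. The sketch assumes "F1 drop > 5%" means an absolute drop of more than 0.05; a relative interpretation is equally plausible and the register does not say which applies. The function names are illustrative.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def drift_alert(reference_f1: float, current_f1: float,
                max_drop: float = 0.05) -> bool:
    """True when F1 fell by more than max_drop (ASSUMPTION: absolute drop)."""
    return (reference_f1 - current_f1) > max_drop
```

Running this against the weekly corpus gives an earlier signal than waiting for the quarterly retraining cadence.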


3. Residual-Risk Summary

| Residual | Count | Acceptance |
|---|---|---|
| Low | 10 | Accepted for GA |
| Medium | 5 | Accepted with mitigation commitments and named owners |
| High | 0 | |

4. Risk Review Cadence

  • Weekly during development (Platform Arch).
  • Monthly post-GA (T&S + SRE + Security).
  • Quarterly (Regulator Liaison + Legal + CTO).