numbering-service — Service Risk Register
Version: 1.0
Status: Draft
Owner: Commerce Engineering + Platform SRE + Legal
Last Updated: 2026-04-21
Companion: FAILURE_MODES · SECURITY_MODEL · SERVICE_READINESS
Risk Scoring
| Likelihood (rows) × Impact (columns) | Low | Medium | High | Critical |
|---|---|---|---|---|
| Low | LOW | LOW | MEDIUM | HIGH |
| Medium | LOW | MEDIUM | HIGH | HIGH |
| High | MEDIUM | HIGH | HIGH | CRITICAL |
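
The matrix can be read as a plain lookup table. The following is a minimal Python sketch, assuming rows are likelihood and columns are impact as laid out in the table; `MATRIX` and `rating` are illustrative names and not part of the service.

```python
# Rating matrix transcribed from the Risk Scoring table.
# Rows are likelihood, columns are impact (assumption stated in the lead-in).
IMPACTS = ("Low", "Medium", "High", "Critical")
MATRIX = {
    "Low":    ("LOW", "LOW", "MEDIUM", "HIGH"),
    "Medium": ("LOW", "MEDIUM", "HIGH", "HIGH"),
    "High":   ("MEDIUM", "HIGH", "HIGH", "CRITICAL"),
}


def rating(likelihood: str, impact: str) -> str:
    """Return the rating cell for a likelihood/impact pair."""
    return MATRIX[likelihood][IMPACTS.index(impact)]


print(rating("Medium", "High"))    # HIGH
print(rating("Low", "Critical"))   # HIGH
```
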
1. Operational Risks
R-OPS-01 — MNO contract lapse causes assignment loss
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | High |
| Rating | HIGH |
| Description | If a Roshan / Etisalat-AF / MTN-AF / AWCC / Salaam contract for a prefix range expires without renewal, all MSISDNs in that block become un-assignable; existing leases honour their validUntil only as far as the contract permits. |
| Mitigation | (1) Daily contract-expiry alerts at 60 d / 30 d / 7 d before effective_until; (2) Commerce ops engagement starts 90 d ahead per playbook; (3) Multiple MNO relationships diversify risk; (4) MoU includes auto-renewal clause where MNO permits; (5) numbering.lease_contracts.status = EXPIRING flag surfaces in admin dashboard |
| Owner | Commerce ops + Legal |
| Review | Monthly |
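
The 60 d / 30 d / 7 d alert schedule can be sketched as a daily check. This is a hypothetical illustration, assuming an alert fires on the day the remaining time equals one of the thresholds exactly; `expiry_alert` is not an existing function of the service.

```python
from datetime import date
from typing import Optional

# Thresholds taken from mitigation (1).
ALERT_THRESHOLDS_DAYS = (60, 30, 7)


def expiry_alert(effective_until: date, today: date) -> Optional[int]:
    """Return the threshold (in days) that fires today, or None."""
    days_left = (effective_until - today).days
    return days_left if days_left in ALERT_THRESHOLDS_DAYS else None


# A contract ending 2026-06-20 is exactly 60 days out on 2026-04-21.
print(expiry_alert(date(2026, 6, 20), date(2026, 4, 21)))  # 60
```

A production job would more likely alert on "at or below threshold, not yet acknowledged" so that a missed run does not skip an alert; the exact-match rule here only keeps the sketch short.
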
R-OPS-02 — Short-code scarcity (national exhaustion)
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | High |
| Rating | HIGH |
| Description | ATRA-allocated short-code inventory is finite. As tenants demand 4-digit and 5-digit codes, scarcity rises. New ATRA allocations have multi-week lead times. |
| Mitigation | (1) NumberingShortCodeScarcityCritical alert when less than 10 % of platform-wide short-code inventory remains available; (2) ATRA allocation requests filed quarterly per quota; (3) Vanity-tier pricing dampens demand; (4) Capacity plan reviewed in the monthly commerce ops review |
| Owner | Commerce ops |
| Review | Monthly |
R-OPS-03 — Cross-tenant number-claim conflict
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | Medium |
| Rating | MEDIUM |
| Description | Two tenants race to lease the same AVAILABLE identifier. The CAS mechanism prevents an incorrect outcome (only one tenant can win), but high conflict rates degrade UX. |
| Mitigation | (1) Partial unique index + CAS guarantees correctness; (2) Customer portal shows live-updating availability; (3) Pessimistic in-portal "browse-then-lock" UI flow uses Reserve TTL; (4) Conflict rate visible on dashboard |
| Owner | Commerce Eng |
| Review | Quarterly |
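
The compare-and-set behaviour described in this risk can be illustrated with an in-memory toy. This is only a sketch: the `threading.Lock` stands in for the atomicity that the database provides in the real service, and `Inventory`, `try_lease`, and `race` are hypothetical names.

```python
import threading

AVAILABLE = ("AVAILABLE", None)


class Inventory:
    """Toy in-memory stand-in for the numbers table."""

    def __init__(self, identifiers):
        self._lock = threading.Lock()
        self._records = {ident: AVAILABLE for ident in identifiers}

    def compare_and_set(self, ident, expected, new):
        """Replace the record only if it still equals `expected`."""
        with self._lock:
            if self._records[ident] != expected:
                return False
            self._records[ident] = new
            return True

    def owner(self, ident):
        with self._lock:
            return self._records[ident][1]


def try_lease(inventory, ident, tenant):
    """Attempt AVAILABLE -> LEASED for `tenant`; False means a conflict."""
    return inventory.compare_and_set(ident, AVAILABLE, ("LEASED", tenant))


def race(inventory, ident, tenants):
    """Let every tenant try to lease the same identifier concurrently."""
    outcomes = {}

    def worker(tenant):
        outcomes[tenant] = try_lease(inventory, ident, tenant)

    threads = [threading.Thread(target=worker, args=(t,)) for t in tenants]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes


inv = Inventory(["4455"])
result = race(inv, "4455", ["tenant-a", "tenant-b"])
print(sum(result.values()))  # exactly one winner
```

The losing tenant receives `False` rather than an inconsistent lease, which is the correctness property; the UX cost is that the loser has to pick another identifier.
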
R-OPS-04 — Reservation TTL imprecision under Redis outage
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Medium |
| Rating | MEDIUM |
| Description | If Redis keyspace notifications are dropped, reservation cleanup is delayed up to 60 s (safety-net cron interval). Tenants may see stale "unavailable" entries. |
| Mitigation | (1) Safety-net cron at 60 s; (2) PG expires_at is the source of truth; (3) Browse endpoint joins to PG, never to Redis-only state; (4) Alert if cleanup lag > 5 s P95 |
| Owner | Platform SRE |
| Review | Quarterly |
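
The safety-net sweep can be sketched as a function that releases every reservation whose `expires_at` has passed, independent of any Redis notification. This is a minimal illustration using a dict in place of the PG table; `sweep_expired` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def sweep_expired(reservations: dict, now: datetime) -> list:
    """Release reservations whose expires_at has passed; return their ids.

    `reservations` maps identifier -> expires_at and stands in for the
    PG table that the register names as the source of truth.
    """
    expired = sorted(i for i, expires_at in reservations.items() if expires_at <= now)
    for ident in expired:
        del reservations[ident]
    return expired


now = datetime(2026, 4, 21, 12, 0, tzinfo=timezone.utc)
table = {
    "4455": now - timedelta(seconds=30),  # notification was dropped
    "4466": now + timedelta(minutes=5),   # still held
}
print(sweep_expired(table, now))  # ['4455']
```

Run every 60 s, a sweep of this shape bounds the cleanup delay at one interval even when every keyspace notification is lost.
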
R-OPS-05 — Quarantine cool-off vs. tenant urgency trade-off
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | Medium |
| Rating | MEDIUM |
| Description | A tenant with a legitimate need to re-lease a recalled identifier (e.g., one cancelled by accident) cannot do so during the 90-d MSISDN quarantine. This is a tension between abuse prevention and tenant flexibility. |
| Mitigation | (1) Admin-override path with a mandatory 20-char justification; (2) Overrides are audit-logged; (3) Daily report of overrides surfaced in the compliance dashboard (see R-SEC-04 for the abuse this guards against); (4) Tenant T&Cs disclose cool-off semantics upfront |
| Owner | Commerce ops + Legal |
| Review | Quarterly |
R-OPS-06 — Multi-region split-brain (CAS divergence)
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | High |
| Rating | MEDIUM |
| Description | Network partition between kbl and mzr could in theory allow concurrent writes that bypass CAS quorum. |
| Mitigation | (1) Synchronous quorum on numbers/leases via Patroni; (2) Nightly reconciliation cron detects divergences and emits number.conflict.detected.v1; (3) ADR-0004 §14 explicitly chooses safety over availability — service degrades to read-only on partition |
| Owner | Platform DBA + SRE |
| Review | Quarterly |
2. Security Risks
R-SEC-01 — Compromised admin tier-overrides quotas
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | High |
| Rating | MEDIUM |
| Description | An attacker with platform.numbering.admin could uplift a tenant's quotas, allowing them to corner the inventory or impersonate brands via alpha-IDs. |
| Mitigation | (1) MFA enforced for platform admin accounts; (2) All admin actions audit-logged with hash-chain; (3) Audit events replicated to SIEM; (4) Anomaly detection on quota changes (fraud-intel); (5) Two-person rule planned for Phase 4 |
| Owner | Security |
| Review | Quarterly |
R-SEC-02 — Forged MNO CSV import (compromised signing key)
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Critical |
| Rating | HIGH |
| Description | If an MNO's RSA signing key is compromised, an attacker could submit a malicious CSV granting them control of inventory blocks. |
| Mitigation | (1) Signing keys rotated quarterly with grace overlap; (2) Each import logs file_sha256, and duplicates are flagged; (3) Out-of-band confirmation via the MNO commercial team for blocks > 10 k MSISDNs; (4) CFO sign-off required before ingesting ranges > 100 k MSISDNs; (5) Vault stores public keys only; private keys stay with the MNO |
| Owner | Security + Commerce ops |
| Review | Quarterly |
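
Two of these controls are easy to express in code: the size-based approval gates and the `file_sha256` duplicate check. The sketch below is hypothetical. It assumes the two thresholds are cumulative (a block above 100 k needs both approvals), and the approval labels and function names are invented for illustration.

```python
import hashlib

# Thresholds taken from mitigations (3) and (4).
OOB_CONFIRMATION_ABOVE = 10_000
CFO_SIGNOFF_ABOVE = 100_000


def required_approvals(block_size: int) -> list:
    """Return the approvals a block of `block_size` MSISDNs needs."""
    approvals = []
    if block_size > OOB_CONFIRMATION_ABOVE:
        approvals.append("MNO_OUT_OF_BAND_CONFIRMATION")  # hypothetical label
    if block_size > CFO_SIGNOFF_ABOVE:
        approvals.append("CFO_SIGN_OFF")  # hypothetical label
    return approvals


def is_duplicate_import(csv_bytes: bytes, seen_digests: set) -> bool:
    """Record the file's SHA-256 and report whether it was seen before."""
    digest = hashlib.sha256(csv_bytes).hexdigest()
    if digest in seen_digests:
        return True
    seen_digests.add(digest)
    return False


print(required_approvals(50_000))
print(required_approvals(250_000))
```

Neither check replaces signature verification; they limit the damage if a forged file carries a valid signature.
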
R-SEC-03 — Audit hash-chain tampering
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Critical |
| Rating | HIGH |
| Description | Someone with DB-level access (e.g., via a DBA tool) could attempt to alter the audit chain, hiding evidence from ATRA. |
| Mitigation | (1) Postgres rules reject UPDATE / DELETE on numbering.audit; (2) Daily audit-chain-verify cron raises a CRITICAL alert and halts regulator export on a break; (3) NATS numbering.audit.v1 mirror is replicated to analytics-service as an independent copy; (4) Cold archive to S3 object-lock (WORM) for 7 y |
| Owner | Security + Compliance |
| Review | Monthly |
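
A hash chain of this kind can be verified by recomputing each entry's hash from its predecessor. The sketch below is a generic illustration, not the service's actual record layout: the field names, the all-zero genesis value, and the JSON canonicalisation are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append a new entry linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev,
                  "hash": entry_hash(prev, payload)})


def first_break(chain: list):
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None


log = []
for action in ("LEASE", "RECALL", "OVERRIDE"):
    append(log, {"action": action, "identifier": "4455"})
print(first_break(log))          # None

log[1]["payload"]["action"] = "LEASE"  # simulate tampering
print(first_break(log))          # 1
```

Altering any entry invalidates its own hash, and rewriting that hash breaks the link from the next entry, so a verifier walking from the genesis value finds the earliest modified position.
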
R-SEC-04 — Phisher rotation via recall-and-re-lease
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | High |
| Rating | HIGH |
| Description | A bad actor recalls their own number and immediately re-leases it under a new tenant identity to evade reputation tracking. |
| Mitigation | (1) Quarantine cool-off (90 d for MSISDNs); (2) The same tenant cannot re-lease during cool-off (no self-bypass); (3) Admin override requires a justification and is rate-monitored as an anomaly signal; (4) Alpha-IDs are unique platform-wide: although their cool-off is 0 d, the inventory ledger reserves a REVOKED alpha-ID for 90 d, so it cannot be re-leased to a different tenant during that window |
| Owner | Trust & Safety + Commerce |
| Review | Monthly |
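
The MSISDN cool-off check can be sketched as follows. This is a hypothetical illustration: it reads the "20-char justification" from R-OPS-05 as a minimum length, which is an assumption, and the reason labels other than QUARANTINE_ACTIVE are invented for the example.

```python
from datetime import datetime, timedelta, timezone

MSISDN_QUARANTINE = timedelta(days=90)  # from mitigation (1)
MIN_JUSTIFICATION_CHARS = 20            # assumption: "20-char" read as a minimum


def available_at(revoked_at: datetime) -> datetime:
    """Earliest time a recalled MSISDN can be leased again without override."""
    return revoked_at + MSISDN_QUARANTINE


def check_release(revoked_at: datetime, now: datetime, override_justification=None):
    """Return (allowed, reason) for a re-lease attempt on a recalled MSISDN."""
    if now >= available_at(revoked_at):
        return True, "COOL_OFF_ELAPSED"        # hypothetical label
    if (override_justification is not None
            and len(override_justification.strip()) >= MIN_JUSTIFICATION_CHARS):
        return True, "ADMIN_OVERRIDE"          # hypothetical label
    return False, "QUARANTINE_ACTIVE"


revoked = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(available_at(revoked).date())  # 2026-04-01
print(check_release(revoked, datetime(2026, 2, 1, tzinfo=timezone.utc)))
```

Note that the check is keyed on the identifier's revocation time, not on the requesting tenant, so a new tenant identity gains nothing during the window.
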
R-SEC-05 — Tenant Reserve flood (inventory scrape / DoS)
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | Medium |
| Rating | MEDIUM |
| Description | A malicious tenant scripts thousands of Reserves to scrape inventory or block other tenants. |
| Mitigation | (1) maxActiveReservations quota; (2) Per-tenant Reserve rate limit (60/min); (3) RESERVATION_BURST signal → fraud-intel → tenant tier downgrade; (4) Auto-release TTL prevents permanent blocking |
| Owner | Trust & Safety + Platform Eng |
| Review | Quarterly |
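
A per-tenant limit of 60 Reserves per minute can be enforced with a sliding-window counter. The sketch below is one possible shape, not the service's implementation; the class name and the choice of a sliding window over a token bucket are assumptions.

```python
from collections import defaultdict, deque


class ReserveRateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window_s` per tenant."""

    def __init__(self, limit: int = 60, window_s: float = 60.0):
        self.limit = limit
        self.window_s = window_s
        self._calls = defaultdict(deque)  # tenant -> timestamps of accepted calls

    def allow(self, tenant: str, now: float) -> bool:
        calls = self._calls[tenant]
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True


limiter = ReserveRateLimiter()
accepted = sum(limiter.allow("tenant-a", 0.0) for _ in range(100))
print(accepted)  # 60
```

Rejected calls are a natural source for the RESERVATION_BURST signal, since a tenant that repeatedly hits the limit is behaving like a scraper.
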
R-SEC-06 — Cross-tenant data leak via API enumeration
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Medium |
| Rating | MEDIUM |
| Description | A tenant could attempt to enumerate other tenants' leases by guessing identifiers and observing ValidateLease responses. |
| Mitigation | (1) Cross-tenant ValidateLease calls return WRONG_TENANT with no lease detail; (2) Negative cache with a 30 s TTL absorbs enumeration attempts; (3) Tenant portal browse is filtered to AVAILABLE, so it never exposes other tenants' identifiers; (4) gRPC mTLS limits direct ValidateLease calls to internal services |
| Owner | Security |
| Review | Quarterly |
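
The detail-free response and the negative cache can be sketched together. This is a hypothetical illustration: the response shape, the `NegativeCache` class, and the choice to answer unknown identifiers with the same WRONG_TENANT status are assumptions made for the example, not documented behaviour.

```python
class NegativeCache:
    """Remembers recent negative lookups for `ttl_s` seconds."""

    def __init__(self, ttl_s: float = 30.0):
        self.ttl_s = ttl_s
        self._expiry = {}  # key -> time at which the entry stops being valid

    def put(self, key, now: float) -> None:
        self._expiry[key] = now + self.ttl_s

    def hit(self, key, now: float) -> bool:
        expires = self._expiry.get(key)
        if expires is None:
            return False
        if now >= expires:
            del self._expiry[key]
            return False
        return True


def validate_lease(leases: dict, cache: NegativeCache, caller: str,
                   identifier: str, now: float) -> dict:
    """Return VALID for the caller's own lease, else a detail-free refusal."""
    key = (caller, identifier)
    if cache.hit(key, now):
        return {"status": "WRONG_TENANT"}  # served without touching `leases`
    if leases.get(identifier) == caller:
        return {"status": "VALID"}
    cache.put(key, now)
    # Assumption: unknown and foreign identifiers look identical to the caller.
    return {"status": "WRONG_TENANT"}


leases = {"4455": "tenant-a"}
cache = NegativeCache()
print(validate_lease(leases, cache, "tenant-b", "4455", now=0.0))
```

The refusal carries no owner, expiry, or lease ID, so repeated guesses yield no information beyond the status itself, and repeated guesses within 30 s never reach the backing store.
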
3. Compliance / Regulatory Risks
R-REG-01 — Regulator-export gaps cause ATRA non-compliance
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | High |
| Rating | MEDIUM |
| Description | A missing or rejected monthly export to ATRA could result in regulatory action against the platform. |
| Mitigation | (1) Cron at 01:00 UTC on the 1st with multiple alerting layers; (2) Manual generation endpoint as fallback; (3) Status tracking from PENDING → ACCEPTED with notification; (4) ATRA submission SOP owned by Legal + Compliance |
| Owner | Legal + Commerce ops |
| Review | Monthly |
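
An alerting layer for the export needs to know when the next run is due. The following is a minimal sketch of that calculation for the "01:00 UTC on the 1st" schedule; `next_export_run` is a hypothetical helper and expects a timezone-aware `datetime`.

```python
from datetime import datetime, timezone


def next_export_run(now: datetime) -> datetime:
    """Return the next 01:00 UTC on the 1st of a month strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(day=1, hour=1, minute=0, second=0, microsecond=0)
    if candidate > now:
        return candidate
    if now.month == 12:
        return candidate.replace(year=now.year + 1, month=1)
    return candidate.replace(month=now.month + 1)


print(next_export_run(datetime(2026, 4, 21, 12, 0, tzinfo=timezone.utc)))
# 2026-05-01 01:00:00+00:00
```

A watchdog could compare the time of the last recorded export against this value and alert when a scheduled run has passed without one.
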
R-REG-02 — Quarantine policy challenged by tenant SLAs
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Medium |
| Rating | LOW |
| Description | An enterprise tenant with brand-critical identifiers may demand a shorter cool-off, conflicting with platform abuse-prevention policy. |
| Mitigation | (1) T&Cs disclose cool-off semantics; (2) Admin override path exists with audit; (3) Vanity-tier purchases shift renewal dynamics (less recall churn); (4) Annual policy review with ATRA |
| Owner | Legal + Product |
| Review | Annual |
R-REG-03 — Cross-border data flow for multi-region
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Medium |
| Rating | LOW |
| Description | If mzr region were ever placed outside Afghan jurisdiction, sovereignty rules could be breached. |
| Mitigation | (1) Both kbl and mzr are within Afghanistan; (2) ADR-0004 §14 explicitly mandates in-country regions; (3) S3 buckets region-pinned to AF data centres |
| Owner | Legal + Platform SRE |
| Review | Annual |
R-REG-04 — Numbering-plan ATRA reallocation
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | High |
| Rating | MEDIUM |
| Description | ATRA may reissue or reshuffle MSISDN ranges (Number Portability changes, MNC consolidation). Existing leases may need migration. |
| Mitigation | (1) ATRA liaison participates in quarterly planning calls; (2) LeaseContract has effective_until; (3) Migration tooling in roadmap (Phase 4) for bulk renumbering scenarios |
| Owner | Legal + Commerce ops |
| Review | Quarterly |
4. Product / Business Risks
R-BUS-01 — Vanity short-code monopolisation
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | Medium |
| Rating | MEDIUM |
| Description | A single tenant could reserve all premium vanity codes for brand defence, blocking competitors. |
| Mitigation | (1) Per-tenant maxLeasedShortCode quota; (2) Vanity-tier pricing premium; (3) ATRA could intervene if monopoly emerges; (4) Auto-recall on non-payment (vanity has 14-d grace) |
| Owner | Commerce + Legal |
| Review | Quarterly |
R-BUS-02 — False-positive QUARANTINE_ACTIVE damages tenant trust
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | Medium |
| Rating | LOW |
| Description | Tenants expecting an immediately recyclable number get blocked, leading to support tickets. |
| Mitigation | (1) UX surfaces availableAt in the error response; (2) Customer portal shows the quarantine timeline visually; (3) Admin override path for legitimate cases; (4) Tenant T&Cs and onboarding explain quarantine semantics clearly |
| Owner | Product |
| Review | Quarterly |
R-BUS-03 — Reservation-floods cause inventory-scarcity perception
| Attribute | Value |
|---|---|
| Likelihood | Medium |
| Impact | Low |
| Rating | LOW |
| Description | Honest tenants browsing concurrently see the same numbers shown as reserved by others, creating a perception of low availability. |
| Mitigation | (1) Reservation TTL (15 min) caps the damage; (2) Browse endpoint surfaces "reserved by other tenant; available again at HH:MM"; (3) Pool capacity dashboard for ops visibility |
| Owner | Product + Commerce ops |
| Review | Quarterly |
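
The browse message follows directly from the reservation time and the 15-minute TTL. This is a small sketch, assuming the time is rendered in the same timezone as the stored timestamp; `browse_label` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

RESERVATION_TTL = timedelta(minutes=15)  # from mitigation (1)


def browse_label(reserved_at: datetime) -> str:
    """Build the browse message for an identifier reserved by another tenant."""
    available_again = reserved_at + RESERVATION_TTL
    return f"reserved by other tenant; available again at {available_again:%H:%M}"


print(browse_label(datetime(2026, 4, 21, 9, 50, tzinfo=timezone.utc)))
# reserved by other tenant; available again at 10:05
```
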
R-BUS-04 — MNO-renumbering migration disruption
| Attribute | Value |
|---|---|
| Likelihood | Low |
| Impact | High |
| Rating | MEDIUM |
| Description | An MNO consolidates / renumbers prefixes (e.g., MTN-AF MNC change), forcing re-mapping. |
| Mitigation | (1) Migration tooling planned (Phase 4); (2) MNO MoUs include 90-d advance notice for renumbering; (3) Tenant-facing preserved-identifier abstraction (alpha-ID stays identical) |
| Owner | Commerce ops + Platform Eng |
| Review | Annual |
5. Risk Summary Matrix
| Risk ID | Rating | Owner | Review |
|---|---|---|---|
| R-OPS-01 | HIGH | Commerce ops + Legal | Monthly |
| R-OPS-02 | HIGH | Commerce ops | Monthly |
| R-OPS-03 | MEDIUM | Commerce Eng | Quarterly |
| R-OPS-04 | MEDIUM | Platform SRE | Quarterly |
| R-OPS-05 | MEDIUM | Commerce ops + Legal | Quarterly |
| R-OPS-06 | MEDIUM | Platform DBA + SRE | Quarterly |
| R-SEC-01 | MEDIUM | Security | Quarterly |
| R-SEC-02 | HIGH | Security + Commerce ops | Quarterly |
| R-SEC-03 | HIGH | Security + Compliance | Monthly |
| R-SEC-04 | HIGH | Trust & Safety + Commerce | Monthly |
| R-SEC-05 | MEDIUM | Trust & Safety + Platform Eng | Quarterly |
| R-SEC-06 | MEDIUM | Security | Quarterly |
| R-REG-01 | MEDIUM | Legal + Commerce ops | Monthly |
| R-REG-02 | LOW | Legal + Product | Annual |
| R-REG-03 | LOW | Legal + Platform SRE | Annual |
| R-REG-04 | MEDIUM | Legal + Commerce ops | Quarterly |
| R-BUS-01 | MEDIUM | Commerce + Legal | Quarterly |
| R-BUS-02 | LOW | Product | Quarterly |
| R-BUS-03 | LOW | Product + Commerce ops | Quarterly |
| R-BUS-04 | MEDIUM | Commerce ops + Platform Eng | Annual |
End of SERVICE_RISK_REGISTER.md