cbc-bridge-service — Service Risk Register
Version: 1.0
Status: Draft
Owner: Government / Emergency + Security + SRE + Regulator Liaison
Last Updated: 2026-04-21
References: FAILURE_MODES.md, SECURITY_MODEL.md, ADR-0004
This register lists known service-level risks with owners, mitigations, and residual classification. The risk landscape is dominated by political and regulatory dependencies (MNO MoUs, PKI authority, legal frameworks) rather than pure engineering risks. Each risk is scored 1–5 for Likelihood and for Impact; residual risk must be ≤ Medium for GA.
1. Risk Summary
| ID | Risk | Category | Likelihood | Impact | Pre-mitigation | Residual | Owner |
|---|---|---|---|---|---|---|---|
| CBC-RISK-01 | National-PKI CA not established at launch | Regulatory | 4 | 5 | Critical | Medium | Regulator Liaison + Legal |
| CBC-RISK-02 | MNO CBE endpoints not available or undocumented | Dependency | 4 | 5 | Critical | Medium | MNO Partnerships |
| CBC-RISK-03 | Drill misfires and reaches real subscribers as emergency | Process | 2 | 5 | High | Low | Government / Emergency |
| CBC-RISK-04 | Emergency broadcast bandwidth exceeds MNO CBE capacity | Infra | 2 | 3 | Medium | Low | MNO Partnerships |
| CBC-RISK-05 | Unauthorised broadcast via compromised caller cert | Security | 2 | 5 | High | Low | Security + Legal |
| CBC-RISK-06 | PKI signature bypass via bug in HSM integration | Security | 1 | 5 | High | Low | Security |
| CBC-RISK-07 | Audit chain break loses regulator-defensibility | Correctness | 2 | 5 | High | Low | Government / Emergency |
| CBC-RISK-08 | Cross-region coordination failure during multi-region incident | Operations | 2 | 4 | Medium | Low | SRE |
| CBC-RISK-09 | Translation errors in high-severity broadcast cause panic/misinformation | Correctness | 3 | 5 | High | Medium | Content + Trust & Safety |
| CBC-RISK-10 | Geographic targeting error (broadcast reaches wrong area due to stale cell DB) | Correctness | 2 | 4 | Medium | Low | SRE + MNO Partnerships |
| CBC-RISK-11 | Monthly drill cadence missed → regulator escalation | Process | 2 | 2 | Low | Low | Government / Emergency |
| CBC-RISK-12 | Replay-attack window exploitation | Adversarial | 2 | 4 | Medium | Low | Security |
| CBC-RISK-13 | MNO CBE vendor protocol change breaks adapter | Dependency | 3 | 3 | Medium | Medium | SRE + MNO Partnerships |
| CBC-RISK-14 | HSM unavailability during real emergency | Availability | 2 | 5 | High | Medium | SRE + Security |
| CBC-RISK-15 | Political change in national-PKI authority | Regulatory | 2 | 4 | Medium | Medium | Regulator Liaison |
| CBC-RISK-16 | Legal liability for cross-border cell-broadcast (e.g., near-border MNO cell reaches foreign subscribers) | Legal | 2 | 3 | Medium | Low | Legal |
| CBC-RISK-17 | False-authenticity: citizens lose trust after seeing drill-mistaken-for-emergency or vice-versa | Reputation | 2 | 4 | Medium | Low | Government / Emergency + PR |
2. Risk Details
CBC-RISK-01 — National-PKI CA not established
Scenario. Afghanistan has no formally-recognised National PKI Certification Authority at platform launch. Government clients use ad-hoc self-issued certs.
Impact. No cryptographic basis for "authorised government caller". Verification becomes subject-name-match only, which is forgeable.
Mitigation.
- Phased rollout (Phase 0 engagement with Legal + regulator).
- Interim model: Ghasi operates a Government Trust Anchor whose issuance is dual-controlled by Government Liaison + CISO. Each caller cert signed by this anchor.
- Migrate to formal National PKI when available; the AuthorisedCaller table retains issuer-chain so migration is rebind-not-reissue.
- Legal MoU with each government agency defines cert issuance procedure + revocation.
Residual risk. Medium — the interim Government Trust Anchor is defensible but not ideal.
CBC-RISK-02 — MNO CBE endpoints not available
Scenario. Not all Afghan MNOs expose standard 3GPP CBE interfaces; some use vendor-proprietary protocols with non-public documentation.
Impact. Without MNO CBE integration, there's no service.
Mitigation.
- Per-MNO MoU (Phase 0) documents protocol + endpoint + credentials + SLAs.
- Adapter abstraction supports proprietary protocols (Ericsson + Huawei adapters shipped).
- Fallback: high-throughput SMS A2P as interim emergency channel (NOT a CBC substitute, but covers emergency messaging when CBC is unavailable), designed into channel-router-service.
- Continuous MNO engagement; onboarding playbook per new MNO.
Residual risk. Medium — dependent on MNO cooperation.
CBC-RISK-03 — Drill misfires as real emergency
Scenario. A bug in the drill scheduler assigns real severity instead of drill severity, or an admin mis-clicks and submits a drill as a P0 emergency.
Impact. Real emergency alarm reaches millions of subscribers for a drill — panic, liability.
Mitigation.
- Drill uses a CBS test Message Identifier (for example 4380, Required Monthly Test, or 4381, Exercise, per 3GPP TS 23.041), distinct from the emergency MIs (standard 4370/4371/4372); handsets display a drill banner.
- Drill broadcasts include localised "DRILL — NO ACTION REQUIRED" prefix in every language.
- is_drill=true is enforced at the domain layer: the CBS encoder refuses to encode a drill without the test MI.
- Drill scheduling requires platform-admin role (not caller-initiated).
- Public test channel (EP-CBC-04 / US-CBC-017) pre-announces drill schedule.
Residual risk. Low.
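The domain-layer rule above (no drill without a test MI) can be sketched as a small guard in front of the encoder. This is an illustration only, not the service code: `Broadcast`, `encode_guard`, and `DrillEncodingError` are hypothetical names, and the test MI values are assumptions to be replaced with the deployment's test-range identifiers.

```python
from dataclasses import dataclass

# Emergency MIs as listed in this register.
EMERGENCY_MIS = frozenset({4370, 4371, 4372})
# Assumed test-range MIs; substitute the values agreed for this deployment.
TEST_MIS = frozenset({4380, 4381})


class DrillEncodingError(ValueError):
    """Raised when a broadcast violates the drill / emergency MI separation."""


@dataclass(frozen=True)
class Broadcast:
    message_id: int
    body: str
    is_drill: bool


def encode_guard(b: Broadcast) -> Broadcast:
    """Return the broadcast unchanged if its MI matches its drill flag."""
    if b.is_drill and b.message_id not in TEST_MIS:
        raise DrillEncodingError("drill must use a test-range Message Identifier")
    if not b.is_drill and b.message_id in TEST_MIS:
        raise DrillEncodingError("real broadcast must not use a test-range Message Identifier")
    return b
```

Checking both directions means a scheduler bug that flips either the flag or the MI is rejected before anything reaches an MNO.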
CBC-RISK-04 — MNO CBE capacity exceeded
Scenario. During a national emergency (earthquake + multiple simultaneous civil-defence broadcasts), MNO CBE queues overflow.
Impact. Delayed delivery.
Mitigation.
- Per-MNO CBE capacity documented in MoU.
- Platform rate-limits broadcast-submissions per severity; P1/P2 throttled if P0 active.
- Retry logic: queue excess broadcasts for sequential dispatch.
- MNO NOC coordination runbook.
Residual risk. Low.
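A minimal sketch of the severity throttling and sequential queueing described above, assuming severities are encoded as integers (0 for P0, 1 for P1, 2 for P2) and that each dispatch tick has a per-MNO capacity taken from the MoU. `SeverityQueue` is a hypothetical name.

```python
import heapq
from itertools import count


class SeverityQueue:
    """Dispatch queue that holds back P1/P2 while any P0 is pending."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a severity

    def submit(self, severity: int, broadcast_id: str) -> None:
        heapq.heappush(self._heap, (severity, next(self._seq), broadcast_id))

    def next_batch(self, capacity: int) -> list:
        """Return up to `capacity` broadcast IDs for this dispatch tick."""
        batch = []
        # P0 is "active" if the most urgent queued item is a P0.
        p0_active = bool(self._heap) and self._heap[0][0] == 0
        while self._heap and len(batch) < capacity:
            severity = self._heap[0][0]
            if p0_active and severity > 0:
                break  # throttle lower severities until the next tick
            batch.append(heapq.heappop(self._heap)[2])
        return batch
```

For example, with one P1, two P0, and one P2 queued, a tick with capacity 3 dispatches only the two P0 broadcasts; the P1 and P2 follow on the next tick.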
CBC-RISK-05 — Compromised caller cert
Scenario. Government-client machine compromised; attacker uses legitimate caller cert to send unauthorised broadcast.
Impact. False emergency reaches subscribers.
Mitigation.
- Cert revocation via CRL + OCSP; revoked certs rejected within 4 h (CRL cache) or immediately (OCSP-stapled).
- Dual-control for cancellation of in-flight unauthorised broadcasts.
- Anomaly detection: unusual broadcast from a caller triggers manual review before dispatch.
- Cert rotation every 90 d; long-lived certs not allowed.
- HSM-held private key on government-client side (mandated in cert-issuance procedure).
- SIEM monitoring of caller activity patterns.
Residual risk. Low.
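The 90 d rotation limit and the 4 h CRL cache bound above can be expressed as a single acceptance check. The sketch below is an assumption-laden illustration: `cert_acceptable` is a hypothetical helper, the CRL is modelled as a set of revoked serials, and failing closed on a stale CRL cache is an assumed policy rather than something this register states.

```python
from datetime import datetime, timedelta, timezone

MAX_CERT_LIFETIME = timedelta(days=90)  # rotation limit from the mitigation list
MAX_CRL_AGE = timedelta(hours=4)        # CRL cache bound from the mitigation list


def cert_acceptable(not_before, not_after, serial, crl_serials, crl_fetched_at, now):
    """Return (accepted, reason) for a caller certificate."""
    if not_after - not_before > MAX_CERT_LIFETIME:
        return False, "lifetime exceeds 90 d"
    if not (not_before <= now <= not_after):
        return False, "outside validity period"
    if now - crl_fetched_at > MAX_CRL_AGE:
        # Assumed policy: refuse rather than trust an out-of-date CRL.
        return False, "CRL cache stale"
    if serial in crl_serials:
        return False, "revoked"
    return True, "ok"
```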
CBC-RISK-06 — PKI bypass via HSM bug
Scenario. Integration bug in @ghasi/hsm-client allows signature verification to return TRUE without actual HSM call.
Impact. Unauthenticated broadcasts accepted.
Mitigation.
- Extensive PKI-bypass adversarial corpus in CI (500+ crafted attacks).
- Two-implementation cross-check: a second, independent verifier (Node in-process openssl) re-checks every signature and its result is compared with the HSM result; any divergence blocks dispatch.
- HSM call-level audit (every verify logged with session ID).
- Quarterly security review of HSM integration code.
- Pen test before GA.
Residual risk. Low.
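The two-implementation cross-check above reduces to a simple rule: both verifiers must agree, and disagreement is treated as an incident rather than as a rejection. A minimal sketch, with the two verifiers passed in as plain callables (the real HSM and openssl calls are out of scope here, and all names are hypothetical):

```python
from typing import Callable

Verifier = Callable[[bytes, bytes], bool]  # (payload, signature) -> valid?


class VerifierDivergence(RuntimeError):
    """The two verifiers disagreed; dispatch must be blocked."""


def cross_checked_verify(primary: Verifier, secondary: Verifier,
                         payload: bytes, signature: bytes) -> bool:
    """Return the shared verdict, or raise if the verifiers diverge."""
    a = primary(payload, signature)
    b = secondary(payload, signature)
    if a != b:
        raise VerifierDivergence(f"primary={a} secondary={b}")
    return a
```

Raising on divergence, instead of returning `False`, lets the caller distinguish a bad signature from a suspected integration bug.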
CBC-RISK-07 — Audit chain break
Scenario. Bug or tamper corrupts the hash chain.
Impact. Regulator-defensibility lost for affected period.
Mitigation.
- Daily verifier + tamper-detection drill.
- Canonical JSON (RFC 8785 JCS) eliminates serialisation ambiguity.
- Two independent implementations cross-check.
- Postgres trigger rejects UPDATE/DELETE on cbc.audit.
Residual risk. Low.
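To illustrate what the daily verifier checks, the sketch below builds and verifies a SHA-256 hash chain over canonicalised entries. It is not the service's implementation: the record layout is assumed, and `json.dumps` with sorted keys and compact separators only approximates RFC 8785 for simple payloads (full JCS also fixes number and string serialisation).

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed sentinel for the first record's predecessor


def canonical(payload: dict) -> bytes:
    # Approximation of JCS: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def _digest(prev_hash: str, payload: dict) -> str:
    return hashlib.sha256(prev_hash.encode("ascii") + canonical(payload)).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers the previous hash and the payload."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {"payload": payload, "prev_hash": prev, "hash": _digest(prev, payload)}
    chain.append(record)
    return record


def first_break(chain: list) -> Optional[int]:
    """Return the index of the first inconsistent record, or None if intact."""
    prev = GENESIS
    for i, rec in enumerate(chain):
        if rec["prev_hash"] != prev or rec["hash"] != _digest(prev, rec["payload"]):
            return i
        prev = rec["hash"]
    return None
```

Because each hash covers its predecessor, editing any stored payload makes `first_break` report that record's index.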
CBC-RISK-08 — Cross-region coordination failure
Scenario. kbl ↔ mzr network partition during emergency; both regions attempt to dispatch same broadcast → duplicate.
Impact. Subscriber receives duplicate emergency; possible confusion.
Mitigation.
- Broadcasts are region-pinned (accepted only in the region where first received).
- Correlation-ID dedup across regions (cached in Redis within each region with cross-region reconciliation).
- MNO CBE adapters deduplicate by Serial Number (per 3GPP TS 23.041).
- Post-partition reconciliation cron.
Residual risk. Low.
CBC-RISK-09 — Translation errors cause panic
Scenario. Pashto / Dari / Arabic translation of a P0 emergency broadcast is mistranslated (e.g., "evacuate" rendered as "remain").
Impact. Loss of life.
Mitigation.
- Pre-approved template library (per EP-CE-13): emergency broadcasts use templates only.
- Translation review by native speakers (Trust & Safety + Content team) for every template.
- Emergency-broadcast templates reviewed quarterly and re-attested by NDMA.
- Broadcast submission requires translator attestation ID.
- Real-time render preview to government-client before submission.
- Post-broadcast review with NDMA to detect miscommunication.
Residual risk. Medium. Natural-language translation carries irreducible risk; the mitigations reduce it but do not eliminate it.
CBC-RISK-10 — Geographic targeting error
Scenario. Cell-database stale or incomplete; polygon targeting misses intended area or reaches unintended area.
Impact. Unintended subscribers receive the alert; intended subscribers miss it.
Mitigation.
- Weekly cell-DB refresh per MNO.
- Cell-DB coverage report + alert when < 95% of national area covered per MNO.
- Named-region (province/district) targeting less error-prone than polygon — preferred for government clients.
- Pre-dispatch preview returns resolved cell count + coverage; government client confirms before proceed.
- Cross-check polygon resolution against multiple MNO cell DBs.
Residual risk. Low.
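The pre-dispatch preview above can be pictured as resolving a target polygon against the cell database and reporting the matched cell count per MNO, so that a stale or sparse database for one operator stands out before the client confirms. The sketch below assumes each cell is represented by a single (lon, lat) point and treats coordinates as planar, which is only reasonable for small areas; `Cell`, `point_in_polygon`, and `preview` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    cell_id: str
    mno: str
    location: tuple  # (lon, lat)


def point_in_polygon(p, polygon) -> bool:
    """Ray-casting test; `polygon` is a list of vertices, implicitly closed."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def preview(polygon, cells) -> dict:
    """Return the number of resolved cells per MNO for the target polygon."""
    counts = {}
    for c in cells:
        if point_in_polygon(c.location, polygon):
            counts[c.mno] = counts.get(c.mno, 0) + 1
    return counts
```

A large gap between the per-MNO counts for the same polygon is a cheap signal that one operator's cell data needs a refresh.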
CBC-RISK-11 — Drill cadence missed
Scenario. Scheduler pod crashes over a month-end weekend.
Impact. Regulator escalation; minor reputation.
Mitigation.
- Scheduler pod has health probe + automatic restart.
- CbcDrillOverdue alert fires at +7 d past cadence.
- Manual drill trigger available to platform admin.
- Quarterly audit of drill history.
Residual risk. Low.
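The overdue condition above is plain date arithmetic. In the sketch below the monthly cadence is approximated as 30 days, which is an assumption; the 7-day grace period comes from the alert threshold in the mitigation list. `drill_overdue` is a hypothetical helper, not the alert rule itself.

```python
from datetime import date, timedelta

CADENCE = timedelta(days=30)  # assumption: "monthly" approximated as 30 days
GRACE = timedelta(days=7)     # alert threshold: +7 d past cadence


def drill_overdue(last_drill: date, today: date) -> bool:
    """True once `today` is at least 7 days past the expected drill date."""
    return today >= last_drill + CADENCE + GRACE
```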
CBC-RISK-12 — Replay attack
Scenario. Attacker captures a legitimate broadcast request + replays it hours later.
Impact. Duplicate broadcast.
Mitigation.
- Signature timestamp window (5 min).
- Nonce per-cert cache (Redis TTL 10 min).
- Correlation-ID uniqueness enforced.
Residual risk. Low.
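The timestamp window and nonce cache above work together: the nonce TTL (10 min) is longer than the signature window (5 min), so a captured request cannot be replayed while its timestamp is still acceptable. A minimal sketch, using an in-memory dict as a stand-in for the Redis cache and a hypothetical `ReplayGuard` class:

```python
import time

WINDOW_S = 5 * 60      # signature timestamp window
NONCE_TTL_S = 10 * 60  # nonce cache TTL


class ReplayGuard:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._seen = {}  # (cert_id, nonce) -> expiry time in seconds

    def accept(self, cert_id: str, nonce: str, signed_at: float) -> bool:
        """Accept a request once, and only inside the timestamp window."""
        now = self._clock()
        # Drop expired nonces (Redis would do this via TTL).
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        if abs(now - signed_at) >= WINDOW_S:
            return False  # stale or too far in the future
        key = (cert_id, nonce)
        if key in self._seen:
            return False  # replay within the window
        self._seen[key] = now + NONCE_TTL_S
        return True
```

The clock is injectable so the behaviour can be exercised in tests without waiting for real time to pass.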
CBC-RISK-13 — MNO CBE vendor protocol change
Scenario. MNO upgrades CBE; protocol breaks existing adapter.
Impact. That MNO receives no broadcasts until adapter updated.
Mitigation.
- Adapter abstraction → new adapter deployed without full-service redeploy.
- MNO 30-d advance notice of protocol changes (in MoU).
- Standard3gppCbeAdapter as fallback where the MNO also supports standard.
- Continuous-integration against MNO staging endpoints (weekly smoke).
Residual risk. Medium — MNO notice reliability is variable.
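One way to picture the adapter abstraction with a standard-protocol fallback is shown below. Only the name Standard3gppCbeAdapter comes from this register; the `CbeAdapter` interface, `VendorAdapter`, `AdapterUnavailable`, and `dispatch_with_fallback` are hypothetical, and the adapters are stubs that do no network I/O.

```python
from typing import Optional, Protocol


class AdapterUnavailable(RuntimeError):
    """The adapter could not talk to the MNO CBE (for example after a protocol change)."""


class CbeAdapter(Protocol):
    def dispatch(self, broadcast_id: str) -> None: ...


class Standard3gppCbeAdapter:
    def dispatch(self, broadcast_id: str) -> None:
        pass  # stub: would speak the standard interface


class VendorAdapter:
    def __init__(self, healthy: bool):
        self._healthy = healthy

    def dispatch(self, broadcast_id: str) -> None:
        if not self._healthy:
            raise AdapterUnavailable("vendor protocol mismatch")


def dispatch_with_fallback(primary: CbeAdapter, fallback: Optional[CbeAdapter],
                           broadcast_id: str) -> str:
    """Try the MNO's primary adapter, then the standard adapter if one is configured."""
    try:
        primary.dispatch(broadcast_id)
        return "primary"
    except AdapterUnavailable:
        if fallback is None:
            raise  # this MNO has no standard interface to fall back to
        fallback.dispatch(broadcast_id)
        return "fallback"
```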
CBC-RISK-14 — HSM unavailability during emergency
Scenario. HSM cluster outage coincides with real emergency.
Impact. New broadcasts blocked; existing in-flight complete.
Mitigation.
- HSM HA + regional quorum (ADR-0004 §11).
- Emergency manual-dispatch runbook (CISO + CTO + Government Liaison dual-control) — out-of-band to MNO NOCs.
- Fallback emergency channel via channel-router-service high-throughput SMS (not cell-broadcast, but reaches subscribers).
- HSM fail-over tested quarterly in GameDay.
Residual risk. Medium. Low probability combined with very high impact; accepted with prominent mitigation.
CBC-RISK-15 — Political change in national-PKI authority
Scenario. Afghan government re-orgs the agency responsible for PKI; new issuer doesn't honour existing certs.
Impact. All caller certs invalidated; service offline.
Mitigation.
- Trust anchor migration runbook — add new root CA while retaining old for transition period.
- Ghasi Government Trust Anchor remains as the stable intermediate layer (per CBC-RISK-01 mitigation).
- Regulator Liaison tracks upstream political risk quarterly.
Residual risk. Medium.
CBC-RISK-16 — Cross-border cell-broadcast
Scenario. An MNO cell near the border reaches foreign subscribers, and the cross-border broadcast could trigger a diplomatic issue.
Impact. Reputation / diplomatic.
Mitigation.
- Polygon validation rejects targets beyond national boundary + 10 km buffer.
- Border cells flagged in cbc.mno_cell_database; broadcasts honour national-scope restriction.
- Legal briefing with the Ministry of Foreign Affairs.
Residual risk. Low.
CBC-RISK-17 — False-authenticity erodes public trust
Scenario. Citizens receive drills that are not clearly labelled, or emergency broadcasts with no apparent official source, and trust in the system erodes.
Impact. Reduced effectiveness of future emergency broadcasts.
Mitigation.
- Drill labelling is mandatory + enforced at domain level.
- Every broadcast includes official issuer in body (e.g., "— NDMA" suffix).
- Public awareness campaign before Phase 2 (MedComms + PR).
- Public test channel exposes drill history for citizen verification.
- Annual citizen-survey on trust level.
Residual risk. Low.
3. Residual-Risk Summary
| Residual | Count | Acceptance |
|---|---|---|
| Low | 11 | Accepted for GA |
| Medium | 6 | Accepted with mitigation commitments and named owners |
| High | 0 | — |
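The counts above can be re-derived from the Risk Summary table, and the GA rule (residual ≤ Medium) checked mechanically. The sketch below copies the residual column from Section 1; the `ga_ready` helper and the ordering of bands are assumptions for illustration.

```python
from collections import Counter

# Residual classification per risk, copied from the Risk Summary table.
RESIDUAL = {
    "CBC-RISK-01": "Medium", "CBC-RISK-02": "Medium", "CBC-RISK-03": "Low",
    "CBC-RISK-04": "Low", "CBC-RISK-05": "Low", "CBC-RISK-06": "Low",
    "CBC-RISK-07": "Low", "CBC-RISK-08": "Low", "CBC-RISK-09": "Medium",
    "CBC-RISK-10": "Low", "CBC-RISK-11": "Low", "CBC-RISK-12": "Low",
    "CBC-RISK-13": "Medium", "CBC-RISK-14": "Medium", "CBC-RISK-15": "Medium",
    "CBC-RISK-16": "Low", "CBC-RISK-17": "Low",
}

BANDS = ["Low", "Medium", "High", "Critical"]  # assumed ordering, least to most severe


def ga_ready(residuals: dict, ceiling: str = "Medium") -> bool:
    """True if every residual classification is at or below the ceiling."""
    limit = BANDS.index(ceiling)
    return all(BANDS.index(r) <= limit for r in residuals.values())


counts = Counter(RESIDUAL.values())
```

Running the same check in CI against the table source would catch a summary that drifts from the per-risk entries.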
4. Risk Review Cadence
- Weekly during development (Platform Architecture).
- Monthly post-GA (Government / Emergency + SRE + Security).
- Quarterly (Regulator Liaison + Legal + CTO + CISO) — includes political-risk review (CBC-RISK-01, CBC-RISK-15).
- Annual (CEO-chaired) — regulator / MNO partnership posture.