Facility Service — Risk Register
Status: populated | Owner: TBD | Last updated: 2026-04-17 | Companion: SERVICE_TEMPLATE §16
1. Register
| ID | Risk | Likelihood | Impact | Owner | Mitigation | Residual |
|---|---|---|---|---|---|---|
| R1 | Cycle-detection performance regression on deep graphs slows cycle checks and leaves the cache stale | Medium | High | Facility lead | Recursive-CTE micro-benchmark in CI; perf regression gate; depth cap of 15; alerting | Low |
| R2 | Race on bed status allows two patients to be assigned to one bed | Low | Critical (clinical safety) | Clinical ops + Facility | Serializable transaction; domain invariant; E2E test; incident drill | Very low |
| R3 | Profile schema drift invalidates existing nodes mid-operations | Low | High | Platform admin | Profile updates non-retroactive; dry-run; review workflow | Low |
| R4 | Context cache divergence from DB after NATS outage | Medium | Medium | SRE + Facility | TTL cap 5min; event-driven invalidation; cache recomputation on miss | Low |
| R5 | Cross-tenant data leak via a bug in RLS or an incorrectly set `app.tenant_id` | Low | Critical | Security | Mandatory tenant-isolation integration test; property-based tests; DB role restrictions | Very low |
| R6 | Large subtree response (>10 MB) causes OOM on edge clients | Low | Medium | Facility | maxDepth default; pagination on subtree; response size cap | Low |
| R7 | Cross-region federation decisions are delayed while tenants need shared regional nodes earlier than M3 | Medium | Medium | Product | Document workaround (per-tenant sub-hierarchy); schedule architecture spike | Medium |
| R8 | FHIR projection lag causes interop partners to see stale data | Medium | Medium | Interop | Outbox + retry; health-check SLO; manual replay runbook | Low |
| R9 | Licensing / access-policy coupling introduces hot-path latency | Medium | High | Platform | Timeout budgets; circuit breaker; cache access decisions within request | Medium |
| R10 | Import of malformed facility definition corrupts tenant state | Low | High | Facility | Mandatory dry-run; schema validation; transactional apply | Very low |
| R11 | Bed housekeeping state machine drifts across services (inpatient, housekeeping, scheduling) | Medium | Medium | Domain | State machine ownership lives only in facility-service; others emit commands | Low |
| R12 | DAG depth exceeded by a novel deployment (e.g., 20-level federated hierarchy) | Low | Medium | Product | Configurable maxDepth; performance budget | Low |
| R13 | Post-merge module-mapping incorrectly routes legacy FR refs | Low | Low | Docs | Legacy-ref column in EPICS/USER_STORIES; migration-plan mapping | Very low |
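
The three R4 mitigations (TTL cap, event-driven invalidation, recomputation on miss) work together: invalidation events keep the cache fresh in normal operation, and the TTL cap bounds staleness when events are lost, for example during a NATS outage. The following is a minimal in-memory sketch of that combination, not the service's actual implementation. The names `ContextCache`, `load`, `invalidate`, and `clock` are hypothetical, and the NATS subscription that would call `invalidate` is omitted.

```python
import time
from typing import Callable, Dict, Generic, Hashable, Tuple, TypeVar

V = TypeVar("V")

TTL_CAP_SECONDS = 300.0  # the 5-minute cap named in R4


class ContextCache(Generic[V]):
    """Hypothetical cache illustrating the R4 mitigations (assumed design)."""

    def __init__(
        self,
        load: Callable[[Hashable], V],
        ttl_seconds: float = TTL_CAP_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load  # recomputes a value from the source of truth (the DB)
        # TTL cap: a configured TTL above the cap is clamped down to it.
        self._ttl = min(ttl_seconds, TTL_CAP_SECONDS)
        self._clock = clock  # injectable so tests can control time
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[Hashable, Tuple[float, V]] = {}

    def get(self, key: Hashable) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        # Miss or expired entry: recompute and store with a new expiry.
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        # Event-driven invalidation: an event handler would call this.
        # Dropping the entry forces the next get() to recompute.
        self._entries.pop(key, None)
```

If an invalidation event never arrives, the entry still expires after at most five minutes, which is what limits divergence from the DB to a bounded window.
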
2. Review cadence
- Weekly during M0–M1.
- Monthly in steady state.
- Ad hoc after every major incident, to re-evaluate residual risk.