
Provider Directory Service — Failure Modes

Status: populated
Owner: TBD
Last updated: 2026-04-17

1. Catalog

| # | Failure | User impact | Detection | Mitigation |
|---|---------|-------------|-----------|------------|
| F1 | Postgres writer outage | Writes fail; reads degraded | /readyz | Failover; retry on client |
| F2 | OpenSearch outage | Fuzzy search degraded to prefix match | Health metric | Fallback to DB search; warn ops |
| F3 | Redis outage | Privilege checks hit DB | Cache miss 100% | Circuit break; reduce non-critical reads |
| F4 | NATS outage | Outbox backlog | outbox_lag_seconds | Wait, then replay |
| F5 | Credential expiry job silently fails | Providers continue with expired license | Job heartbeat missing | Heartbeat metric + alert; rerun manually |
| F6 | Endpoint healthcheck stuck | Partners see no updates | Success rate plummets | Replace probe pod; circuit-break caller |
| F7 | Privilege cascade bug: revoked credential does not demote roles | Clinical safety | Integration test + audit replay | Transactional cascade + regression test |
| F8 | Cross-tenant leak via search | Security incident | Mandatory isolation test | RLS; tenant-scoped index; alert on drift |
| F9 | FHIR projection lag | Partner staleness | Projection error rate | DLQ + manual replay |
| F10 | Duplicate creation race (same identifier) | Data-quality issue | Unique constraint violation | DB constraint; 409 to second caller |
| F11 | Credential PDF upload triggers PII retention policy | Legal | Review hook | Separate document-service bucket with encryption |
| F12 | Expired credential brown-out (clock skew) | False expired state | NTP alerts | NTP sync; tolerance of 60 s skew in scheduler |
| F13 | Terminology-service outage blocks writes | Onboarding halted | Timeout metrics | Soft-fail: accept with specialty.unverified flag; re-validate later |
| F14 | Endpoint with mTLS auth has expired cert | Partner auth fails | endpoint.health_changed.v1 to error | Alert on error transition; runbook |
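
The F12 and F13 mitigations in the catalog can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the service's actual code: `is_expired`, `validate_specialty`, `TerminologyUnavailable`, and the `lookup` callable are hypothetical names; only the 60 s tolerance and the `specialty.unverified` flag come from the catalog.

```python
from datetime import datetime, timedelta, timezone

# 60 s matches the scheduler skew tolerance listed for F12.
SKEW_TOLERANCE = timedelta(seconds=60)


def is_expired(expires_at: datetime, now: datetime) -> bool:
    """F12: treat a credential as expired only once `now` passes expiry plus tolerance."""
    return now > expires_at + SKEW_TOLERANCE


class TerminologyUnavailable(Exception):
    """Hypothetical error raised when the terminology service is down or times out."""


def validate_specialty(code: str, lookup) -> dict:
    """F13 soft-fail: accept the write and flag it when the lookup is unavailable.

    `lookup` is a hypothetical callable that returns True for a known code.
    """
    try:
        known = lookup(code)
    except TerminologyUnavailable:
        # Service is down: accept now, mark unverified, re-validate later.
        return {"code": code, "accepted": True, "specialty.unverified": True}
    return {"code": code, "accepted": known, "specialty.unverified": False}
```

With this shape, a scheduler whose clock runs up to a minute ahead does not flip a credential to expired early, and onboarding continues during a terminology outage while leaving a marker that a later re-validation pass can find.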

2. Blast Radius

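| Dep down | Reads | Writes | Downstream |
|----------|-------|--------|------------|
| Postgres | degraded | fail | All |
| OpenSearch | degraded (fallback) | ok | Search slow |
| Redis | ok | ok | Privilege check slower |
| NATS | ok | ok | Downstream staleness |
| Terminology | ok | soft-fail | New practitioner has unverified specialties |
| Access-policy | ok | fail | Writes blocked |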
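
The blast-radius table can also be encoded as data, so that the combined effect of several simultaneous outages is computed rather than read off by hand. The sketch below is illustrative only: the `SEVERITY` ordering and the `combined_impact` helper are assumptions, not something the table specifies, and the OpenSearch "degraded (fallback)" entry is simplified to "degraded".

```python
# Rows copied from the blast-radius table: dependency -> (reads, writes).
BLAST_RADIUS = {
    "Postgres": ("degraded", "fail"),
    "OpenSearch": ("degraded", "ok"),  # "degraded (fallback)" in the table
    "Redis": ("ok", "ok"),
    "NATS": ("ok", "ok"),
    "Terminology": ("ok", "soft-fail"),
    "Access-policy": ("ok", "fail"),
}

# Assumed ordering for combining outages; the source does not rank these states.
SEVERITY = {"ok": 0, "soft-fail": 1, "degraded": 2, "fail": 3}


def combined_impact(down: set[str]) -> dict[str, str]:
    """Return the worst reads/writes status across all dependencies that are down."""
    reads, writes = "ok", "ok"
    for dep in down:
        r, w = BLAST_RADIUS[dep]
        reads = max(reads, r, key=SEVERITY.__getitem__)
        writes = max(writes, w, key=SEVERITY.__getitem__)
    return {"reads": reads, "writes": writes}
```

For example, `combined_impact({"Redis", "Terminology"})` yields ok reads and soft-fail writes, while adding `"Access-policy"` to the set moves writes to fail.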