
Immunizations Service — Failure Modes

Status: populated
Owner: TBD
Last updated: 2026-04-18
Companion: Service Template

Failure Mode Register

| ID | Failure | Detection | Mitigation | Recovery |
| --- | --- | --- | --- | --- |
| FM-IMM-01 | PostgreSQL unavailable | Readiness probe fails; 503 on all requests | Connection pool retry (3×, exponential backoff); health probe gates traffic | Restore DB; pod restarts; connection pool reconnects automatically |
| FM-IMM-02 | NATS JetStream unavailable | Outbox relay logs errors; IMMUNIZATIONS_OUTBOX_DEPTH_HIGH alert fires | Outbox pattern buffers events in DB; no event loss during NATS downtime | Restore NATS; outbox relay drains backlog |
| FM-IMM-03 | Redis unavailable | BullMQ logs connection errors; forecast refresh queue stalls | Forecast refresh falls back to synchronous execution for new records (degraded mode) | Restore Redis; queue workers reconnect; backlog drains |
| FM-IMM-04 | EPI schedule service unavailable | GET /health/startup fails if schedule not loaded; forecast refresh errors | EPI schedule cached in-memory at startup; TTL 24 hours | Serve from cache; alert if cache TTL expires without refresh |
| FM-IMM-05 | Forecast refresh worker crash | BullMQ job stuck in active state; FORECAST_STALENESS alert fires | Job TTL causes re-queue after 60s; max 3 retries | Restart worker pod; BullMQ retries incomplete jobs |
| FM-IMM-06 | National registry unreachable | Registry sync job status failed after retries; alert fires | Sync retried with exponential backoff (max 5 retries, ~4h total); local recording unaffected | Registry restored; manual sync trigger available via admin API |
| FM-IMM-07 | Outbox relay stuck | IMMUNIZATIONS_OUTBOX_DEPTH_HIGH alert; events not reaching downstream | At-least-once relay; relay worker restart resolves; event consumers are idempotent | Restart relay worker; events delivered at-least-once |
| FM-IMM-08 | Patient deceased flag not received | Forecast still generated for deceased patient | Subscribe to REGISTRATION.patient.vital-status-changed; flag processed asynchronously | Once event arrives, forecast suppressed and defaulter outreach halted |
| FM-IMM-09 | Duplicate immunization record | Two offline devices record same dose for same patient | clientMutationId deduplication; second submission returns 409 | Vaccination officer reviews; correction endpoint available |
| FM-IMM-10 | Wrong patient assigned to record | Clinical error during recording | Correction endpoint (PUT /v1/immunizations/:id/correction) marks original entered-in-error; new record created for correct patient | Audit trail preserved; clinical review required |
| FM-IMM-11 | Coverage materialized view stale | Coverage dashboard shows old data | View refreshed hourly by cron and event-driven on record creation | Cron catchup; manual REFRESH MATERIALIZED VIEW CONCURRENTLY available |
| FM-IMM-12 | Contraindication check bypassed | Overridden without CLINICIAN role | Role guard enforced on override field; audit log captures bypass attempt | Revoke token; review audit log; correct record if needed |
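
As an illustration of the FM-IMM-04 mitigation, the following is a minimal sketch of an in-memory cache with a 24-hour TTL. The class name `ScheduleCache`, its methods, and the injected clock are hypothetical and not taken from the service. The sketch also assumes that a stale schedule keeps being served while the staleness is reported for alerting, which is one reading of "serve from cache; alert if cache TTL expires without refresh".

```typescript
const EPI_SCHEDULE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours, per FM-IMM-04

// Hypothetical cache wrapper. The clock is injected so that expiry
// can be exercised deterministically instead of waiting 24 hours.
class ScheduleCache<T> {
  private value: T | undefined = undefined;
  private loadedAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Called at startup and on every successful refresh.
  load(value: T): void {
    this.value = value;
    this.loadedAt = this.now();
  }

  // Assumption: the cached schedule is still served once stale,
  // so forecasting keeps working while the schedule service is down.
  get(): T | undefined {
    return this.value;
  }

  // True when the TTL has elapsed without a refresh; a caller would raise the alert.
  isStale(): boolean {
    return this.value !== undefined && this.now() - this.loadedAt >= this.ttlMs;
  }
}
```

A periodic check could call `isStale()` and fire the alert, while request handlers continue to call `get()`.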
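
The retry budget in FM-IMM-06 ("max 5 retries, ~4h total") can be checked with a short calculation. The sketch below assumes a doubling factor and an 8-minute base delay; neither value is stated in the register, and they are chosen only because they produce a total close to four hours. The helper `backoffScheduleMinutes` is hypothetical.

```typescript
// Hypothetical helper: returns the wait, in minutes, before each retry attempt.
function backoffScheduleMinutes(
  baseMinutes: number,
  maxRetries: number,
  factor: number = 2
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Delay grows geometrically: base, base*factor, base*factor^2, ...
    delays.push(baseMinutes * Math.pow(factor, attempt));
  }
  return delays;
}

// Assumed base delay of 8 minutes with 5 retries (the retry count is from FM-IMM-06).
const registrySyncDelays = backoffScheduleMinutes(8, 5); // 8, 16, 32, 64, 128
const registrySyncTotalMinutes = registrySyncDelays.reduce((sum, d) => sum + d, 0); // 248
```

Under these assumptions the five waits sum to 248 minutes, roughly 4 hours 8 minutes. The same helper with a retry count of 3 would describe the connection pool retry in FM-IMM-01, whose base delay is likewise not specified here.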
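
The deduplication behaviour in FM-IMM-09 can be sketched as follows. This is an in-memory stand-in, not the service's implementation: the class `MutationDeduplicator` is hypothetical, a real deployment would more plausibly rely on a unique database constraint, and the 201 status for the first submission is an assumption (the register only states that the second submission returns 409).

```typescript
type SubmitResult = { status: 201 | 409; recordId: string };

// Hypothetical in-memory stand-in for a uniqueness check on clientMutationId.
class MutationDeduplicator {
  private readonly seen = new Map<string, string>();

  submit(clientMutationId: string, recordId: string): SubmitResult {
    const existing = this.seen.get(clientMutationId);
    if (existing !== undefined) {
      // Duplicate submission: report a conflict and point at the stored record.
      return { status: 409, recordId: existing };
    }
    this.seen.set(clientMutationId, recordId);
    return { status: 201, recordId }; // 201 is assumed, not taken from the register
  }
}

const dedup = new MutationDeduplicator();
const firstSubmission = dedup.submit("mutation-123", "imm-A");
const secondSubmission = dedup.submit("mutation-123", "imm-B");
```

Here `secondSubmission` carries status 409 and the identifier of the record created by the first submission, which gives the vaccination officer a concrete record to review.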
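
For FM-IMM-12, a role guard on the override field might look like the sketch below. The request shape, the field name `overrideContraindication`, and the audit entry format are all hypothetical; only the CLINICIAN role and the requirement that bypass attempts are captured in the audit log come from the register.

```typescript
type AuditEntry = { userId: string; action: string; allowed: boolean };

// Hypothetical request shape; field names are illustrative only.
interface OverrideRequest {
  userId: string;
  roles: string[];
  overrideContraindication: boolean;
}

const REQUIRED_ROLE = "CLINICIAN";

// Returns true when the request may proceed. Every override attempt is
// written to the audit log, whether or not it is allowed.
function checkOverride(req: OverrideRequest, auditLog: AuditEntry[]): boolean {
  if (!req.overrideContraindication) {
    return true; // no override requested, nothing to guard
  }
  const allowed = req.roles.some((role) => role === REQUIRED_ROLE);
  auditLog.push({ userId: req.userId, action: "contraindication-override", allowed });
  return allowed;
}
```

Logging denied attempts as well as permitted ones is what makes the "review audit log" recovery step workable after a token has been revoked.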