Scheduling Service — Failure Modes
Status: populated | Owner: TBD | Last updated: 2026-04-17 | Companion: Service Template
1. Failure Catalog
| ID | Failure | User Impact | Detection | Mitigation |
|---|---|---|---|---|
| FM-SCHED-01 | PostgreSQL unavailable | All scheduling operations fail | Healthcheck readiness probe; alert | Replica promotion; retry on reconnect |
| FM-SCHED-02 | NATS unavailable | Bookings persist but no events published | Outbox lag metric alert | Outbox relay retries when NATS recovers |
| FM-SCHED-03 | Slot reservation race condition | Two users try to book the same slot simultaneously; one booking is rejected | Unique index constraint violation; the losing request receives HTTP 409 | Atomic slot status update via DB transaction + unique index |
| FM-SCHED-04 | Reminder dispatch failure | Patient misses appointment reminder | Reminder failure rate alert | Async retry up to 3×; dead-letter queue; patient can check portal |
| FM-SCHED-05 | registration-service events delayed | Deceased patient appointments not auto-cancelled promptly | Monitoring on event lag | Eventual consistency; staff UI shows deceased flag; manual cancellation |
| FM-SCHED-06 | HL7 SIU inbound parse failure | External system appointment not reflected | Dead-letter queue alert | Dead-letter queue; manual reprocessing runbook |
| FM-SCHED-07 | Clock skew causing timezone rendering error | Appointments displayed at wrong time | User report; monitoring on slot/appointment time deltas | Timezone stored as IANA string; all timestamps UTC; client renders using schedule timezone |
| FM-SCHED-08 | Waitlist auto-fill event published but patient not reachable | Patient misses waitlist offer | Reminder dispatch failure metric | Retry reminder; SLA window for patient response before auto-expiry |
| FM-SCHED-09 | Redis unavailable (reminder queue) | Scheduled reminders not dispatched | Redis health alert | Reminders delayed; catch-up dispatch on Redis restore |
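
The FM-SCHED-03 mitigation (unique index plus transaction, with the losing request mapped to 409) can be illustrated with a minimal sketch. This is not the service's actual code: SQLite stands in for PostgreSQL so the example runs anywhere, and the `appointments` table, its columns, the index name, and the `book_slot` helper are all hypothetical names chosen for illustration.

```python
import sqlite3

# Assumption: SQLite is a stand-in for PostgreSQL; the schema and all names
# below (appointments, slot_id, patient_id, book_slot) are hypothetical.


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointments ("
        " id INTEGER PRIMARY KEY,"
        " slot_id INTEGER NOT NULL,"
        " patient_id INTEGER NOT NULL)"
    )
    # The unique index is what rejects a second booking for the same slot,
    # regardless of how the two requests interleave.
    conn.execute(
        "CREATE UNIQUE INDEX uq_appointments_slot ON appointments (slot_id)"
    )
    return conn


def book_slot(conn: sqlite3.Connection, slot_id: int, patient_id: int) -> int:
    """Return an HTTP-style status: 201 if booked, 409 if the slot is taken."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO appointments (slot_id, patient_id) VALUES (?, ?)",
                (slot_id, patient_id),
            )
        return 201
    except sqlite3.IntegrityError:
        # Unique index violation: another booking already holds this slot.
        return 409


conn = open_db()
first = book_slot(conn, slot_id=42, patient_id=1001)
second = book_slot(conn, slot_id=42, patient_id=1002)
print(first, second)
```

The point of the design is that the database, not application-level checks, arbitrates the race: a "check then insert" sequence in application code can still let two requests through, whereas the constraint violation is raised for exactly one of them.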