Medication Service — Failure Modes

Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template

1. Catalog

| ID | Failure | User impact | Detection | Mitigation |
| --- | --- | --- | --- | --- |
| MED-F-001 | Drug KB unavailable | Signing blocked (safety-critical) | medication_kb_check_total{outcome=error} > 2% | 503 returned; fallback to cached formulary for non-safety-critical checks; queue drafts; auto-resume on KB heartbeat |
| MED-F-002 | Postgres write latency spike | Sign/dispense slow or 5xx | p95 > 3s | Connection pool tuning; shed load via Kong rate-limit; failover to standby |
| MED-F-003 | Inventory decrement conflict (two dispenses same lot) | One fails with 409 | Metric medication_dispense_total{outcome=conflict} | Version-lock retry once; if still conflict, user picks different lot |
| MED-F-004 | Negative stock observed | Data integrity issue | Monitor stock_items.quantity_on_hand < 0 | P1 alert; transaction review; likely replay bug; consumer isolation audit |
| MED-F-005 | Outbox stall | Downstream (billing, gateway) out of sync | medication_outbox_undelivered growing | Restart relay; check NATS connectivity; manual replay via outbox.replay CLI |
| MED-F-006 | Gateway POST MR/MD failure | External pharmacies don't receive Rx/dispense | HTTP 5xx from gateway, outbox retries increasing | Exponential backoff (max 5 retries, 2h ceiling); DLQ after; pharmacist notified to transmit manually |
| MED-F-007 | Gateway inbound event duplicate | Risk of duplicate dispenses | Inbox dedup metric | Inbox table prevents; alert if duplicate count > baseline |
| MED-F-008 | NCPDP SCRIPT endpoint down (optional) | External e-Rx fails | Adapter error rate | Retained in adapter queue; 3 retries + prescriber notification |
| MED-F-009 | Counter-sign bottleneck (one licensed pharmacist) | CS dispense delayed | Dispense queue isControlled=true age | Coverage rota alert; supervisor escalation policy |
| MED-F-010 | Offline portal 4h limit exceeded | Queue becomes stale | Client telemetry offline duration | Banner warning at 3h; reject new dispense entries at 4h; user must reconnect |
| MED-F-011 | Sync conflict on dispense (offline → online) | Dispense rejected if stock unavailable | Client conflict event | Server-wins; user requeues; idempotency prevents duplicate |
| MED-F-012 | RxNorm/terminology lookup failure | Drafting blocked for coded med | Term service 5xx rate | Allow free-text drug name with requiresReview=true flag per BR-MEDS |
| MED-F-013 | Controlled-substance MFA step-up failure | Prescriber cannot sign CS | Identity MFA challenge | Fallback: prescriber re-auths; retry sign; audit every failed step-up |
| MED-F-014 | Mass recall event | Spike in recall-triggered + dispense blocks | medication_inventory_recall_total | Batch-process recall lot list; block dispense; notify affected patients via communication-service |
| MED-F-015 | Drug-KB snapshot mismatch during audit | Override record references missing version | CI audit | Retain KB snapshots indefinitely in S3; restore-on-demand |
| MED-F-016 | Gateway MR ingestion lag > 30s | Pharmacy queue stale | Consumer lag | Scale consumer replicas; investigate NATS stream partition |
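
As an illustration of the MED-F-006 retry policy, here is a minimal Python sketch of the delay schedule. The catalog fixes only the retry count (5) and the 2h ceiling. The 15-minute base delay, the doubling factor, the reading of the ceiling as a cap on each individual delay, and the name retry_delay are all assumptions made for this example, not the service's actual configuration.

```python
from typing import Optional

MAX_RETRIES = 5              # from MED-F-006
CEILING_SECONDS = 2 * 3600   # from MED-F-006; treated here as a per-delay cap (assumption)
BASE_SECONDS = 15 * 60       # hypothetical base delay, not stated in the catalog


def retry_delay(attempt: int) -> Optional[int]:
    """Return the wait in seconds before retry number `attempt` (1-based).

    Returns None once the retry budget is spent, meaning the message
    should go to the DLQ instead of being retried again.
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if attempt > MAX_RETRIES:
        return None
    # Double the delay on every attempt, but never exceed the ceiling.
    return min(BASE_SECONDS * 2 ** (attempt - 1), CEILING_SECONDS)


if __name__ == "__main__":
    for n in range(1, MAX_RETRIES + 2):
        print(n, retry_delay(n))
```

With these assumed values the delays are 15m, 30m, 1h, 2h, and 2h; a sixth failure returns None, which is the point at which the catalog routes the message to the DLQ and the pharmacist is notified to transmit manually.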

2. Safety-Critical Fail-Closed Behaviors

  • Signing a prescription never proceeds unless the KB check and the allergy check have both succeeded, or an override has been recorded. If the checks time out, the sign is refused with a 503 and retry guidance.
  • A dispense never proceeds against an expired or recalled lot. The system blocks it even when the pharmacist intends to override: the only emergency override is for insufficient stock, never for expired or recalled lots.
  • A Schedule II dispense never proceeds without a counter-sign.
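
The three fail-closed rules above can be sketched as two pure decision functions. This is an illustrative Python sketch only: the type and function names are hypothetical, a lot is assumed to be usable through its expiry date, and a timed-out check is assumed to refuse the sign even when an override is recorded (the text does not say how the two interact).

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Check(Enum):
    PASSED = "passed"
    FAILED = "failed"
    TIMED_OUT = "timed_out"


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str
    status: Optional[int] = None  # set only where the document names a status


def decide_sign(kb: Check, allergy: Check, override_recorded: bool) -> Decision:
    checks = (kb, allergy)
    # Timeouts fail closed with 503 + retry guidance.
    # Assumption: an override cannot replace a check that never completed.
    if Check.TIMED_OUT in checks:
        return Decision(False, "check timed out; retry later", status=503)
    if all(c is Check.PASSED for c in checks) or override_recorded:
        return Decision(True, "ok")
    return Decision(False, "check failed and no override recorded")


@dataclass(frozen=True)
class Lot:
    expires_on: date
    recalled: bool
    quantity_on_hand: int


def decide_dispense(lot: Lot, quantity: int, today: date, *,
                    schedule_ii: bool = False,
                    counter_signed: bool = False,
                    emergency_override: bool = False) -> Decision:
    # Expired or recalled lots are blocked first, so no override can reach them.
    if lot.recalled or lot.expires_on < today:
        return Decision(False, "lot expired or recalled")
    if schedule_ii and not counter_signed:
        return Decision(False, "Schedule II requires counter-sign")
    # Insufficient stock is the only condition an emergency override may bypass.
    if quantity > lot.quantity_on_hand and not emergency_override:
        return Decision(False, "insufficient stock")
    return Decision(True, "ok")
```

Checking the expired/recalled condition before any override flag is read is the design point: the override parameter simply has no code path that could unblock those lots.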

3. Non-Safety-Critical Degrade-Open Behaviors

  • Reorder alert generation may lag up to 5 minutes with no patient impact.
  • Expiry alert batch may run once every 6h.
  • AI-advisory features disable silently when ai-gateway-service is unavailable.
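
For contrast with the fail-closed rules, a degrade-open call site might look like the following Python sketch. The helper name advisory_or_none and the fetch callable are hypothetical, and the choice to treat only network-level errors (OSError and its subclasses, such as ConnectionError and TimeoutError) as "unavailable" is an assumption.

```python
from typing import Callable, Optional


def advisory_or_none(fetch: Callable[[], str]) -> Optional[str]:
    """Return the AI advisory text, or None if the gateway cannot be reached.

    Degrade open: the caller continues its workflow without the advisory
    instead of surfacing an error to the user.
    """
    try:
        return fetch()
    except OSError:
        # Covers ConnectionError and TimeoutError. No error is shown to the
        # user; the feature is simply absent for this request.
        return None


if __name__ == "__main__":
    def gateway_down() -> str:
        raise ConnectionError("ai-gateway-service unreachable")

    print(advisory_or_none(lambda: "consider renal dose adjustment"))
    print(advisory_or_none(gateway_down))
```

Returning None rather than raising keeps the advisory strictly optional: nothing on the sign or dispense path depends on its presence.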