Interop Service — Failure Modes
Status: populated
Owner: TBD
Last updated: 2026-04-18
Companion: Service Template · 03 platform-services · 02 DDD
1. Failure Catalog
| # | Failure | User impact | Detection | Mitigation |
|---|---|---|---|---|
| F-01 | Owning service unreachable (FHIR routing) | FHIR reads/writes fail for that resource type | 5xx spike per owning service; alert P2 | Return 503 OperationOutcome; circuit breaker per service; alert SRE |
| F-02 | PostgreSQL unavailable | All API calls fail; HL7 ACKs are not sent, so messages are lost if the MLLP connection drops | Health probe fails | Fail over to replica; MLLP reconnect after DB recovery; alert P1 |
| F-03 | NATS unavailable | Outbound HL7 events not triggered; interop events not published | Outbox lag alert | Outbox accumulates; events replayed on reconnect |
| F-04 | MLLP client cert expired (outbound) | Outbound HL7 v2 messages rejected by external system | TLS handshake error; connector error rate spike | Pre-rotation alert 30 days before expiry; hot-swap cert via connector update |
| F-05 | HL7 v2 parse error (unknown segment) | Raw message stored but not processed; routed to DLQ | interop_hl7_dead_lettered_total increases | Store raw message; alert integration admin; manual reprocess after mapping fix |
| F-06 | ABAC service unavailable | Patient-linked FHIR reads blocked | abac_check_failures spike; alert | Circuit breaker; deny by default when ABAC unavailable (fail-safe) |
| F-07 | Bulk export partial failure | Export NDJSON incomplete | Job status partial; error manifest | Client re-triggers export; partial files flagged in manifest |
| F-08 | Profile validation blocks a valid resource | External partner writes rejected unexpectedly | interop_profile_validation_failures | Configurable validation mode: error (block) vs warn (log only); admin override |
| F-09 | Redis cache miss (CapabilityStatement) | CapabilityStatement regenerated on every request | Latency spike on GET /metadata | Fall back to in-memory cached value; rebuild from routing table |
| F-10 | Connector port conflict (duplicate port) | MLLP listener fails to start | Startup error log | Unique port check on connector activation; error returned to admin |
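
The F-01 mitigation (a circuit breaker per owning service, returning a 503 OperationOutcome) can be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: `CircuitBreaker`, `FhirRouter`, `unavailable_outcome`, the failure threshold of 3, and the 30-second cooldown are all hypothetical names and values chosen for the example, and backends are modeled as plain callables that raise `ConnectionError` when unreachable.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows retries after `cooldown_s`."""
    threshold: int = 3          # assumed value, not from the catalog
    cooldown_s: float = 30.0    # assumed value, not from the catalog
    clock: Callable[[], float] = time.monotonic
    failures: int = 0
    opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the cooldown has elapsed, let calls through again as probes.
        return self.clock() - self.opened_at >= self.cooldown_s

    def record(self, ok: bool) -> None:
        if ok:
            self.failures, self.opened_at = 0, None
            return
        self.failures += 1
        if self.failures >= self.threshold:
            # Open (or re-open) the breaker and restart the cooldown.
            self.opened_at = self.clock()


def unavailable_outcome(service: str) -> Tuple[int, dict]:
    """Build the 503 response body as a FHIR OperationOutcome."""
    return 503, {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": "error",
            "code": "transient",
            "diagnostics": f"Owning service '{service}' is unavailable",
        }],
    }


class FhirRouter:
    """Routes reads to owning services, with one breaker per service."""

    def __init__(self, backends: Dict[str, Callable[[str], dict]],
                 clock: Callable[[], float] = time.monotonic):
        self.backends = backends
        self.breakers = {name: CircuitBreaker(clock=clock) for name in backends}

    def read(self, service: str, resource_id: str) -> Tuple[int, dict]:
        breaker = self.breakers[service]
        if not breaker.allow():
            # Breaker open: fail fast without calling the owning service.
            return unavailable_outcome(service)
        try:
            body = self.backends[service](resource_id)
        except ConnectionError:
            breaker.record(ok=False)
            return unavailable_outcome(service)
        breaker.record(ok=True)
        return 200, body
```

Keeping one breaker per owning service means an outage in one service only affects the resource types it owns; reads routed to healthy services continue to succeed. The same deny-on-failure shape would fit the F-06 fail-safe behavior, with the ABAC check in place of the backend call.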