
Virtual Care Service — Failure Modes

Status: populated | Owner: TBD | Last updated: 2026-04-18 | Companion: Service Template · 03 platform-services

1. Failure Catalog

| # | Failure | User / System Impact | Detection | Mitigation |
|---|---|---|---|---|
| F01 | Video backend unreachable at session create | Session creation fails; clinician cannot start visit | HTTP 503 from Jitsi health check adapter; VIDEO_PROVIDER_UNAVAILABLE | No session row created; return 503 with diagnostic; clinician may use async visit fallback; ops alert via vcare_backend_health_check_success |
| F02 | Video backend disconnects mid-session (Jitsi server crash) | Participants lose video/audio | Participant disconnect webhook or polling; session status monitor | 60s grace reconnect window; if reconnect fails → transition to failed → notify participants → InitiateFallbackUseCase opens messaging thread |
| F03 | Join token used after expiry | Participant cannot join | 401 TOKEN_EXPIRED from token validator | Return error with link to refresh token; participant calls /join-token for new token |
| F04 | Optimistic lock conflict on session mutation | Second client gets 409 | 409 OPTIMISTIC_LOCK_CONFLICT with currentVersion | Client refetches session and retries; UI handles gracefully |
| F05 | FHIR Gateway unavailable at session end | Encounter not created; billing event delayed | HTTP 5xx from FHIR gateway adapter | Session marked ended with encounterId: null; retry job reconciles within 5 min; alert if unreconciled after 30 min |
| F06 | Recording reference fails to store (Jibri failure) | Recording not available post-session | Jibri webhook failure; recordingRef: null | Session still ends normally; recordingRef null; audit warning; do not block session end |
| F07 | Consent service unavailable at session create | Session creation blocked (fail-closed) | HTTP 5xx from access-policy adapter | Return 503 CONSENT_SERVICE_UNAVAILABLE; do not bypass consent gate; ops alert |
| F08 | Double-end call (race condition) | Concurrent end requests | FSM idempotency on ended state | Second end call on already-ended session returns 200 with current state (idempotent) |
| F09 | Patient network drop during session (intermittent Afghan 3G) | Patient disconnects; session continues for provider | Participant disconnected status via backend webhook | Grace reconnect window applies; if patient reconnects within 60s, session continues; if not, provider can admit again from waiting room or initiate fallback |
| F10 | External provider quota exceeded (Zoom/Webex) | Session creation fails for premium backends | HTTP 503 PROVIDER_QUOTA_EXCEEDED | Return 503; if fallbackVideoBackend configured, auto-retry with secondary; alert ops |
| F11 | Async visit duplicate submission (offline client retries) | Potential duplicate records | clientMutationId unique constraint | DB unique constraint on (tenant_id, client_mutation_id) returns original record; idempotent |
| F12 | NATS JetStream unavailable | Events not published; downstream services miss lifecycle events | Outbox lag alert (> 5 min) | Transactional outbox accumulates events; redelivered when NATS recovers; no event loss |
| F13 | KMS unavailable (cannot sign join token) | Join token issuance fails | KMS API timeout | Return 503; session exists but participants cannot join; ops alert; KMS HA required |
| F14 | AI gateway unavailable during session | STT and summary drafts unavailable | HTTP 5xx from ai-gateway | Degrade gracefully: notify clinician; session continues; manual documentation available |
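
F04 and F08 interact: a second end call usually carries a stale version, so the order of the two checks matters. The sketch below shows one way they could fit together, under stated assumptions. The `Session` class, the `end_session` function, the `(status, body)` return shape, and the choice to run the idempotency check before the version check are all illustrative and not taken from the service code; only the 409 OPTIMISTIC_LOCK_CONFLICT code, the currentVersion field, and the idempotent 200 come from the catalog.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical in-memory stand-in for a session row."""
    id: str
    status: str = "active"
    version: int = 1


def end_session(session: Session, expected_version: int) -> tuple[int, dict]:
    """Return (http_status, body) for an end-session request."""
    # F08: ending an already-ended session is idempotent and returns the
    # current state. Assumption: this check runs before the version check,
    # otherwise the second caller would see a 409 instead of a 200.
    if session.status == "ended":
        return 200, {"status": session.status, "version": session.version}

    # F04: a stale version means another client mutated the session first.
    if expected_version != session.version:
        return 409, {
            "error": "OPTIMISTIC_LOCK_CONFLICT",
            "currentVersion": session.version,
        }

    # Normal path: apply the transition and bump the version.
    session.status = "ended"
    session.version += 1
    return 200, {"status": session.status, "version": session.version}
```

A client that receives the 409 would refetch the session, read currentVersion, and retry, as the F04 mitigation describes. In a real service the version comparison would more likely be done atomically in the database (for example as a conditional UPDATE) than in application memory.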

2. Degraded Mode Behavior

| Condition | Degraded Behavior |
|---|---|
| Video backend unhealthy | No new sessions created; existing sessions continue until the backend recovers or they fail |
| FHIR Gateway slow | Session ends normally; Encounter creation retried asynchronously |
| AI gateway offline | STT/summary disabled; session proceeds with manual documentation |
| NATS unavailable | Outbox accumulates; reads/writes continue; downstream consumers delayed |
| Offline (async visit) | Async visit content queued on device; submitted when connectivity is restored |
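
The "NATS unavailable" row relies on the transactional outbox from F12. A minimal sketch of that pattern follows, using an in-memory SQLite database in place of the service's real store. The table layout, the function names, the `session.ended` subject, and the `publish` callable are hypothetical; no real NATS client is used here, and only the 5-minute lag threshold comes from the catalog.

```python
import json
import sqlite3
from typing import Callable

OUTBOX_LAG_ALERT_S = 5 * 60  # F12: alert when outbox lag exceeds 5 min


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " subject TEXT NOT NULL, payload TEXT NOT NULL,"
        " created_at REAL NOT NULL, published_at REAL)"
    )
    return db


def record_session_ended(db: sqlite3.Connection, session_id: str, now: float) -> None:
    # The state change and the event row commit in ONE transaction, so an
    # event exists if and only if the write it describes was persisted.
    with db:
        db.execute(
            "INSERT OR REPLACE INTO sessions (id, status) VALUES (?, 'ended')",
            (session_id,),
        )
        db.execute(
            "INSERT INTO outbox (subject, payload, created_at) VALUES (?, ?, ?)",
            ("session.ended", json.dumps({"sessionId": session_id}), now),
        )


def drain(db: sqlite3.Connection, publish: Callable[[str, str], None], now: float) -> int:
    """Publish pending events in insertion order; return how many were sent."""
    rows = db.execute(
        "SELECT id, subject, payload FROM outbox"
        " WHERE published_at IS NULL ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, subject, payload in rows:
        try:
            publish(subject, payload)
        except ConnectionError:
            break  # broker still down: leave the rest queued and retry later
        with db:
            db.execute("UPDATE outbox SET published_at = ? WHERE id = ?", (now, row_id))
        sent += 1
    return sent


def outbox_lag_seconds(db: sqlite3.Connection, now: float) -> float:
    """Age of the oldest unpublished event, or 0.0 if the outbox is empty."""
    oldest = db.execute(
        "SELECT MIN(created_at) FROM outbox WHERE published_at IS NULL"
    ).fetchone()[0]
    return 0.0 if oldest is None else now - oldest
```

Because a row is marked published only after `publish` returns, a crash between the two steps causes a redelivery, not a loss. Delivery is therefore at-least-once under this sketch, and downstream consumers would need to tolerate duplicates.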

3. Bandwidth Fallback Thresholds (Configurable)

| Tier | Trigger condition | Action |
|---|---|---|
| Video → Audio-only | Estimated bandwidth < 512 kbps sustained | Switch to audio-only mode in Jitsi |
| Audio-only → Async text | No stable audio for 30s OR provider manually initiates fallback | InitiateFallbackUseCase; open messaging thread; emit fallback.initiated event |
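
The two tiers above can be read as a small state machine driven by periodic quality samples. The sketch below is one possible reading, not the service's implementation. `FallbackMonitor`, `FallbackConfig`, and `observe` are hypothetical names, and the 10-second window used for "sustained" is an assumption, since the table does not define it; the 512 kbps and 30 s values come from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    VIDEO = "video"
    AUDIO_ONLY = "audio_only"
    ASYNC_TEXT = "async_text"


@dataclass(frozen=True)
class FallbackConfig:
    min_video_kbps: int = 512                # from the table
    sustained_low_bandwidth_s: float = 10.0  # ASSUMPTION: "sustained" is undefined
    unstable_audio_s: float = 30.0           # from the table


class FallbackMonitor:
    """Tracks how long each trigger condition has held and steps down tiers."""

    def __init__(self, cfg: FallbackConfig = FallbackConfig()) -> None:
        self.cfg = cfg
        self.tier = Tier.VIDEO
        self._low_since: Optional[float] = None
        self._unstable_since: Optional[float] = None

    def observe(self, now: float, kbps: float, audio_stable: bool,
                manual_fallback: bool = False) -> Tier:
        if self.tier is Tier.VIDEO:
            if kbps < self.cfg.min_video_kbps:
                if self._low_since is None:
                    self._low_since = now
                if now - self._low_since >= self.cfg.sustained_low_bandwidth_s:
                    self.tier = Tier.AUDIO_ONLY
                    self._unstable_since = None
            else:
                self._low_since = None  # bandwidth recovered: restart the window
        elif self.tier is Tier.AUDIO_ONLY:
            if audio_stable:
                self._unstable_since = None
            elif self._unstable_since is None:
                self._unstable_since = now
            unstable_for = (
                0.0 if self._unstable_since is None else now - self._unstable_since
            )
            # Per the table, manual fallback is honored in the audio-only tier.
            if manual_fallback or unstable_for >= self.cfg.unstable_audio_s:
                self.tier = Tier.ASYNC_TEXT
        return self.tier
```

In this sketch the caller would invoke InitiateFallbackUseCase and emit fallback.initiated when `observe` first returns `Tier.ASYNC_TEXT`. The monitor only steps down; upgrading back to video after recovery is not covered by the table and is left out.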