| F1 | Primary provider error spike | aigw_provider_error_total | Slower response; fallback activated | Circuit open → fallback provider; emit provider.degraded | provider-outage.md |
| F2 | All providers down | Both circuits open | 503 AI_PROVIDER_UNAVAILABLE | Page P1; vendor status check; manual failover to on-prem | provider-outage.md |
| F3 | Policy service timeout | aigw_policy_latency_ms p99 breach | 403 (fail-closed) | Optional short-lived cache of allow decisions (off by default); investigate policy service | policy-degraded.md |
| F4 | Moderation classifier offline | Health check red | 422 on all assists (default: fail-closed; fail-open remains an open question); emit assist.failed | Run local fallback classifier; escalate | moderation-degraded.md |
| F5 | Redis unavailable | Quota calls error | Quota enforced conservatively in memory; degraded mode | Switch to per-instance quota; recover Redis; alert | cache-outage.md |
| F6 | Postgres slow / down | integration test + p99 breach | Assist 5xx | HA failover; read-only admin queries | |
| F7 | NATS partition | publish errors | Outbox backs up; assist still succeeds | Outbox relay resumes; DLQ monitored | nats-partition.md |
| F8 | HITL queue backlog | aigw_hitl_queue_depth | Drafts wait; owning module cannot finalise | Notify lead reviewer; escalate to supervisor role; consider auto-rejecting drafts older than N days | hitl-backlog.md |
| F9 | Prompt injection detected | moderation flag | 422 at call site | Block, emit event, log template hash and feature; add corpus sample | |
| F10 | Provider output leaks PHI | Post-moderation block | Output suppressed; 200 with null draft | Emit ai.moderation.flagged.v1 with stage=output; manual triage by reviewer | |
| F11 | Clock skew | provenance requestedAt > completedAt | Invariant violation | Monotonic clock in adapter; health check | |
| F12 | Schema drift (event) | Contract test failure | Consumers fail | Roll back schema change; publish additive changes as .v2 | |
| F13 | Consent lookup failure | ABAC deny (consent missing) | 403 AI_CONSENT_REQUIRED | Reviewer checks consent module; restore consent DB | |
| F14 | Quota misconfigured | Sudden 429 spike | Users blocked | Config rollback via config-service; quota override endpoint | |
| F15 | Reviewer over-privileged | manual audit | Accidental acceptance | Quarterly role review; split reviewer/approver when scaled | |
| F16 | Circular saga (assist → finalise → assist) | Trace loop detection | Resource exhaustion | Assist denies requests when the X-AI-Originator header is present | |
| F17 | Audit publish lost | Audit ingestion dedup gap | Compliance risk | DLQ replay; the copy in the provenance table is the source of truth | |
| F18 | KMS outage | encryption failures | Assist fails with null draft persistence | Fail open for metadata only; draft text not stored; emit assist.failed reason KMS_UNAVAILABLE | |
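
Rows F1 and F2 describe a two-step degradation: an open primary circuit routes to the fallback provider and emits `provider.degraded`, and once no provider can serve the request the caller sees 503 `AI_PROVIDER_UNAVAILABLE`. The following is a minimal sketch of that routing, not the gateway's actual implementation. The `Circuit` class, the `route` function, the `call` and `emit` callbacks, and the failure threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


class ProviderUnavailable(Exception):
    """Raised when no provider could serve the request (row F2)."""
    status = 503
    code = "AI_PROVIDER_UNAVAILABLE"


@dataclass
class Circuit:
    """Very small consecutive-failure circuit breaker (hypothetical)."""
    name: str
    failure_threshold: int = 3  # assumed value, not from the table
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.failure_threshold

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        self.failures = 0


def route(
    prompt: str,
    circuits: List[Circuit],            # ordered: primary first, then fallbacks
    call: Callable[[str, str], str],    # call(provider_name, prompt) -> draft
    emit: Callable[[str], None],        # emit(event_name)
) -> str:
    for index, circuit in enumerate(circuits):
        if circuit.is_open:
            continue  # skip providers whose circuit is open
        try:
            result = call(circuit.name, prompt)
        except Exception:
            circuit.record_failure()
            # Emit once, at the moment the primary circuit trips (row F1).
            if index == 0 and circuit.is_open:
                emit("provider.degraded")
            continue  # try the next provider for this request
        circuit.record_success()
        return result
    # Every provider was skipped or failed for this request.
    raise ProviderUnavailable()
```

In this sketch a request that fails on the primary is retried on the fallback immediately, even before the primary circuit opens, which matches the "slower response; fallback activated" impact in F1. A production breaker would also need a half-open state and a reset timer so that an open circuit can recover; both are omitted here for brevity.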