AI Gateway Service — Service Overview
Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · 03 platform-services · 02 DDD
1. Purpose
The AI Gateway is the single controlled ingress for every AI/ML model call made by any Ghasi-eHealth service or client. It centralises policy enforcement, provider routing, moderation, human-in-the-loop (HITL) orchestration, provenance capture, and quota control so that no raw model API key ever reaches a browser, mobile client, or unreviewed service, and so that every clinical AI output can be traced to an auditable decision.
No AI output is allowed to flow to a patient chart, clinical note, order, portal response, or triage suggestion without (a) an AIProvenance record, (b) the policy/consent checks evaluated pre-inference, (c) moderation applied where required, and (d) HITL sign-off when the feature is classified as clinical-decision-adjacent.
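The four release conditions above can be read as a single gate. The following is a minimal sketch only: `AiOutputCandidate`, `checkRelease`, and every field name are hypothetical and are not taken from the service's actual schema.

```typescript
// Hypothetical shape; field names are illustrative assumptions.
interface AiOutputCandidate {
  provenanceId?: string;             // (a) reference to an AIProvenance record
  policyEvaluated: boolean;          // (b) policy/consent checks ran pre-inference
  moderationRequired: boolean;       // whether the feature requires moderation
  moderationApplied: boolean;        // (c) moderation actually ran
  clinicalDecisionAdjacent: boolean; // feature classification
  hitlSignedOff: boolean;            // (d) reviewer sign-off recorded
}

type ReleaseCheck =
  | { allowed: true }
  | { allowed: false; missing: string[] };

// Returns which of the four conditions are unmet; the output may flow
// to a clinical artifact only when none are missing.
function checkRelease(c: AiOutputCandidate): ReleaseCheck {
  const missing: string[] = [];
  if (!c.provenanceId) missing.push("provenance");
  if (!c.policyEvaluated) missing.push("policy");
  if (c.moderationRequired && !c.moderationApplied) missing.push("moderation");
  if (c.clinicalDecisionAdjacent && !c.hitlSignedOff) missing.push("hitl");
  return missing.length === 0 ? { allowed: true } : { allowed: false, missing };
}
```

Returning the list of unmet conditions, rather than a bare boolean, is a design choice of this sketch so that a denial can be audited.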
2. Bounded context
Bounded context: platform.ai_gateway. This service belongs to the platform plane (not the clinical plane) and sits between any caller (clinical, portal, virtual-care, immunizations, medication, interop, research) and any model provider (Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM / Ollama, local medical models).
3. Responsibilities
| # | Responsibility |
|---|---|
| R1 | Expose one authenticated HTTPS entry point (POST /v1/ai/assist, POST /v1/ai/moderate, POST /v1/ai/decisions/:id/review) for every AI use case. |
| R2 | Evaluate access policy, module entitlement, tenant quota, feature flag, and consent (for PHI-touching features) before any inference. Fail closed. |
| R3 | Route the request to the appropriate provider using a provider-selection matrix (tenant config, feature key, residency, DPIA status, fallback). |
| R4 | Apply pre- and post-moderation (safety classifiers, PHI minimisation, prompt-injection defenses). |
| R5 | Orchestrate HITL: persist the model output as an AIDecision in draft state, notify the reviewer queue, record the reviewer's accept or reject action, and emit ai.decision.accepted or ai.decision.rejected with provenance. |
| R6 | Stamp every output with AIProvenance (model, model version, provider, prompt template id, tenant, actor, correlation, timestamps, policy decision id, moderation outcome). |
| R7 | Emit structured events (ai.assist.requested/completed/failed, ai.decision.created/accepted/rejected, ai.moderation.flagged, ai.provider.degraded) for audit and observability. |
| R8 | Enforce per-tenant and per-feature rolling quotas; protect providers with circuit breakers and timeouts. |
| R9 | Redact raw prompts/transcripts from default logs and events (PHI-safe) unless a DPIA-approved retention policy is active. |
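As an illustration of the "fail closed" rule in R2, the sketch below runs the pre-inference checks in order and denies on the first check that denies or cannot be evaluated. The names `evaluatePreInference`, `PreInferenceCheck`, and `CheckName` are hypothetical. The checks are synchronous here for brevity; in the real service they would presumably be asynchronous calls to identity-service and config-service.

```typescript
// Hypothetical names throughout; a sketch of fail-closed evaluation, not the service's code.
type CheckName = "accessPolicy" | "entitlement" | "quota" | "featureFlag" | "consent";
type CheckResult = "allow" | "deny";

// A check may throw when its backing dependency is unavailable.
type PreInferenceCheck = () => CheckResult;

interface PreInferenceDecision {
  allowed: boolean;
  failedCheck?: CheckName;
  reason?: "denied" | "unavailable" | "no-checks";
}

function evaluatePreInference(
  checks: Array<[CheckName, PreInferenceCheck]>,
): PreInferenceDecision {
  // An empty list is treated as misconfiguration, so it also fails closed.
  if (checks.length === 0) return { allowed: false, reason: "no-checks" };
  for (const [name, check] of checks) {
    try {
      if (check() !== "allow") {
        return { allowed: false, failedCheck: name, reason: "denied" };
      }
    } catch {
      // Any error while evaluating a check is a deny, never an implicit allow.
      return { allowed: false, failedCheck: name, reason: "unavailable" };
    }
  }
  return { allowed: true };
}
```

In this sketch the caller includes the consent check only for PHI-touching features, matching the wording of R2.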
4. Non-responsibilities
| # | Not responsible for |
|---|---|
| NR1 | Persisting accepted AI content into a clinical artifact — the owning module (patient-chart, medication, orders, etc.) accepts the draft and writes the signed record. |
| NR2 | Training or fine-tuning models. Training pipelines live outside the runtime plane. |
| NR3 | Full retrieval-augmented generation (RAG). A thin context-fetch adapter is in scope; a production RAG index is out of scope (tracked in roadmap). |
| NR4 | Standalone ABAC evaluation — delegated to access-policy inside identity-service. |
| NR5 | FHIR resource CRUD — owned by interop-service and clinical services. |
5. Upstream / downstream dependencies
| Direction | Service | Purpose |
|---|---|---|
| Upstream (callers) | patient-chart-service | Clinical note summarisation, differential-diagnosis assist |
| Upstream | medication-service | Drug-interaction narrative, med-reconciliation assist |
| Upstream | radiology-service | Imaging pre-read assist (never autonomous) |
| Upstream | laboratory-service | Result narrative / critical-value explanation |
| Upstream | patient-portal-service | Patient-facing triage / symptom-checker assist |
| Upstream | virtual-care-service | Encounter summary, SOAP scaffold |
| Upstream | interop-service | Inbound document classification |
| Upstream | communication-service | Message-draft assistance |
| Downstream | identity-service | JWT validation, access-policy evaluation |
| Downstream | config-service | Feature flags, provider routing matrix, prompt template resolution |
| Downstream | audit-service | Ingests all ai.* events (tamper-evident trail) |
| Downstream | communication-service | Reviewer queue notifications for HITL |
| Downstream | External AI providers | Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM, local Ollama |
6. Slice involvement
| Slice | Involvement |
|---|---|
| S0 (platform foundation) | Mandatory — single AI ingress must exist before any feature uses AI |
| S1 (EHR core) | Required for clinical-notes summarisation, allergies reconciliation hints |
| S2 (portal + virtual care) | Required for triage, encounter summary |
| S3 (population health / research) | Required for cohort-explanation assist; de-identified data plane |
7. Key architectural decisions
| ADR | Decision | Rationale |
|---|---|---|
| ADR-AIGW-01 | Single ingress via NestJS 11 + Kong; no browser-side provider keys. | Central policy, cost control, key security. |
| ADR-AIGW-02 | AIDecision aggregate persists draft outputs; owning module accepts to finalise. | Matches clinical-safety guidance: AI output is assistive until signed by clinician or accepted by policy-approved automation. |
| ADR-AIGW-03 | Provider adapters implement a common ModelProvider port; selection by ProviderRoutingRule at runtime. | Supports Anthropic / OpenAI / Azure / on-prem without caller changes. |
| ADR-AIGW-04 | Events carry correlationId and provenanceId; never raw prompt text in default subjects. | FR-AI-006, FR-NFR-018 — PHI-safe eventing. |
| ADR-AIGW-05 | Circuit breaker per provider + per feature; fail closed gracefully when policy or moderation is unavailable. | FR-NFR-015, FR-NFR-017. |
| ADR-AIGW-06 | HITL queue backed by AIDecision rows; reviewer actions audited. | Clinical safety, compliance. |
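To make ADR-AIGW-03 and ADR-AIGW-05 more concrete, the sketch below selects a provider from an ordered routing rule and skips any provider whose circuit is open. Only the name `ProviderRoutingRule` comes from this document; its fields, the `"*"` wildcard convention, the `CircuitBreaker` class, and `selectProvider` are assumptions made for illustration. Residency and DPIA status are omitted, and the breaker is keyed per provider only.

```typescript
// Fields are assumed; the document names ProviderRoutingRule but not its shape.
interface ProviderRoutingRule {
  tenantId: string;    // "*" matches any tenant (assumed convention)
  featureKey: string;
  providers: string[]; // ordered: primary first, then fallbacks
}

// Minimal count-based breaker: opens after `threshold` consecutive failures
// and stays open for `cooldownMs`.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAt = now;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  isOpen(now: number): boolean {
    return this.openedAt !== null && now - this.openedAt < this.cooldownMs;
  }
}

interface ProviderSelection {
  provider: string;
  degraded: boolean; // true when a fallback is used instead of the primary
}

function selectProvider(
  rules: ProviderRoutingRule[],
  tenantId: string,
  featureKey: string,
  breakers: Map<string, CircuitBreaker>,
  now: number,
): ProviderSelection | null {
  const candidates = rules.filter(
    (r) => r.featureKey === featureKey && (r.tenantId === tenantId || r.tenantId === "*"),
  );
  // Prefer a tenant-specific rule over the wildcard rule.
  const rule = candidates.find((r) => r.tenantId === tenantId) ?? candidates[0];
  if (!rule) return null;

  let index = 0;
  for (const name of rule.providers) {
    const breaker = breakers.get(name);
    if (!breaker || !breaker.isOpen(now)) {
      return { provider: name, degraded: index > 0 };
    }
    index += 1;
  }
  return null; // every provider's circuit is open; the caller fails the request
}
```

In this sketch a `degraded: true` result is the point at which the gateway would emit ai.provider.degraded, and the returned name would resolve to an adapter implementing the ModelProvider port.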
8. System context (mermaid)
9. Canonical flows
| Flow | Trigger | Outcome |
|---|---|---|
| Assistive draft | POST /v1/ai/assist from any service | Returns {draftText, isDraft:true, provenance, decisionId} or policy deny |
| HITL review | Reviewer accepts/rejects via POST /v1/ai/decisions/:id/review | On accept, emits ai.decision.accepted (consumed by the owning module); on reject, emits ai.decision.rejected |
| Provider fallback | Primary provider circuit open | Secondary provider used; ai.provider.degraded emitted |
| Moderation block | Pre-moderation flags prompt | 422 returned with AI_MODERATION_BLOCKED; ai.moderation.flagged emitted |
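The HITL review flow can be sketched as a state transition on the AIDecision aggregate. The event names and the `decisionId`, `provenance`, and `correlationId` concepts come from this document; the field layout, the `reviewDecision` function, and the rule that only a draft can be reviewed are assumptions of this sketch.

```typescript
// Assumed aggregate shape; only the draft state and the event names are from the document.
type DecisionState = "draft" | "accepted" | "rejected";

interface AIDecision {
  id: string;
  state: DecisionState;
  draftText: string;
  provenanceId: string;
  reviewerId?: string;
}

// The event carries identifiers only, never the draft text (PHI-safe eventing).
interface DecisionReviewedEvent {
  type: "ai.decision.accepted" | "ai.decision.rejected";
  decisionId: string;
  provenanceId: string;
  correlationId: string;
}

function reviewDecision(
  decision: AIDecision,
  action: "accept" | "reject",
  reviewerId: string,
  correlationId: string,
): { decision: AIDecision; event: DecisionReviewedEvent } {
  // Assumption: a decision can be reviewed exactly once, from the draft state.
  if (decision.state !== "draft") {
    throw new Error(`AIDecision ${decision.id} is already ${decision.state}`);
  }
  const state: DecisionState = action === "accept" ? "accepted" : "rejected";
  const type: DecisionReviewedEvent["type"] =
    action === "accept" ? "ai.decision.accepted" : "ai.decision.rejected";
  return {
    decision: { ...decision, state, reviewerId },
    event: {
      type,
      decisionId: decision.id,
      provenanceId: decision.provenanceId,
      correlationId,
    },
  };
}
```

Returning a new aggregate value together with the event keeps the transition free of side effects; persisting the row and publishing the event would be done by the caller.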
10. Source reconciliation
Legacy _sources/ai-orchestrator/ specified a baseline "assist-only" orchestrator with policy + quota + mock provider. This doc widens the remit to a full gateway: moderation, multi-provider routing, HITL, AIProvenance as a first-class aggregate, and cross-service event coverage. Legacy FR IDs (FR-AI-002..006, FR-NFR-015..018) are preserved and mapped to the new FR-AIGW-* namespace in EPICS.md and USER_STORIES.md. AIO-* prefixes are retained in legacy FR columns for traceability.