AI Gateway Service — Service Overview

Status: populated | Owner: TBD | Last updated: 2026-04-17 | Companion: Service Template · 03 platform-services · 02 DDD

1. Purpose

The AI Gateway is the single controlled ingress for every AI/ML model call made by any Ghasi-eHealth service or client. It centralises policy enforcement, provider routing, moderation, human-in-the-loop (HITL) orchestration, provenance capture, and quota control. As a result, no raw model API key ever reaches a browser, mobile client, or unreviewed service, and every clinical AI output can be traced to an auditable decision.

No AI output may flow to a patient chart, clinical note, order, portal response, or triage suggestion without (a) an AIProvenance record, (b) policy and consent checks evaluated before inference, (c) moderation applied where required, and (d) HITL sign-off when the feature is classified as clinical-decision-adjacent.
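
The four conditions above can be read as a release gate. The following is a minimal sketch of such a gate, not the service's actual implementation: the `AiOutput` shape, its field names, and the `releaseBlockers` function are all hypothetical and chosen only to mirror conditions (a) through (d).

```typescript
// Hypothetical shape: field names are illustrative, not taken from the service schema.
interface AiOutput {
  provenanceId?: string;             // (a) reference to an AIProvenance record
  policyDecisionId?: string;         // (b) result of the pre-inference policy/consent check
  moderationRequired: boolean;
  moderationPassed?: boolean;        // (c) moderation outcome, when moderation is required
  clinicalDecisionAdjacent: boolean;
  hitlSignedOffBy?: string;          // (d) reviewer identity, when sign-off is required
}

// Returns every unmet condition; an empty list means the output may flow onward.
function releaseBlockers(output: AiOutput): string[] {
  const blockers: string[] = [];
  if (!output.provenanceId) blockers.push("missing AIProvenance record");
  if (!output.policyDecisionId) blockers.push("policy/consent not evaluated");
  if (output.moderationRequired && output.moderationPassed !== true) {
    blockers.push("moderation not passed");
  }
  if (output.clinicalDecisionAdjacent && !output.hitlSignedOffBy) {
    blockers.push("HITL sign-off missing");
  }
  return blockers;
}

// A clinical-decision-adjacent draft that has not yet been reviewed.
const draft: AiOutput = {
  provenanceId: "prov-1",
  policyDecisionId: "pol-1",
  moderationRequired: true,
  moderationPassed: true,
  clinicalDecisionAdjacent: true,
};
const blockers = releaseBlockers(draft);
```

Returning the full list of blockers, rather than a single boolean, is one way to make a denied release explainable in an audit record.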

2. Bounded context

Bounded context: platform.ai_gateway. This service belongs to the platform plane (not the clinical plane) and sits between any caller (clinical, portal, virtual-care, immunizations, medication, interop, research) and any model provider (Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM / Ollama, local medical models).

3. Responsibilities

| # | Responsibility |
|---|---|
| R1 | Expose one authenticated HTTPS entry (POST /v1/ai/assist, POST /v1/ai/moderate, POST /v1/ai/decisions/:id/review) for every AI use case. |
| R2 | Evaluate access policy, module entitlement, tenant quota, feature flag, and consent (for PHI-touching features) before any inference. Fail closed. |
| R3 | Route the request to the appropriate provider using a provider-selection matrix (tenant config, feature key, residency, DPIA status, fallback). |
| R4 | Apply pre- and post-moderation (safety classifiers, PHI minimisation, prompt-injection defenses). |
| R5 | Orchestrate HITL: persist model output as AIDecision in draft state, notify reviewer queue, accept or reject, emit accepted event with provenance. |
| R6 | Stamp every output with AIProvenance (model, model version, provider, prompt template id, tenant, actor, correlation, timestamps, policy decision id, moderation outcome). |
| R7 | Emit structured events (ai.assist.requested/completed/failed, ai.decision.created/accepted/rejected, ai.moderation.flagged, ai.provider.degraded) for audit and observability. |
| R8 | Enforce per-tenant and per-feature rolling quotas; protect providers with circuit breakers and timeouts. |
| R9 | Redact raw prompts/transcripts from default logs and events (PHI-safe) unless a DPIA-approved retention policy is active. |
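
To illustrate what "fail closed" in R2 could mean in practice, here is a minimal sketch under stated assumptions. The `Check` and `CheckResult` types and the `evaluatePreInference` function are hypothetical; the checks are synchronous only to keep the example short, whereas real policy, quota, and consent lookups would likely be asynchronous calls to other services.

```typescript
// Hypothetical types: not part of the documented service contract.
type CheckResult = { allowed: boolean; reason?: string };
type Check = { name: string; run: () => CheckResult };

// Runs checks in order and stops at the first deny.
// A check that throws (for example, its backing service is unreachable)
// is treated as a deny, which is the fail-closed behaviour.
function evaluatePreInference(checks: Check[]): CheckResult {
  for (const check of checks) {
    try {
      const result = check.run();
      if (!result.allowed) {
        return { allowed: false, reason: `${check.name}: ${result.reason ?? "denied"}` };
      }
    } catch {
      return { allowed: false, reason: `${check.name}: unavailable` };
    }
  }
  return { allowed: true };
}

// Example: the quota check cannot be evaluated, so inference is refused
// and the consent check is never reached.
const preInference = evaluatePreInference([
  { name: "access-policy", run: () => ({ allowed: true }) },
  { name: "quota", run: () => { throw new Error("quota store unreachable"); } },
  { name: "consent", run: () => ({ allowed: true }) },
]);
```

The key design point is that an error is never interpreted as an allow: the only path to `allowed: true` is every check returning it explicitly.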

4. Non-responsibilities

| # | Not responsible for |
|---|---|
| NR1 | Persisting accepted AI content into a clinical artifact: the owning module (patient-chart, medication, orders, etc.) accepts the draft and writes the signed record. |
| NR2 | Training or fine-tuning models. Training pipelines live outside the runtime plane. |
| NR3 | Full retrieval-augmented generation (RAG). A thin context-fetch adapter is in scope; a production RAG index is out of scope (tracked in roadmap). |
| NR4 | Standalone ABAC evaluation: delegated to access-policy inside identity-service. |
| NR5 | FHIR resource CRUD: owned by interop-service and clinical services. |

5. Upstream / downstream dependencies

| Direction | Service | Purpose |
|---|---|---|
| Upstream (callers) | patient-chart-service | Clinical note summarisation, differential-diagnosis assist |
| Upstream | medication-service | Drug-interaction narrative, med-reconciliation assist |
| Upstream | radiology-service | Imaging pre-read assist (never autonomous) |
| Upstream | laboratory-service | Result narrative / critical-value explanation |
| Upstream | patient-portal-service | Patient-facing triage / symptom-checker assist |
| Upstream | virtual-care-service | Encounter summary, SOAP scaffold |
| Upstream | interop-service | Inbound document classification |
| Upstream | communication-service | Message-draft assistance |
| Downstream | identity-service | JWT validation, access-policy evaluation |
| Downstream | config-service | Feature flags, provider routing matrix, prompt template resolution |
| Downstream | audit-service | Ingests all ai.* events (tamper-evident trail) |
| Downstream | communication-service | Reviewer queue notifications for HITL |
| Downstream | External AI providers | Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM, local Ollama |

6. Slice involvement

| Slice | Involvement |
|---|---|
| S0 (platform foundation) | Mandatory: single AI ingress must exist before any feature uses AI |
| S1 (EHR core) | Required for clinical-notes summarisation, allergies reconciliation hints |
| S2 (portal + virtual care) | Required for triage, encounter summary |
| S3 (population health / research) | Required for cohort-explanation assist; de-identified data plane |

7. Key architectural decisions

| ADR | Decision | Rationale |
|---|---|---|
| ADR-AIGW-01 | Single ingress via NestJS 11 + Kong; no browser-side provider keys. | Central policy, cost control, key security. |
| ADR-AIGW-02 | AIDecision aggregate persists draft outputs; owning module accepts to finalise. | Matches clinical-safety guidance: AI output is assistive until signed by clinician or accepted by policy-approved automation. |
| ADR-AIGW-03 | Provider adapters implement a common ModelProvider port; selection by ProviderRoutingRule at runtime. | Supports Anthropic / OpenAI / Azure / on-prem without caller changes. |
| ADR-AIGW-04 | Events carry correlationId and provenanceId; never raw prompt text in default subjects. | FR-AI-006, FR-NFR-018: PHI-safe eventing. |
| ADR-AIGW-05 | Circuit breaker per provider + per feature; graceful fail-closed when policy or moderation unavailable. | FR-NFR-015, FR-NFR-017. |
| ADR-AIGW-06 | HITL queue backed by AIDecision rows; reviewer actions audited. | Clinical safety, compliance. |
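
ADR-AIGW-03 and ADR-AIGW-05 together suggest a selection step that picks a provider by rule and skips any provider whose circuit is open. The sketch below shows one possible shape for that step. Only the names `ModelProvider` and `ProviderRoutingRule` come from this document; their fields, the `selectProvider` function, the `circuitOpen` lookup, and the placeholder provider names are assumptions made for illustration.

```typescript
// Port name from ADR-AIGW-03; the method signature is an assumption
// (synchronous here only to keep the sketch short).
interface ModelProvider {
  readonly name: string;
  complete(prompt: string): string;
}

// Rule name from ADR-AIGW-03; the fields are hypothetical.
interface ProviderRoutingRule {
  featureKey: string;
  primary: string;
  fallback?: string;
}

interface Selection {
  provider: ModelProvider;
  degraded: boolean; // true when the primary was skipped
}

// Picks the first candidate that is registered and whose circuit is closed.
// Returns undefined when no candidate is usable, so the caller can fail closed.
function selectProvider(
  rule: ProviderRoutingRule,
  providers: Record<string, ModelProvider>,
  circuitOpen: Record<string, boolean>,
): Selection | undefined {
  const candidates = rule.fallback ? [rule.primary, rule.fallback] : [rule.primary];
  for (const name of candidates) {
    const provider = providers[name];
    if (provider !== undefined && circuitOpen[name] !== true) {
      return { provider, degraded: name !== rule.primary };
    }
  }
  return undefined;
}

// Stub adapters standing in for real provider integrations.
const stub = (name: string): ModelProvider => ({
  name,
  complete: (prompt) => `[${name}] ${prompt}`,
});
const registry: Record<string, ModelProvider> = {
  "provider-a": stub("provider-a"),
  "provider-b": stub("provider-b"),
};
const rule: ProviderRoutingRule = {
  featureKey: "note-summary",
  primary: "provider-a",
  fallback: "provider-b",
};

// The primary's circuit is open, so the fallback is selected and flagged as degraded.
const selected = selectProvider(rule, registry, { "provider-a": true });
```

Because callers only see the `ModelProvider` port, swapping or adding a provider adapter would not require caller changes. The `degraded` flag is one way the gateway could decide when to emit ai.provider.degraded.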

8. System context (mermaid)

9. Canonical flows

| Flow | Trigger | Outcome |
|---|---|---|
| Assistive draft | POST /v1/ai/assist from any service | Returns {draftText, isDraft:true, provenance, decisionId} or policy deny |
| HITL review | Reviewer accepts/rejects via POST /v1/ai/decisions/:id/review | Emits ai.decision.accepted consumed by owning module |
| Provider fallback | Primary provider circuit open | Secondary provider used; ai.provider.degraded emitted |
| Moderation block | Pre-moderation flags prompt | 422 returned with AI_MODERATION_BLOCKED; ai.moderation.flagged emitted |
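
The HITL review flow can be sketched as a small state transition on an AIDecision. This is an illustrative sketch only: the `AIDecision` fields, the `DecisionEvent` shape, and the `reviewDecision` function are assumptions, while the event names ai.decision.accepted and ai.decision.rejected are taken from R7.

```typescript
type DecisionStatus = "draft" | "accepted" | "rejected";

// Hypothetical fields: the real aggregate is not specified in this overview.
interface AIDecision {
  id: string;
  status: DecisionStatus;
  provenanceId: string;
  reviewedBy?: string;
}

// The event carries identifiers only, never draft or prompt text (see ADR-AIGW-04).
interface DecisionEvent {
  type: "ai.decision.accepted" | "ai.decision.rejected";
  decisionId: string;
  provenanceId: string;
}

// Applies a reviewer verdict to a draft decision and returns the updated
// decision together with the event to publish. Only drafts can be reviewed.
function reviewDecision(
  decision: AIDecision,
  reviewerId: string,
  verdict: "accept" | "reject",
): { decision: AIDecision; event: DecisionEvent } {
  if (decision.status !== "draft") {
    throw new Error(`decision ${decision.id} is already ${decision.status}`);
  }
  const status: DecisionStatus = verdict === "accept" ? "accepted" : "rejected";
  return {
    decision: { ...decision, status, reviewedBy: reviewerId },
    event: {
      type: verdict === "accept" ? "ai.decision.accepted" : "ai.decision.rejected",
      decisionId: decision.id,
      provenanceId: decision.provenanceId,
    },
  };
}

const pending: AIDecision = { id: "dec-1", status: "draft", provenanceId: "prov-1" };
const reviewed = reviewDecision(pending, "reviewer-1", "accept");
```

Rejecting a second review of the same decision keeps the audit trail unambiguous: each AIDecision ends in exactly one terminal state with one recorded reviewer.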

10. Source reconciliation

Legacy _sources/ai-orchestrator/ specified a baseline "assist-only" orchestrator with policy + quota + mock provider. This doc widens the remit to a full gateway: moderation, multi-provider routing, HITL, AIProvenance as a first-class aggregate, and cross-service event coverage. Legacy FR IDs (FR-AI-002..006, FR-NFR-015..018) are preserved and mapped to the new FR-AIGW-* namespace in EPICS.md and USER_STORIES.md. AIO-* prefixes are retained in legacy FR columns for traceability.