AI Gateway Service — Service Overview

Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · 03 platform-services · 02 DDD

1. Purpose

The AI Gateway is the single controlled ingress for every AI/ML model call made by any Ghasi-eHealth service or client. It centralises policy enforcement, provider routing, moderation, human-in-the-loop (HITL) orchestration, provenance capture, and quota control so that no raw model API key ever reaches a browser, mobile client, or unreviewed service, and so that every clinical AI output can be traced to an auditable decision.

No AI output may flow to a patient chart, clinical note, order, portal response, or triage suggestion without (a) an AIProvenance record, (b) policy and consent checks evaluated before inference, (c) moderation applied where required, and (d) HITL sign-off when the feature is classified as clinical-decision-adjacent.
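The four conditions above can be read as a single release check. The following is a minimal sketch only: the `OutputEnvelope` shape, its field names, and the blocker codes are assumptions made for illustration and are not taken from the service's schema.

```typescript
// Hypothetical shapes: field names are illustrative, not the service's schema.
type HitlStatus = "not_required" | "pending" | "accepted" | "rejected";

interface OutputEnvelope {
  provenanceId?: string;                     // (a) reference to an AIProvenance record
  policyDecisionId?: string;                 // (b) pre-inference policy/consent decision
  moderationRequired: boolean;
  moderationOutcome?: "passed" | "flagged";  // (c) moderation result, if any
  clinicalDecisionAdjacent: boolean;
  hitlStatus: HitlStatus;                    // (d) reviewer sign-off state
}

// Returns the unmet conditions; an empty list means the output may be released.
function releaseBlockers(o: OutputEnvelope): string[] {
  const blockers: string[] = [];
  if (!o.provenanceId) blockers.push("missing_provenance");
  if (!o.policyDecisionId) blockers.push("missing_policy_decision");
  if (o.moderationRequired && o.moderationOutcome !== "passed") {
    blockers.push("moderation_not_passed");
  }
  if (o.clinicalDecisionAdjacent && o.hitlStatus !== "accepted") {
    blockers.push("hitl_not_accepted");
  }
  return blockers;
}
```

Returning the list of blockers rather than a boolean is a design choice of this sketch: it lets a caller log or audit which condition failed without exposing any prompt text.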

2. Bounded context

Bounded context: platform.ai_gateway. This service belongs to the platform plane (not the clinical plane) and sits between any caller (clinical, portal, virtual-care, immunizations, medication, interop, research) and any model provider (Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM / Ollama, local medical models).

3. Responsibilities

#	Responsibility
R1	Expose a single authenticated HTTPS entry point (`POST /v1/ai/assist`, `POST /v1/ai/moderate`, `POST /v1/ai/decisions/:id/review`) for every AI use case.
R2	Evaluate access policy, module entitlement, tenant quota, feature flag, and consent (for PHI-touching features) before any inference. Fail closed.
R3	Route the request to the appropriate provider using a provider-selection matrix (tenant config, feature key, residency, DPIA status, fallback).
R4	Apply pre- and post-moderation (safety classifiers, PHI minimisation, prompt-injection defenses).
R5	Orchestrate HITL: persist model output as `AIDecision` in `draft` state, notify reviewer queue, accept or reject, emit accepted event with provenance.
R6	Stamp every output with `AIProvenance` (model, model version, provider, prompt template id, tenant, actor, correlation, timestamps, policy decision id, moderation outcome).
R7	Emit structured events (`ai.assist.requested/completed/failed`, `ai.decision.created/accepted/rejected`, `ai.moderation.flagged`, `ai.provider.degraded`) for audit and observability.
R8	Enforce per-tenant and per-feature rolling quotas; protect providers with circuit breakers and timeouts.
R9	Redact raw prompts/transcripts from default logs and events (PHI-safe) unless a DPIA-approved retention policy is active.
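To illustrate R8, a rolling quota can be modelled as a sliding window of call timestamps keyed by tenant and feature. This is a sketch under stated assumptions: the `RollingQuota` class, its in-memory storage, and the sliding-window algorithm itself are hypothetical choices, since the source does not say how quotas are implemented or stored.

```typescript
// Sliding-window quota keyed by tenant and feature. In-memory only; a real
// deployment would need shared storage. All names here are hypothetical.
class RollingQuota {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;    // maximum calls allowed per window
  private readonly windowMs: number; // window length in milliseconds

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the call and returns true if it fits in the window, else false.
  // The clock is passed in so the behaviour is deterministic in tests.
  tryConsume(tenantId: string, featureKey: string, nowMs: number): boolean {
    const key = `${tenantId}:${featureKey}`;
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that are still inside the rolling window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(nowMs);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

A caller would typically treat a `false` result the same way as a policy deny, so that quota exhaustion also stops the request before any inference.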

4. Non-responsibilities

#	Not responsible for
NR1	Persisting accepted AI content into a clinical artifact — the owning module (patient-chart, medication, orders, etc.) accepts the draft and writes the signed record.
NR2	Training or fine-tuning models. Training pipelines live outside the runtime plane.
NR3	Full retrieval-augmented generation (RAG). A thin context-fetch adapter is in scope; a production RAG index is out of scope (tracked in roadmap).
NR4	Standalone ABAC evaluation — delegated to `access-policy` inside identity-service.
NR5	FHIR resource CRUD — owned by interop-service and clinical services.

5. Upstream / downstream dependencies

Direction	Service	Purpose
Upstream (callers)	patient-chart-service	Clinical note summarisation, differential-diagnosis assist
Upstream	medication-service	Drug-interaction narrative, med-reconciliation assist
Upstream	radiology-service	Imaging pre-read assist (never autonomous)
Upstream	laboratory-service	Result narrative / critical-value explanation
Upstream	patient-portal-service	Patient-facing triage / symptom-checker assist
Upstream	virtual-care-service	Encounter summary, SOAP scaffold
Upstream	interop-service	Inbound document classification
Upstream	communication-service	Message-draft assistance
Downstream	identity-service	JWT validation, access-policy evaluation
Downstream	config-service	Feature flags, provider routing matrix, prompt template resolution
Downstream	audit-service	Ingests all `ai.*` events (tamper-evident trail)
Downstream	communication-service	Reviewer queue notifications for HITL
Downstream	External AI providers	Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, on-prem vLLM, local Ollama

6. Slice involvement

Slice	Involvement
S0 (platform foundation)	Mandatory — single AI ingress must exist before any feature uses AI
S1 (EHR core)	Required for clinical-note summarisation, allergy-reconciliation hints
S2 (portal + virtual care)	Required for triage, encounter summary
S3 (population health / research)	Required for cohort-explanation assist; de-identified data plane

7. Key architectural decisions

ADR	Decision	Rationale
ADR-AIGW-01	Single ingress via NestJS 11 + Kong; no browser-side provider keys.	Central policy, cost control, key security.
ADR-AIGW-02	AIDecision aggregate persists draft outputs; owning module accepts to finalise.	Matches clinical-safety guidance: AI output is assistive until signed by clinician or accepted by policy-approved automation.
ADR-AIGW-03	Provider adapters implement a common `ModelProvider` port; selection by `ProviderRoutingRule` at runtime.	Supports Anthropic / OpenAI / Azure / on-prem without caller changes.
ADR-AIGW-04	Events carry `correlationId` and `provenanceId`; never raw prompt text in default subjects.	FR-AI-006, FR-NFR-018 — PHI-safe eventing.
ADR-AIGW-05	Circuit breaker per provider + per feature; graceful fail-closed when policy or moderation is unavailable.	FR-NFR-015, FR-NFR-017.
ADR-AIGW-06	HITL queue backed by `AIDecision` rows; reviewer actions audited.	Clinical safety, compliance.
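ADR-AIGW-03 and ADR-AIGW-05 together suggest a routing step that picks a rule and then skips providers whose circuit is open. The sketch below is one possible reading. The source names `ModelProvider` and `ProviderRoutingRule` but does not define them, so every field, the `selectProvider` function, and the `isCircuitOpen` callback are assumptions.

```typescript
// Hypothetical port: the method and field names are illustrative only.
interface ModelProvider {
  readonly name: string;
  complete(prompt: string): Promise<string>;
}

interface ProviderRoutingRule {
  featureKey: string;
  tenantId?: string;   // omitted means "any tenant"
  providers: string[]; // ordered: primary first, then fallbacks
}

// Picks the most specific matching rule, then the first registered provider
// whose circuit is closed. Returns undefined when nothing is usable, in which
// case the caller is expected to fail closed.
function selectProvider(
  rules: ProviderRoutingRule[],
  registry: Map<string, ModelProvider>,
  isCircuitOpen: (providerName: string) => boolean,
  tenantId: string,
  featureKey: string,
): { provider: ModelProvider; degraded: boolean } | undefined {
  const candidates = rules.filter(
    (r) => r.featureKey === featureKey && (r.tenantId === undefined || r.tenantId === tenantId),
  );
  // A tenant-specific rule wins over a tenant-agnostic one.
  const rule = candidates.find((r) => r.tenantId === tenantId) ?? candidates[0];
  if (!rule) return undefined;

  let position = 0;
  for (const name of rule.providers) {
    const provider = registry.get(name);
    if (provider && !isCircuitOpen(name)) {
      // degraded is true whenever the primary (position 0) was skipped.
      return { provider, degraded: position > 0 };
    }
    position++;
  }
  return undefined;
}
```

In this sketch the `degraded` flag is what a caller could use to decide whether to emit `ai.provider.degraded`. Residency and DPIA status, which the routing matrix also considers, are left out to keep the example short.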

8. System context (mermaid)

9. Canonical flows

Flow	Trigger	Outcome
Assistive draft	`POST /v1/ai/assist` from any service	Returns `{draftText, isDraft:true, provenance, decisionId}` or policy deny
HITL review	Reviewer accepts/rejects via `POST /v1/ai/decisions/:id/review`	Emits `ai.decision.accepted` consumed by owning module
Provider fallback	Primary provider circuit open	Secondary provider used; `ai.provider.degraded` emitted
Moderation block	Pre-moderation flags prompt	422 returned with `AI_MODERATION_BLOCKED`; `ai.moderation.flagged` emitted
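The HITL review flow can be pictured as a small state transition on `AIDecision`. The following is an illustrative sketch, not the service's implementation: the `AIDecision` and `DecisionEvent` field names, the `reviewDecision` function, and the `verdict` values are assumptions. Only the `draft` state and the `ai.decision.accepted` / `ai.decision.rejected` event names come from this document.

```typescript
type DecisionState = "draft" | "accepted" | "rejected";

// Hypothetical aggregate shape; the real AIDecision fields are not given here.
interface AIDecision {
  id: string;
  state: DecisionState;
  provenanceId: string;
  reviewerId?: string;
}

// The event carries identifiers only, never draft or prompt text.
interface DecisionEvent {
  type: "ai.decision.accepted" | "ai.decision.rejected";
  decisionId: string;
  provenanceId: string;
  correlationId: string;
}

// Applies a reviewer verdict to a draft decision and builds the event to emit.
// Throws if the decision has already been reviewed.
function reviewDecision(
  decision: AIDecision,
  verdict: "accept" | "reject",
  reviewerId: string,
  correlationId: string,
): { decision: AIDecision; event: DecisionEvent } {
  if (decision.state !== "draft") {
    throw new Error(`decision ${decision.id} is already ${decision.state}`);
  }
  const accepted = verdict === "accept";
  return {
    decision: { ...decision, state: accepted ? "accepted" : "rejected", reviewerId },
    event: {
      type: accepted ? "ai.decision.accepted" : "ai.decision.rejected",
      decisionId: decision.id,
      provenanceId: decision.provenanceId,
      correlationId,
    },
  };
}
```

Returning a new object instead of mutating the input keeps the transition easy to test and leaves persistence and event publishing to the caller.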

10. Source reconciliation

Legacy _sources/ai-orchestrator/ specified a baseline "assist-only" orchestrator with policy + quota + mock provider. This doc widens the remit to a full gateway: moderation, multi-provider routing, HITL, AIProvenance as a first-class aggregate, and cross-service event coverage. Legacy FR IDs (FR-AI-002..006, FR-NFR-015..018) are preserved and mapped to the new FR-AIGW-* namespace in EPICS.md and USER_STORIES.md. AIO-* prefixes are retained in legacy FR columns for traceability.
