AI Gateway Service — API Contracts

Status: populated · Owner: TBD · Last updated: 2026-04-17 · Companion: Service Template · 05 api-design · standards/API_PATH_CONVENTIONS.md

Base path (behind Kong): /api/v1/ai · Internal base: http://ai-gateway-service:3040

All routes require Authorization: Bearer <JWT> unless noted. JWT issuer: Keycloak realm per tenant. Required claims: sub, tenant_id, scope.
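A client or test harness may want to confirm that a token carries the three required claims before calling the gateway. The following is a minimal sketch, not the gateway's own validation: it only decodes the payload segment and does not verify the signature or issuer, which must still be checked against the tenant's Keycloak realm. The function names are hypothetical.

```python
import base64
import json

REQUIRED_CLAIMS = ("sub", "tenant_id", "scope")


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload_b64 = parts[1]
    # base64url segments omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def missing_claims(payload: dict) -> list[str]:
    """Return the required claims that are absent from the payload."""
    return [claim for claim in REQUIRED_CLAIMS if claim not in payload]
```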

1. Assist

POST /api/v1/ai/assist

Request an assistive draft. Returns draft text + provenance + decision id.

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer JWT |
| X-Correlation-Id | No | UUID; generated if absent |
| Idempotency-Key | No | ULID; replays return cached result |
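The headers above could be assembled on the client side roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, the `Content-Type` value is assumed because the body is JSON, and the ULID for `Idempotency-Key` is supplied by the caller since the Python standard library has no ULID generator. Sending `X-Correlation-Id` is optional; the sketch generates one locally so the caller can log it.

```python
import uuid
from typing import Optional


def build_assist_headers(jwt: str,
                         correlation_id: Optional[str] = None,
                         idempotency_key: Optional[str] = None) -> dict:
    """Build request headers for POST /api/v1/ai/assist."""
    headers = {
        "Authorization": f"Bearer {jwt}",
        "Content-Type": "application/json",  # assumption: JSON request body
        # The gateway generates an id if absent; generating it here lets the
        # caller correlate its own logs with the gateway's.
        "X-Correlation-Id": correlation_id or str(uuid.uuid4()),
    }
    if idempotency_key is not None:
        # Caller-supplied ULID; reuse the same key when retrying a request.
        headers["Idempotency-Key"] = idempotency_key
    return headers
```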

Request body

| Field | Type | Required | Constraints |
| --- | --- | --- | --- |
| featureKey | string | Yes | 3–128 chars; must be registered |
| resourceType | string | Yes | FHIR resource type or domain identifier |
| nodeId | string | No | Hierarchy node scope |
| inputs | object | Yes | Domain-specific; may include structured FHIR refs |
| instructions | string | No | Max 32000 chars; never logged verbatim |
| promptTemplateRef | {key,version} | No | Overrides tenant default |
| residencyHint | enum | No | AF, AE, EU, US, ON_PREM |
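The constraints in the request body table can be checked locally before the request is sent. The sketch below is illustrative only: the function name is hypothetical, and whether `featureKey` is registered can only be decided by the gateway, so that check is not attempted here.

```python
from typing import Optional

RESIDENCY_HINTS = {"AF", "AE", "EU", "US", "ON_PREM"}
MAX_INSTRUCTIONS_CHARS = 32000


def build_assist_body(feature_key: str, resource_type: str, inputs: dict,
                      node_id: Optional[str] = None,
                      instructions: Optional[str] = None,
                      prompt_template_ref: Optional[dict] = None,
                      residency_hint: Optional[str] = None) -> dict:
    """Validate the documented constraints and return the JSON body."""
    if not 3 <= len(feature_key) <= 128:
        raise ValueError("featureKey must be 3-128 chars")
    if instructions is not None and len(instructions) > MAX_INSTRUCTIONS_CHARS:
        raise ValueError("instructions exceeds 32000 chars")
    if residency_hint is not None and residency_hint not in RESIDENCY_HINTS:
        raise ValueError(f"residencyHint must be one of {sorted(RESIDENCY_HINTS)}")
    if prompt_template_ref is not None and set(prompt_template_ref) != {"key", "version"}:
        raise ValueError("promptTemplateRef must be {key, version}")

    body = {"featureKey": feature_key, "resourceType": resource_type, "inputs": inputs}
    optional = {
        "nodeId": node_id,
        "instructions": instructions,
        "promptTemplateRef": prompt_template_ref,
        "residencyHint": residency_hint,
    }
    # Omit optional fields that were not provided.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```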

200 response

{
  "decisionId": "dec_01H...",
  "draftText": "string | null",
  "isDraft": true,
  "hitlRequired": false,
  "provenance": {
    "provenanceId": "prv_01H...",
    "featureKey": "patient_chart.note_summary",
    "provider": "anthropic",
    "modelVersion": "anthropic:claude-sonnet-4:2026-04-01",
    "promptTemplate": { "key": "chart.note.summary", "version": "1.3.0" },
    "correlationId": "uuid",
    "tenantId": "ten_...",
    "actorId": "usr_...",
    "policyDecisionId": "pol_...",
    "moderation": { "input": "allow", "output": "allow" },
    "requestedAt": "ISO8601",
    "completedAt": "ISO8601"
  }
}
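A caller might map the 200 response onto a small typed structure so that the nullable `draftText` and the `hitlRequired` flag are handled explicitly. This is a sketch: the class and function names are hypothetical, and only a subset of the documented fields is extracted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssistResult:
    decision_id: str
    draft_text: Optional[str]  # documented as "string | null"
    hitl_required: bool
    provenance_id: str
    model_version: str


def parse_assist_response(body: dict) -> AssistResult:
    """Extract the fields a caller typically needs from the 200 response."""
    provenance = body["provenance"]
    return AssistResult(
        decision_id=body["decisionId"],
        draft_text=body.get("draftText"),
        hitl_required=bool(body["hitlRequired"]),
        provenance_id=provenance["provenanceId"],
        model_version=provenance["modelVersion"],
    )
```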

POST /api/v1/ai/assist/stream

Server-Sent Events stream of assist tokens. Same request body. Events: token, moderation, complete, error.
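The stream can be consumed with any standard Server-Sent Events parser. The sketch below implements the basic SSE framing (an `event:` line, one or more `data:` lines, and a blank line that dispatches the event) over an iterable of text lines. The payload format of each event's data is not specified in this document, so the parser returns it as a raw string; the function name is hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from SSE-framed text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

A consumer would then branch on the event name, for example appending `token` data to the draft and stopping on `complete` or `error`.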

2. Moderation

POST /api/v1/ai/moderate

Standalone moderation (used by patient-portal for free-text before calling assist).

| Field | Type | Required |
| --- | --- | --- |
| text | string | Yes |
| categories | string[] | No |

200: { "verdict": "allow|flag|block", "categories": [{"name":"self-harm","score":0.02}] }
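A caller such as patient-portal could gate the assist call on the moderation result along these lines. This is a sketch of one possible client policy, not documented gateway behavior: treating only `allow` as safe to proceed and the 0.5 score threshold are both assumptions, and the function names are hypothetical.

```python
def categories_above(result: dict, threshold: float = 0.5) -> list[str]:
    """Names of categories whose score meets the (assumed) threshold."""
    return [c["name"] for c in result.get("categories", []) if c["score"] >= threshold]


def may_call_assist(result: dict) -> bool:
    """Hypothetical client policy: proceed only on an "allow" verdict."""
    return result["verdict"] == "allow"
```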

3. Decisions (HITL)

GET /api/v1/ai/decisions/:id

Returns decision + provenance. Caller must be creator service, reviewer in same facility, or auditor.

POST /api/v1/ai/decisions/:id/review

| Field | Type | Required |
| --- | --- | --- |
| verdict | accepted or rejected | Yes |
| comment | string | If rejected |
| edits | string | No; reviewer may edit draft before accept |

Returns the updated decision. Emits ai_gateway.decision.accepted.v1 or ai_gateway.decision.rejected.v1.
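The conditional requirement on `comment` is easy to miss, so a client may want to enforce it before submitting a review. A minimal sketch, with a hypothetical function name:

```python
from typing import Optional

VERDICTS = ("accepted", "rejected")


def build_review_body(verdict: str,
                      comment: Optional[str] = None,
                      edits: Optional[str] = None) -> dict:
    """Build the body for POST /api/v1/ai/decisions/:id/review."""
    if verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}")
    if verdict == "rejected" and not comment:
        # comment is required when the verdict is rejected
        raise ValueError("comment is required when verdict is 'rejected'")
    body = {"verdict": verdict}
    if comment:
        body["comment"] = comment
    if edits is not None:
        body["edits"] = edits
    return body
```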

GET /api/v1/ai/decisions

List decisions, filtered.

| Param | Type | Default |
| --- | --- | --- |
| state | enum | |
| featureKey | string | |
| facilityId | string | reviewer's facility |
| from / to | ISO8601 | last 7 days |
| limit | int | 50 (max 500) |
| cursor | opaque | |
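The query string for the list endpoint could be built as follows. The sketch omits parameters the caller did not set so that the server-side defaults apply. The function name is hypothetical, and rejecting an out-of-range `limit` on the client (including the lower bound of 1) is an assumption; how the gateway itself handles such values is not stated here.

```python
from typing import Optional
from urllib.parse import urlencode

MAX_LIMIT = 500


def decisions_query(state: Optional[str] = None,
                    feature_key: Optional[str] = None,
                    facility_id: Optional[str] = None,
                    from_: Optional[str] = None,
                    to: Optional[str] = None,
                    limit: int = 50,
                    cursor: Optional[str] = None) -> str:
    """Return the query string for GET /api/v1/ai/decisions."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    params = {
        "state": state,
        "featureKey": feature_key,
        "facilityId": facility_id,
        "from": from_,
        "to": to,
        "limit": limit,
        "cursor": cursor,
    }
    # Unset filters are left out so the documented defaults apply.
    return urlencode({k: v for k, v in params.items() if v is not None})
```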

POST /api/v1/ai/decisions/:id/accept

Called by the owning service to signal "draft has been used". Body: { acceptedBy: userId, targetResource: {type,id} }. Idempotent. Emits ai_gateway.decision.accepted.v1.

4. Provenance

GET /api/v1/ai/provenance/:id

Returns immutable provenance. Used by audit-service and clinical services when rendering "AI-assisted" badges.

5. Admin

GET /api/v1/ai/admin/routing-rules

List active routing rules for the calling tenant. Scope: platform_admin or tenant_admin.

PUT /api/v1/ai/admin/routing-rules/:featureKey

Update routing rule. Body: { providers: [{provider, model, residency, priority}], fallback: [...] }.

GET /api/v1/ai/admin/prompt-templates

POST /api/v1/ai/admin/prompt-templates

Register a prompt template (scope: platform_admin). Body: { key, version, template, guardrails, featureKeys[] }. Decisions store only the template's hash.

GET /api/v1/ai/admin/quotas

Returns { featureKey, windowSec, limit, used }.

GET /api/v1/ai/admin/providers/health

Health per provider: latency p95, error rate, circuit state.

6. Error codes

| Code | HTTP | Meaning |
| --- | --- | --- |
| AI_POLICY_DENY | 403 | Access policy denied pre-inference |
| AI_CONSENT_REQUIRED | 403 | PHI feature without consent on file |
| AI_MODULE_NOT_LICENSED | 403 | Tenant not entitled |
| AI_QUOTA_EXCEEDED | 429 | Rolling window exceeded |
| AI_MODERATION_BLOCKED | 422 | Pre-moderation refused |
| AI_PROVIDER_UNAVAILABLE | 503 | All providers failed / circuit open |
| AI_DECISION_NOT_FOUND | 404 | |
| AI_DECISION_STATE_INVALID | 409 | Review on non-under_review decision |
| CROSS_TENANT | 403 | Payload references resources in a foreign tenant |
| AI_PROMPT_TEMPLATE_NOT_FOUND | 404 | Referenced template missing |
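Clients often mirror the error table as a lookup so they can branch on the code rather than on the HTTP status alone. In the sketch below the code-to-status mapping comes from the table; the set of codes treated as retryable is an assumption based on the usual meaning of 429 and 503, and the names are hypothetical.

```python
ERROR_HTTP_STATUS = {
    "AI_POLICY_DENY": 403,
    "AI_CONSENT_REQUIRED": 403,
    "AI_MODULE_NOT_LICENSED": 403,
    "AI_QUOTA_EXCEEDED": 429,
    "AI_MODERATION_BLOCKED": 422,
    "AI_PROVIDER_UNAVAILABLE": 503,
    "AI_DECISION_NOT_FOUND": 404,
    "AI_DECISION_STATE_INVALID": 409,
    "CROSS_TENANT": 403,
    "AI_PROMPT_TEMPLATE_NOT_FOUND": 404,
}

# Assumption: only quota and provider-availability failures are worth
# retrying; the remaining codes need a change to the request or to policy.
RETRYABLE_CODES = frozenset({"AI_QUOTA_EXCEEDED", "AI_PROVIDER_UNAVAILABLE"})


def is_retryable(code: str) -> bool:
    """True if a later retry of the same request could plausibly succeed."""
    return code in RETRYABLE_CODES
```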

7. Pagination

Cursor-based: { data: [], nextCursor: "opaque", total?: number }. Max page 500.
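Walking every page of a cursor-based listing can be sketched as a generator. The transport is injected as `fetch_page` (a hypothetical callable that takes the current cursor and returns one page envelope) so the loop stays independent of any HTTP client. Treating a missing or empty `nextCursor` as the end of the listing is an assumption.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages, following nextCursor until exhausted."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("nextCursor")
        if not cursor:  # assumption: absent or empty cursor means last page
            break
```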

8. FHIR mapping

AI outputs are not FHIR resources directly. When an accepted draft is finalised by an owning service, that service writes a Provenance resource linking the clinical resource (target) to this gateway's AIProvenance via agent.who.reference = Device/ai-gateway and entity.what.reference = DocumentReference/prompt-template-{hash}. See interop-service/API_CONTRACTS.md for the full mapping.
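As an illustration of the two references named above, an owning service might assemble the Provenance resource roughly like this. It is a sketch only: the function name is hypothetical, the `entity.role` value of `source` is an assumption (FHIR R4 requires a role on each entity), and the authoritative field mapping is the one in interop-service/API_CONTRACTS.md.

```python
def build_ai_provenance(target_ref: str, template_hash: str, recorded: str) -> dict:
    """Build a minimal FHIR Provenance dict for an AI-assisted resource."""
    return {
        "resourceType": "Provenance",
        # The clinical resource that was finalised from the accepted draft.
        "target": [{"reference": target_ref}],
        "recorded": recorded,  # FHIR instant, e.g. "2026-04-17T10:00:00Z"
        "agent": [{"who": {"reference": "Device/ai-gateway"}}],
        "entity": [{
            "role": "source",  # assumption: role is not specified in this document
            "what": {"reference": f"DocumentReference/prompt-template-{template_hash}"},
        }],
    }
```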

9. Rate limiting

Kong enforces per-tenant limits (60 req/min default, override per tenant). In-service quota enforces per-feature and per-window budget (FR-AIGW-QUOTA-001).
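To make the per-feature, per-window budget concrete, here is a small in-memory rolling-window counter. It is an illustration of the general technique, not the service's implementation (which is specified by FR-AIGW-QUOTA-001 and is not shown in this document); the class name is hypothetical and a real deployment would need shared storage across instances.

```python
from collections import deque
from typing import Deque, Dict


class RollingWindowQuota:
    """Allow at most `limit` calls per feature key within `window_sec` seconds."""

    def __init__(self, limit: int, window_sec: float) -> None:
        self.limit = limit
        self.window_sec = window_sec
        self._hits: Dict[str, Deque[float]] = {}

    def try_acquire(self, feature_key: str, now: float) -> bool:
        hits = self._hits.setdefault(feature_key, deque())
        # Drop timestamps that have fallen out of the rolling window.
        while hits and hits[0] <= now - self.window_sec:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # caller would respond with AI_QUOTA_EXCEEDED (429)
        hits.append(now)
        return True
```

Passing `now` explicitly keeps the sketch deterministic; production code would typically read a monotonic clock.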