AI Gateway Service — API Contracts
Status: populated · Owner: TBD · Last updated: 2026-04-17 · Companion: Service Template, 05 api-design, standards/API_PATH_CONVENTIONS.md
Base path (behind Kong): /api/v1/ai · Internal base: http://ai-gateway-service:3040
All routes require Authorization: Bearer <JWT> unless noted. JWT issuer: Keycloak realm per tenant. Required claims: sub, tenant_id, scope.
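As a rough illustration of the required claims, the sketch below decodes a JWT payload and reports which of `sub`, `tenant_id`, and `scope` are missing. It deliberately does not verify the signature or the issuer; a real caller or middleware would do that against the tenant's Keycloak realm keys. The helper names `decode_jwt_payload` and `missing_claims` are hypothetical and not part of this service.

```python
import base64
import json

# Claims this contract lists as required on every authenticated route.
REQUIRED_CLAIMS = ("sub", "tenant_id", "scope")


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload_b64 = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def missing_claims(claims: dict) -> list[str]:
    """Return the required claim names that are absent or empty."""
    return [name for name in REQUIRED_CLAIMS if not claims.get(name)]
```

This kind of check is only useful for early client-side diagnostics; the authoritative validation is assumed to happen at the gateway.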
1. Assist
POST /api/v1/ai/assist
Request an assistive draft. Returns the draft text, its provenance, and a decision ID.
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer JWT |
| X-Correlation-Id | No | UUID; generated if absent |
| Idempotency-Key | No | ULID; replays return cached result |
Request body
| Field | Type | Required | Constraints |
|---|---|---|---|
| featureKey | string | Yes | 3–128 chars; must be registered |
| resourceType | string | Yes | FHIR resource type or domain identifier |
| nodeId | string | No | Hierarchy node scope |
| inputs | object | Yes | Domain-specific; may include structured FHIR refs |
| instructions | string | No | Max 32000 chars; never logged verbatim |
| promptTemplateRef | {key,version} | No | Overrides tenant default |
| residencyHint | enum | No | AF, AE, EU, US, ON_PREM |
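The constraints in the request body table can be checked client-side before calling the endpoint. The following is a minimal sketch, assuming the body is already parsed into a dict; `validate_assist_request` is a hypothetical helper, and it cannot check whether `featureKey` is registered, since that is only known to the gateway.

```python
RESIDENCY_HINTS = {"AF", "AE", "EU", "US", "ON_PREM"}


def validate_assist_request(body: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means the body looks valid."""
    errors = []

    feature_key = body.get("featureKey")
    if not isinstance(feature_key, str) or not 3 <= len(feature_key) <= 128:
        errors.append("featureKey must be a string of 3-128 chars")

    resource_type = body.get("resourceType")
    if not isinstance(resource_type, str) or not resource_type:
        errors.append("resourceType is required")

    if not isinstance(body.get("inputs"), dict):
        errors.append("inputs must be an object")

    instructions = body.get("instructions")
    if instructions is not None and (
        not isinstance(instructions, str) or len(instructions) > 32000
    ):
        errors.append("instructions must be a string of at most 32000 chars")

    template = body.get("promptTemplateRef")
    if template is not None and not (
        isinstance(template, dict) and {"key", "version"} <= template.keys()
    ):
        errors.append("promptTemplateRef must contain key and version")

    hint = body.get("residencyHint")
    if hint is not None and (not isinstance(hint, str) or hint not in RESIDENCY_HINTS):
        errors.append("residencyHint must be one of AF, AE, EU, US, ON_PREM")

    return errors
```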
200 response
{
  "decisionId": "dec_01H...",
  "draftText": "string | null",
  "isDraft": true,
  "hitlRequired": false,
  "provenance": {
    "provenanceId": "prv_01H...",
    "featureKey": "patient_chart.note_summary",
    "provider": "anthropic",
    "modelVersion": "anthropic:claude-sonnet-4:2026-04-01",
    "promptTemplate": { "key": "chart.note.summary", "version": "1.3.0" },
    "correlationId": "uuid",
    "tenantId": "ten_...",
    "actorId": "usr_...",
    "policyDecisionId": "pol_...",
    "moderation": { "input": "allow", "output": "allow" },
    "requestedAt": "ISO8601",
    "completedAt": "ISO8601"
  }
}
POST /api/v1/ai/assist/stream
Server-Sent Events stream of assist tokens. Same request body. Events: token, moderation, complete, error.
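A client consuming this stream needs to split the response into named events. The sketch below parses standard Server-Sent Events framing (`event:` and `data:` fields, blank line as terminator) from an iterable of text lines. The payload carried in each `data:` field is not specified in this contract, so the function returns it as a raw string; `parse_sse` is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

A caller would typically dispatch on the event name, appending `token` data to the draft and stopping on `complete` or `error`.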
2. Moderation
POST /api/v1/ai/moderate
Standalone moderation (used by patient-portal for free-text before calling assist).
| Field | Type | Required |
|---|---|---|
| text | string | Yes |
| categories | string[] | No |
200: { "verdict": "allow|flag|block", "categories": [{"name":"self-harm","score":0.02}] }
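A caller such as patient-portal might gate its assist call on the moderation verdict. The sketch below works on the 200 response shape shown above. How a `flag` verdict should be handled is not stated in this contract, so it is exposed as an explicit `allow_flagged` option; both helper names are hypothetical.

```python
def may_proceed(moderation: dict, allow_flagged: bool = False) -> bool:
    """Decide whether to continue to the assist call based on the verdict."""
    verdict = moderation.get("verdict")
    if verdict == "allow":
        return True
    if verdict == "flag":
        # Assumption: treatment of flagged text is a caller policy decision.
        return allow_flagged
    return False  # "block" or anything unexpected


def categories_above(moderation: dict, threshold: float) -> list[str]:
    """Return names of categories whose score meets or exceeds the threshold."""
    return [
        c["name"]
        for c in moderation.get("categories", [])
        if c.get("score", 0.0) >= threshold
    ]
```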
3. Decisions (HITL)
GET /api/v1/ai/decisions/:id
Returns the decision and its provenance. The caller must be the creator service, a reviewer in the same facility, or an auditor.
POST /api/v1/ai/decisions/:id/review
| Field | Type | Required |
|---|---|---|
| verdict | accepted \| rejected | Yes |
| comment | string | If rejected |
| edits | string | No; reviewer may edit draft before accept |
Returns the updated decision. Emits ai_gateway.decision.accepted.v1 or ai_gateway.decision.rejected.v1.
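The conditional requirement on `comment` is easy to miss, so a reviewer UI might validate the body before submitting. This is a minimal sketch of the rules in the table above; `validate_review` is a hypothetical helper, and treating a whitespace-only comment as missing is an assumption.

```python
def validate_review(body: dict) -> list[str]:
    """Return violations of the review body rules; empty list means valid."""
    errors = []

    verdict = body.get("verdict")
    if verdict not in ("accepted", "rejected"):
        errors.append("verdict must be 'accepted' or 'rejected'")

    comment = body.get("comment")
    # comment is only mandatory when the draft is rejected.
    if verdict == "rejected" and not (isinstance(comment, str) and comment.strip()):
        errors.append("comment is required when verdict is 'rejected'")

    edits = body.get("edits")
    if edits is not None and not isinstance(edits, str):
        errors.append("edits must be a string")

    return errors
```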
GET /api/v1/ai/decisions
List decisions, filtered.
| Param | Type | Default |
|---|---|---|
| state | enum | - |
| featureKey | string | - |
| facilityId | string | reviewer's facility |
| from / to | ISO8601 | last 7 days |
| limit | int | 50 (max 500) |
| cursor | opaque | - |
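The query parameters above can be assembled into a query string as follows. This is a sketch under the assumption that omitted parameters fall back to the server-side defaults in the table; `decisions_query` is a hypothetical helper, and the lower bound of 1 on `limit` is an assumption.

```python
from urllib.parse import urlencode


def decisions_query(
    *,
    state=None,
    feature_key=None,
    facility_id=None,
    from_=None,
    to=None,
    limit=50,
    cursor=None,
) -> str:
    """Build the query string for GET /api/v1/ai/decisions."""
    if not 1 <= limit <= 500:
        raise ValueError("limit must be between 1 and 500")
    params = {
        "state": state,
        "featureKey": feature_key,
        "facilityId": facility_id,
        "from": from_,
        "to": to,
        "limit": limit,
        "cursor": cursor,
    }
    # Drop unset parameters so the server applies its own defaults.
    return urlencode({k: v for k, v in params.items() if v is not None})
```

For example, `decisions_query(state="under_review", limit=100)` yields `state=under_review&limit=100`.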
POST /api/v1/ai/decisions/:id/accept
Called by the owning service to signal "draft has been used". Body: { acceptedBy: userId, targetResource: {type,id} }. Idempotent. Emits ai_gateway.decision.accepted.v1.
4. Provenance
GET /api/v1/ai/provenance/:id
Returns immutable provenance. Used by audit-service and clinical services when rendering "AI-assisted" badges.
5. Admin
GET /api/v1/ai/admin/routing-rules
List active routing rules for the calling tenant. Scope: platform_admin or tenant_admin.
PUT /api/v1/ai/admin/routing-rules/:featureKey
Update routing rule. Body: { providers: [{provider, model, residency, priority}], fallback: [...] }.
GET /api/v1/ai/admin/prompt-templates
POST /api/v1/ai/admin/prompt-templates
Register a prompt template (scope: platform_admin). Body: { key, version, template, guardrails, featureKeys[] }. Decisions store only the template hash.
GET /api/v1/ai/admin/quotas
Returns { featureKey, windowSec, limit, used }.
GET /api/v1/ai/admin/providers/health
Health per provider: latency p95, error rate, circuit state.
6. Error codes
| Code | HTTP | Meaning |
|---|---|---|
| AI_POLICY_DENY | 403 | Access policy denied pre-inference |
| AI_CONSENT_REQUIRED | 403 | PHI feature without consent on file |
| AI_MODULE_NOT_LICENSED | 403 | Tenant not entitled |
| AI_QUOTA_EXCEEDED | 429 | Rolling window exceeded |
| AI_MODERATION_BLOCKED | 422 | Pre-moderation refused |
| AI_PROVIDER_UNAVAILABLE | 503 | All providers failed / circuit open |
| AI_DECISION_NOT_FOUND | 404 | - |
| AI_DECISION_STATE_INVALID | 409 | Review on non-under_review decision |
| CROSS_TENANT | 403 | Payload references resources in a foreign tenant |
| AI_PROMPT_TEMPLATE_NOT_FOUND | 404 | Referenced template missing |
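Clients may find it convenient to hold the table above as a lookup and to classify which codes are worth retrying. The mapping below mirrors the table; the retry classification (retry only on 429 and 503) is an assumption on the client's part and is not stated in this contract.

```python
# Error code -> HTTP status, copied from the error codes table.
ERROR_HTTP_STATUS = {
    "AI_POLICY_DENY": 403,
    "AI_CONSENT_REQUIRED": 403,
    "AI_MODULE_NOT_LICENSED": 403,
    "AI_QUOTA_EXCEEDED": 429,
    "AI_MODERATION_BLOCKED": 422,
    "AI_PROVIDER_UNAVAILABLE": 503,
    "AI_DECISION_NOT_FOUND": 404,
    "AI_DECISION_STATE_INVALID": 409,
    "CROSS_TENANT": 403,
    "AI_PROMPT_TEMPLATE_NOT_FOUND": 404,
}

# Assumption: only throttling and provider outages are transient.
RETRYABLE_STATUSES = {429, 503}


def is_retryable(code: str) -> bool:
    """Return True if the error code maps to a status a client might retry."""
    return ERROR_HTTP_STATUS.get(code) in RETRYABLE_STATUSES
```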
7. Pagination
Cursor-based: { data: [], nextCursor: "opaque", total?: number }. Maximum page size: 500.
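Walking a cursor-paginated listing can be sketched as a generator. The `fetch_page` callable is hypothetical and stands in for whatever HTTP client the caller uses; the sketch also assumes that a missing or null `nextCursor` marks the last page, which this contract does not state explicitly.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages, following nextCursor until it runs out."""
    cursor = None
    while True:
        page = fetch_page(cursor)  # the first call passes no cursor
        yield from page.get("data", [])
        cursor = page.get("nextCursor")
        if not cursor:
            break  # assumption: absent/null cursor means no more pages
```

The cursor is treated as opaque: it is passed back unchanged and never parsed.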
8. FHIR mapping
AI outputs are not FHIR resources directly. When an accepted draft is finalised by an owning service, that service writes a Provenance resource linking the clinical resource (target) to this gateway's AIProvenance via agent.who.reference = Device/ai-gateway and entity.what.reference = DocumentReference/prompt-template-{hash}. See interop-service/API_CONTRACTS.md for the full mapping.
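To make the mapping concrete, the sketch below builds a FHIR Provenance resource as a plain dict with the two references described above. The `recorded` field and the `entity.role` element are required by FHIR R4; the role value `source` is an assumption here, as are the function name and parameters. The authoritative mapping remains the one in interop-service/API_CONTRACTS.md.

```python
def build_provenance(target_ref: str, template_hash: str, recorded: str) -> dict:
    """Build a Provenance resource linking a clinical resource to the gateway."""
    return {
        "resourceType": "Provenance",
        # The clinical resource that the accepted draft ended up in.
        "target": [{"reference": target_ref}],
        # FHIR instant at which the owning service recorded the link.
        "recorded": recorded,
        "agent": [{"who": {"reference": "Device/ai-gateway"}}],
        "entity": [
            {
                "role": "source",  # assumption: role is not specified in this contract
                "what": {
                    "reference": f"DocumentReference/prompt-template-{template_hash}"
                },
            }
        ],
    }
```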
9. Rate limiting
Kong enforces per-tenant limits (60 req/min by default, overridable per tenant). The in-service quota enforces per-feature, per-window budgets (FR-AIGW-QUOTA-001).
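For intuition about a per-window budget, a rolling window counter can be sketched as below. This is an illustrative, in-memory, single-process example and not the service's actual quota implementation, which is defined by FR-AIGW-QUOTA-001; the class and method names are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
from collections import deque


class RollingWindowQuota:
    """Allow at most `limit` requests within any trailing `window_sec` seconds."""

    def __init__(self, limit: int, window_sec: float):
        self.limit = limit
        self.window_sec = window_sec
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def try_consume(self, now: float) -> bool:
        """Record a request at time `now` if budget remains; return whether it was allowed."""
        cutoff = now - self.window_sec
        # Drop timestamps that have aged out of the window.
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()
        if len(self._hits) >= self.limit:
            return False  # a caller would surface AI_QUOTA_EXCEEDED (429)
        self._hits.append(now)
        return True
```

A real deployment would keep one such budget per tenant and feature key, most likely in a shared store rather than process memory.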