AI Gateway Service — User Stories

Service: ai-gateway-service
Story prefix: AIGW-US
Last updated: 2026-04-18

Stories

AIGW-US-001 — Submit AI assist request via gateway

Field	Value
Issue type	Story
Summary	POST /v1/ai/assist routes request to provider and returns draft with provenance
Epic link	AIGW-EPIC-01
Status	To Do
Priority	Must
Story points	8
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service
FR references	FR-AIGW-001
Legacy FR refs	FR-AI-002
Dependencies	AIGW-US-002, AIGW-US-003

User story: As a clinical service (patient-chart, medication, virtual-care), when I need AI-assisted content generation, I want to POST to /v1/ai/assist with a feature key and payload so that I receive a draft text plus a decisionId and provenanceId without ever touching a raw model API key.

Acceptance criteria (Gherkin):

Given a valid JWT with scope ai.assist and an entitled feature key, when I POST to /v1/ai/assist, then I receive 200 { draftText, decisionId, provenanceId, isDraft: true }.
Given my JWT is invalid or expired, when I POST to /v1/ai/assist, then I receive 401 UNAUTHORIZED.
Given the feature key is not entitled for my tenant, when I POST, then I receive 403 AI_POLICY_DENY and ai_gateway.assist.failed.v1 is emitted.
Given the assist completes, when I query GET /v1/ai/decisions/:id, then I receive the decision with state=draft and a linked AIProvenance row.

Technical notes:

AIDecision and AIProvenance written in a single DB transaction with outbox row.
Tenant extracted from JWT tenantId claim; never from caller body.
Raw prompt text must not appear in NATS event payload or default logs.
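
The tenant and payload rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's handler: JwtClaims, AssistBody, resolveTenantId, toRequestedEvent, the dec_ identifier, and the note.summary feature key are hypothetical names; only the tenantId claim, the ai.assist scope, and the event name come from the story.

```typescript
// Hypothetical shapes for illustration only.
interface JwtClaims {
  tenantId: string;
  scope: string[];
}

interface AssistBody {
  featureKey: string;
  payload: { promptText: string };
  tenantId?: string; // a caller may send this, but it is never trusted
}

// The tenant always comes from the verified token, never from the request body.
function resolveTenantId(claims: JwtClaims, _body: AssistBody): string {
  return claims.tenantId;
}

// The event payload carries identifiers only; the raw prompt text is left out.
function toRequestedEvent(claims: JwtClaims, body: AssistBody, decisionId: string) {
  return {
    type: "ai_gateway.assist.requested.v1",
    tenantId: resolveTenantId(claims, body),
    featureKey: body.featureKey,
    decisionId,
  };
}

const demoClaims: JwtClaims = { tenantId: "tenant-a", scope: ["ai.assist"] };
const demoBody: AssistBody = {
  featureKey: "note.summary",
  payload: { promptText: "example prompt" },
  tenantId: "tenant-b", // spoofed value, ignored
};
const requestedEvent = toRequestedEvent(demoClaims, demoBody, "dec_123");
console.log(requestedEvent.tenantId); // "tenant-a"
```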

Definition of Done:

Unit + integration tests added; coverage ≥ 80 %.
OpenAPI contract updated; Pact consumer tests green.
ai_gateway.assist.requested.v1 and ai_gateway.assist.completed.v1 schemas registered.
Telemetry spans and ai_gateway_assist_duration_ms histogram added.
MIGRATION_PLAN Phase 1 references this story.

AIGW-US-002 — Pre-assist policy evaluation (fail-closed)

Field	Value
Issue type	Story
Summary	Policy check evaluates entitlement, flag, consent before inference
Epic link	AIGW-EPIC-01
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service, identity-service
FR references	FR-AIGW-002
Legacy FR refs	FR-AI-003
Dependencies	cross-service: IDENT-US-001

User story: As the platform, when any AI assist is requested, I want the gateway to evaluate access policy, module entitlement, feature flag, and PHI consent before making any provider call so that AI features are gated by the same policy controls as every other platform capability.

Acceptance criteria (Gherkin):

Given the policy service returns deny, when assist is requested, then 403 AI_POLICY_DENY is returned and no provider call is made.
Given the policy service is unreachable (timeout > 2 s), when assist is requested, then the gateway fails closed: 403 AI_POLICY_DENY with reason=POLICY_TIMEOUT.
Given the feature key requires PHI consent and consent record is absent, when assist is requested, then 403 AI_CONSENT_REQUIRED is returned.
Given the feature flag for the feature key is disabled, when assist is requested, then 403 AI_POLICY_DENY is returned with reason=FEATURE_DISABLED.

Technical notes:

Policy evaluation via PolicyClient port → AccessPolicyHttpAdapter.
Feature flag from config-service; cached with 30 s TTL.
Fail-closed is enforced in RequestAssistUseCase before any quota is consumed.
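
The deny branches in the acceptance criteria can be expressed as one pure decision function. The sketch below is an assumption-laden illustration: PreAssistInput, PreAssistOutcome, and evaluatePreAssist are hypothetical names, and the order in which the checks run is assumed, since the story does not fix it. A policy timeout is modelled as its own input value so that it maps to a deny rather than an allow.

```typescript
type PolicyResult = "allow" | "deny" | "timeout"; // timeout = no answer within 2 s

interface PreAssistInput {
  policy: PolicyResult;
  featureEnabled: boolean;  // feature flag for the feature key
  consentRequired: boolean; // feature key requires PHI consent
  consentPresent: boolean;  // a consent record exists
}

type PreAssistOutcome =
  | { allowed: true }
  | {
      allowed: false;
      status: 403;
      code: "AI_POLICY_DENY" | "AI_CONSENT_REQUIRED";
      reason?: string;
    };

// Every branch that is not an explicit allow ends in a 403, so no provider call follows.
function evaluatePreAssist(input: PreAssistInput): PreAssistOutcome {
  if (input.policy === "timeout") {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY", reason: "POLICY_TIMEOUT" };
  }
  if (input.policy === "deny") {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY" };
  }
  if (!input.featureEnabled) {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY", reason: "FEATURE_DISABLED" };
  }
  if (input.consentRequired && !input.consentPresent) {
    return { allowed: false, status: 403, code: "AI_CONSENT_REQUIRED" };
  }
  return { allowed: true };
}

console.log(
  evaluatePreAssist({ policy: "timeout", featureEnabled: true, consentRequired: false, consentPresent: false }),
);
```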

Definition of Done:

Unit tests for all deny branches; integration test with mock policy service.
Fail-closed behaviour tested with 100 % coverage on timeout path.
Docs updated in APPLICATION_LOGIC.md §6.

AIGW-US-003 — AIProvenance immutable record

Field	Value
Issue type	Story
Summary	Every assist completion writes an immutable AIProvenance row
Epic link	AIGW-EPIC-01
Status	To Do
Priority	Must
Story points	3
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service
FR references	FR-AIGW-003
Legacy FR refs	FR-AI-006
Dependencies	AIGW-US-001

User story: As a compliance officer, when I audit an AI-assisted clinical record, I want to trace exactly which model, version, prompt template, provider, and moderation outcome produced the content so that I can satisfy regulatory inquiries about AI provenance.

Acceptance criteria (Gherkin):

Given an assist completes, when I query GET /v1/ai/provenance/:provenanceId, then I receive all fields: provider, modelVersion, promptTemplateKey, promptTemplateVersion, promptTemplateHash, moderationInput, latencyMs, requestedAt.
Given a provenance record is created, when any process attempts UPDATE or DELETE on the row, then the DB role ai_gateway_app returns a permission denied error.
Given an assist fails mid-stream, when the transaction rolls back, then no orphaned AIProvenance row exists.

Technical notes:

ai_provenance table: REVOKE UPDATE, DELETE ON ai_provenance FROM ai_gateway_app.
AIProvenance.id uses prv_ ULID prefix.
GetProvenanceQuery is callable by audit-service and owning clinical services.
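
As an illustration of the identifier convention, the following sketch validates a provenance ID. It assumes the ID is the literal prefix prv_ followed by a canonical 26-character ULID in Crockford Base32; the story only states that a prv_ ULID prefix is used, so the exact layout and the name isProvenanceId are assumptions.

```typescript
// A ULID is 26 Crockford Base32 characters (no I, L, O, U); the first
// character is at most 7 because the value is limited to 128 bits.
const PROVENANCE_ID_PATTERN = /^prv_[0-7][0-9A-HJKMNP-TV-Z]{25}$/;

function isProvenanceId(value: string): boolean {
  return PROVENANCE_ID_PATTERN.test(value);
}

console.log(isProvenanceId("prv_01ARZ3NDEKTSV4RRFFQ69G5FAV")); // true
console.log(isProvenanceId("dec_01ARZ3NDEKTSV4RRFFQ69G5FAV")); // false
```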

Definition of Done:

DB migration includes REVOKE statement.
Integration test verifies UPDATE attempt fails with permission error.
Schema conformance test for ai_gateway.assist.completed.v1 (carries provenanceId).

AIGW-US-004 — Consumer service cutover from direct provider

Field	Value
Issue type	Story
Summary	patient-chart-service migrated from direct Anthropic to AI gateway
Epic link	AIGW-EPIC-01
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S1
Components	patient-chart-service, ai-gateway-service
FR references	FR-AIGW-001
Legacy FR refs	FR-AI-002
Dependencies	AIGW-US-001, cross-service: CHART-US-012

User story: As a platform engineer, when patient-chart-service needs AI-assist for clinical notes, I want it to call the AI gateway instead of Anthropic directly so that all clinical AI calls are gated by central policy and leave an auditable provenance trail.

Acceptance criteria (Gherkin):

Given patient-chart-service is deployed with AIGatewayHttpAdapter, when a note AI assist is requested, then the gateway is called and the returned provenanceId is stored in NoteAIProvenance.
Given the gateway returns a draft, when the clinician accepts the AI chunk, then patient_chart.note.ai_accepted.v1 includes provenanceId.
Given the direct Anthropic key is revoked for patient-chart-service, when the service starts, then it starts cleanly and does not throw a missing-key error.

Technical notes:

AIGatewayHttpAdapter implements AIGatewayPort in patient-chart-service.
NoteAIProvenance row written by AcceptAIChunkUseCase.
Migration phase: MIGRATION_PLAN Phase 1.
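
The port and adapter arrangement described above might look roughly like this on the patient-chart-service side. It is a sketch, not the service's code: the method name requestAssist, the injected PostJson transport, and the shape returned by toNoteAIProvenance are assumptions; AIGatewayPort, AIGatewayHttpAdapter, and the response fields come from the stories.

```typescript
interface AssistDraft {
  draftText: string | null;
  decisionId: string;
  provenanceId: string;
  isDraft: true;
}

interface AIGatewayPort {
  requestAssist(featureKey: string, payload: unknown): Promise<AssistDraft>;
}

// Hypothetical transport signature, injected so the sketch does not depend
// on any particular HTTP client.
type PostJson = (path: string, body: unknown) => Promise<{ status: number; json: unknown }>;

class AIGatewayHttpAdapter implements AIGatewayPort {
  private readonly post: PostJson;

  constructor(post: PostJson) {
    this.post = post;
  }

  async requestAssist(featureKey: string, payload: unknown): Promise<AssistDraft> {
    const res = await this.post("/v1/ai/assist", { featureKey, payload });
    if (res.status !== 200) {
      throw new Error(`AI gateway returned ${res.status}`);
    }
    return res.json as AssistDraft;
  }
}

// The provenanceId returned by the gateway is what the chart service persists.
function toNoteAIProvenance(noteId: string, draft: AssistDraft) {
  return { noteId, provenanceId: draft.provenanceId, decisionId: draft.decisionId };
}
```

Because the adapter holds no provider credentials, removing the direct provider key from the consumer's secrets does not affect it.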

Definition of Done:

Direct Anthropic SDK dependency removed from patient-chart-service package.json.
Contract test: patient-chart-service consumer Pact against ai-gateway-service provider.
Anthropic key removed from patient-chart-service vault secrets.

AIGW-US-005 — Provider routing rules configuration

Field	Value
Issue type	Story
Summary	Admin configures provider routing rules per tenant and feature key
Epic link	AIGW-EPIC-02
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service, config-service
FR references	FR-AIGW-004
Legacy FR refs	—
Dependencies	AIGW-US-001

User story: As a platform admin, when I need to route specific features to specific AI providers based on tenant data-residency requirements, I want to configure ProviderRoutingRule entries so that Afghanistan-residency tenants use the on-prem vLLM while other tenants use Anthropic.

Acceptance criteria (Gherkin):

Given a routing rule with residency=AF mapped to onprem_vllm, when a tenant with residency=AF requests assist, then the vLLM adapter is used.
Given no tenant-specific rule exists, when a request arrives, then the global routing rule is applied.
Given I POST a new routing rule, when the active rule is superseded, then the previous rule transitions to deprecated and the new one to active atomically.

Technical notes:

POST /v1/admin/routing-rules — platform_admin scope.
Tenant-specific override: tenant_admin scope with tenant constraint.
Rules loaded via ConfigClient; hot-reload on version bump.
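
Rule resolution and supersession can be sketched as two pure functions. The field layout of ProviderRoutingRule, the precedence order (tenant-specific, then residency-specific, then global), and the function names are assumptions made for illustration; the story only states that a global rule applies when no tenant-specific rule exists and that supersession is atomic. In the service the supersede step would be a single DB transaction rather than an array copy.

```typescript
interface ProviderRoutingRule {
  id: string;
  tenantId: string | null;  // null = not tenant-specific
  residency: string | null; // null = any residency
  featureKey: string;
  provider: string;
  status: "active" | "deprecated";
}

function resolveRoutingRule(
  rules: ProviderRoutingRule[],
  tenant: { tenantId: string; residency: string },
  featureKey: string,
): ProviderRoutingRule | undefined {
  const active = rules.filter((r) => r.status === "active" && r.featureKey === featureKey);
  return (
    active.find((r) => r.tenantId === tenant.tenantId) ??
    active.find((r) => r.tenantId === null && r.residency === tenant.residency) ??
    active.find((r) => r.tenantId === null && r.residency === null)
  );
}

// Returns a new list in which the previously active rule for the same scope
// is deprecated and the new rule is active, both in one step.
function supersedeRule(rules: ProviderRoutingRule[], next: ProviderRoutingRule): ProviderRoutingRule[] {
  const sameScope = (r: ProviderRoutingRule) =>
    r.tenantId === next.tenantId && r.residency === next.residency && r.featureKey === next.featureKey;
  const updated = rules.map((r) =>
    r.status === "active" && sameScope(r) ? { ...r, status: "deprecated" as const } : r,
  );
  return [...updated, { ...next, status: "active" as const }];
}
```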

Definition of Done:

Integration test covers AF residency routing.
Old rule deprecated atomically (single DB transaction).
Schema conformance test for routing rule entity.

AIGW-US-006 — Circuit breaker per provider

Field	Value
Issue type	Story
Summary	Circuit breaker opens on consecutive provider failures and routes to fallback
Epic link	AIGW-EPIC-02
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service
FR references	FR-AIGW-005
Legacy FR refs	FR-NFR-015
Dependencies	AIGW-US-005

User story: As the platform, when a primary AI provider experiences an outage, I want circuit breakers to automatically route requests to the fallback provider so that clinical AI features remain available without manual intervention.

Acceptance criteria (Gherkin):

Given the primary provider fails 5 consecutive times within 60 s, when the next assist request arrives, then the circuit opens and the fallback provider is used.
Given the circuit transitions from closed to open, when further requests arrive while it stays open, then ai_gateway.provider.degraded.v1 has been emitted exactly once for that transition.
Given all providers in the routing chain are unavailable, when a request arrives, then 503 AI_PROVIDER_UNAVAILABLE is returned.

Technical notes:

Per-provider bulkhead with in-flight cap.
Circuit state stored in Redis per (tenantId, featureKey, providerId).
Threshold configurable via config-service.
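
A minimal in-memory sketch of the opening rule (5 consecutive failures within 60 s) follows. It is not the Redis-backed implementation the notes describe: ProviderCircuit and its methods are hypothetical names, the clock is passed in explicitly to keep the example deterministic, and recovery (for example half-open probing) is left out because the story does not describe it.

```typescript
class ProviderCircuit {
  private failureTimes: number[] = []; // timestamps (ms) of consecutive failures
  private opened = false;
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold = 5, windowMs = 60000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Returns true only on the closed-to-open transition, which is the moment
  // the degraded event would be emitted.
  recordFailure(nowMs: number): boolean {
    this.failureTimes = this.failureTimes.filter((t) => nowMs - t < this.windowMs);
    this.failureTimes.push(nowMs);
    if (!this.opened && this.failureTimes.length >= this.threshold) {
      this.opened = true;
      return true;
    }
    return false;
  }

  // A success breaks the run of consecutive failures.
  recordSuccess(): void {
    this.failureTimes = [];
  }

  isOpen(): boolean {
    return this.opened;
  }
}

const demoCircuit = new ProviderCircuit();
for (let i = 0; i < 5; i++) {
  demoCircuit.recordFailure(i * 1000);
}
console.log(demoCircuit.isOpen()); // true
```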

Definition of Done:

Chaos test: kill primary provider container; verify fallback within 500 ms.
ai_gateway.provider.degraded.v1 schema conformance test.

AIGW-US-007 — On-prem vLLM adapter for offline clinics

Field	Value
Issue type	Story
Summary	vLLM adapter routes AF-residency assist to on-prem model
Epic link	AIGW-EPIC-02
Status	To Do
Priority	Should
Story points	8
Labels	service:ai-gateway-service, type:backend, slice:S3
Components	ai-gateway-service
FR references	FR-AIGW-004
Legacy FR refs	FR-NFR-017
Dependencies	AIGW-US-005

User story: As an Afghan MoPH data-residency officer, when AI assist is used in a clinic with residency=AF, I want all inference to happen on the on-prem vLLM deployment so that patient data never leaves Afghanistan.

Acceptance criteria (Gherkin):

Given an AF-residency routing rule, when assist is requested, then ProviderAttempt.provider=onprem_vllm.
Given the vLLM server is unreachable, when assist is requested, then 503 is returned (no fallback to cloud provider for AF-residency features).
Given the vLLM model returns a response, when provenance is written, then residency=AF is stored in ai_provenance.residency.

Technical notes:

VLLMAdapter implements ModelProvider port with OpenAI-compatible API.
AF-residency fallback to cloud is explicitly blocked by routing rule config.
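
The no-cloud-fallback rule can be illustrated with a small provider selection function. The RoutingChain shape, the blockCloud flag, the location tag on each provider, and the reuse of the AI_PROVIDER_UNAVAILABLE code for the AF case are assumptions for this sketch; the story only says that 503 is returned and that cloud fallback is blocked by routing rule config.

```typescript
interface ChainProvider {
  id: string;
  location: "onprem" | "cloud";
}

interface RoutingChain {
  providers: ChainProvider[]; // ordered: primary first, then fallbacks
  blockCloud: boolean;        // true for residency-locked rules such as AF
}

type ProviderSelection =
  | { ok: true; provider: string }
  | { ok: false; status: 503; code: "AI_PROVIDER_UNAVAILABLE" };

function selectProvider(chain: RoutingChain, isAvailable: (id: string) => boolean): ProviderSelection {
  // Residency-locked chains never consider cloud providers, even as fallback.
  const candidates = chain.blockCloud
    ? chain.providers.filter((p) => p.location === "onprem")
    : chain.providers;
  const chosen = candidates.find((p) => isAvailable(p.id));
  return chosen === undefined
    ? { ok: false, status: 503, code: "AI_PROVIDER_UNAVAILABLE" }
    : { ok: true, provider: chosen.id };
}

const afChain: RoutingChain = {
  providers: [
    { id: "onprem_vllm", location: "onprem" },
    { id: "anthropic", location: "cloud" },
  ],
  blockCloud: true,
};
// vLLM is down and only the cloud provider is reachable: the result is still a 503.
console.log(selectProvider(afChain, (id) => id === "anthropic"));
```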

Definition of Done:

Integration test with vLLM mock server.
residency=AF verified in provenance row.
DPIA pre-condition check: story blocked until AF DPIA signed.

AIGW-US-008 — HITL queue for clinical-decision features

Field	Value
Issue type	Story
Summary	HITLPolicy=required queues decision for reviewer before clinical use
Epic link	AIGW-EPIC-03
Status	To Do
Priority	Must
Story points	8
Labels	service:ai-gateway-service, type:backend, slice:S1
Components	ai-gateway-service, communication-service
FR references	FR-AIGW-006
Legacy FR refs	FR-AI-004
Dependencies	AIGW-US-001, cross-service: COMMS-US-001

User story: As a clinical informatics lead, when AI assist produces a draft for a clinical-decision feature, I want the draft to be held in a HITL review queue so that no AI output reaches a patient chart without explicit clinician approval.

Acceptance criteria (Gherkin):

Given a feature key with HITLPolicy=required, when assist completes, then AIDecision.state=under_review and ai_gateway.decision.hitl_queued.v1 is emitted.
Given a decision is under_review, when the configured auto-reject timeout elapses, then AIDecision.state=rejected and ai_gateway.decision.rejected.v1 is emitted.
Given a feature key with HITLPolicy=none, when assist completes, then AIDecision.state transitions directly to accepted and no queue notification is sent.

Technical notes:

Reviewer notification via ReviewerNotifier port → CommunicationServiceAdapter.
Auto-reject timeout configured per feature key in config-service.
ai_gateway.decision.hitl_queued.v1 carries featureKey, decisionId, and assignedFacilityId (for routing to the reviewer).
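
The two state rules in the acceptance criteria (initial state by HITL policy, then auto-reject on timeout) can be sketched as pure functions. The function names and the millisecond clock parameters are assumptions; the state names and policy values come from the stories.

```typescript
type HitlPolicy = "required" | "none";
type DecisionState = "draft" | "under_review" | "accepted" | "rejected";

// State assigned when an assist completes, depending on the feature key's policy.
function stateAfterAssist(policy: HitlPolicy): DecisionState {
  return policy === "required" ? "under_review" : "accepted";
}

// A decision still under review once the configured timeout has elapsed is rejected.
// Any other state is returned unchanged.
function applyAutoReject(
  state: DecisionState,
  queuedAtMs: number,
  timeoutMs: number,
  nowMs: number,
): DecisionState {
  return state === "under_review" && nowMs - queuedAtMs >= timeoutMs ? "rejected" : state;
}

console.log(stateAfterAssist("required"));                     // "under_review"
console.log(applyAutoReject("under_review", 0, 60000, 60000)); // "rejected"
```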

Definition of Done:

E2E test: note AI assist → HITL queued → reviewer accepts → ai_gateway.decision.accepted.v1 received by patient-chart-service.
Auto-reject timer integration test.
Schema conformance: ai_gateway.decision.hitl_queued.v1.

AIGW-US-009 — Reviewer submit accept/reject verdict

Field	Value
Issue type	Story
Summary	Reviewer accepts or rejects queued AI decision via API
Epic link	AIGW-EPIC-03
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:api, slice:S1
Components	ai-gateway-service
FR references	FR-AIGW-007
Legacy FR refs	FR-AI-004
Dependencies	AIGW-US-008

User story: As a clinical reviewer, when I access the HITL review queue, I want to accept or reject a queued AI decision so that clinical AI content is either approved for chart use or discarded, with my action recorded in the audit trail.

Acceptance criteria (Gherkin):

Given a decision in under_review state, when I POST /v1/ai/decisions/:id/review with verdict=accepted, then state=accepted, ai_gateway.decision.accepted.v1 is emitted with provenanceId.
Given a decision in under_review, when I POST with verdict=rejected, then state=rejected, ai_gateway.decision.rejected.v1 is emitted.
Given I do not have the reviewer role for the decision's facility, when I attempt review, then 403 FORBIDDEN is returned.
Given an accepted decision, when review is submitted again, then 409 CONFLICT is returned (state machine terminal).

Technical notes:

Optimistic locking on ai_decision.version.
DecisionReviewEvent written in same transaction as state change.
Reviewer scope: ai.review Keycloak scope, facility-scoped.
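
The review rules can be summarised in one function that checks facility scope, state, and version before applying a verdict. This is a sketch under assumptions: ReviewableDecision, submitReview, and the expectedVersion parameter are hypothetical, and mapping an optimistic-lock mismatch to 409 CONFLICT is an assumption (the story specifies 409 only for re-review of a terminal state).

```typescript
interface ReviewableDecision {
  id: string;
  facilityId: string;
  state: "under_review" | "accepted" | "rejected";
  version: number;
}

type ReviewVerdict = "accepted" | "rejected";

type ReviewOutcome =
  | { status: 200; decision: ReviewableDecision; eventType: string }
  | { status: 403; code: "FORBIDDEN" }
  | { status: 409; code: "CONFLICT" };

function submitReview(
  current: ReviewableDecision,
  verdict: ReviewVerdict,
  reviewerFacilityIds: string[],
  expectedVersion: number,
): ReviewOutcome {
  if (reviewerFacilityIds.indexOf(current.facilityId) === -1) {
    return { status: 403, code: "FORBIDDEN" };
  }
  // Terminal states and stale versions are both rejected as conflicts.
  if (current.state !== "under_review" || current.version !== expectedVersion) {
    return { status: 409, code: "CONFLICT" };
  }
  return {
    status: 200,
    decision: { ...current, state: verdict, version: current.version + 1 },
    eventType: `ai_gateway.decision.${verdict}.v1`,
  };
}
```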

Definition of Done:

Integration test for accept, reject, and unauthorized reviewer.
Optimistic lock conflict test.
Audit entry written by audit-service on ai_gateway.decision.accepted.v1.

AIGW-US-010 — List HITL review queue for assigned reviewer

Field	Value
Issue type	Story
Summary	Reviewer sees paginated list of decisions awaiting review
Epic link	AIGW-EPIC-03
Status	To Do
Priority	Must
Story points	3
Labels	service:ai-gateway-service, type:api, slice:S1
Components	ai-gateway-service
FR references	FR-AIGW-008
Legacy FR refs	—
Dependencies	AIGW-US-008

User story: As a clinical reviewer, when I open the HITL review interface, I want to see a paginated list of AI decisions pending my review filtered by my assigned facility and feature keys so that I can efficiently process the queue.

Acceptance criteria (Gherkin):

Given I have the ai.review scope, when I call GET /v1/ai/decisions/review-queue, then I receive decisions in under_review for my assigned facilities, paginated, oldest first.
Given no decisions await review, when I call the endpoint, then I receive { data: [], total: 0 }.
Given a super admin, when they call the endpoint with facilityId filter, then they see decisions for that facility.

Technical notes:

Cursor pagination on created_at.
RLS-scoped by tenant_id; facility filter applied in query.
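
Cursor pagination over created_at, oldest first, can be sketched against an in-memory list. The QueueItem and QueuePage shapes, the nextCursor field, and the use of a numeric timestamp as the cursor are assumptions; the sketch also assumes created_at values are unique, which a real query would usually guarantee by adding a tie-breaker column.

```typescript
interface QueueItem {
  decisionId: string;
  facilityId: string;
  createdAt: number; // epoch milliseconds
}

interface QueuePage {
  data: QueueItem[];
  total: number;
  nextCursor: number | null;
}

function reviewQueuePage(
  items: QueueItem[],
  facilityIds: string[],
  limit: number,
  afterCreatedAt: number | null,
): QueuePage {
  // filter() copies the array, so sorting does not mutate the caller's list.
  const visible = items
    .filter((i) => facilityIds.indexOf(i.facilityId) !== -1)
    .sort((a, b) => a.createdAt - b.createdAt);
  const remaining =
    afterCreatedAt === null ? visible : visible.filter((i) => i.createdAt > afterCreatedAt);
  const data = remaining.slice(0, limit);
  const last = data[data.length - 1];
  const nextCursor = remaining.length > limit && last !== undefined ? last.createdAt : null;
  return { data, total: visible.length, nextCursor };
}
```

A client would pass the returned nextCursor as afterCreatedAt on the following call and stop when it is null.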

Definition of Done:

API spec updated in API_CONTRACTS.md.
Pact consumer test from reviewer UI.

AIGW-US-011 — Pre-inference input moderation

Field	Value
Issue type	Story
Summary	Moderation classifier screens prompt input before provider call
Epic link	AIGW-EPIC-04
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S1
Components	ai-gateway-service, moderation-classifier
FR references	FR-AIGW-009
Legacy FR refs	FR-AI-005
Dependencies	AIGW-US-001

User story: As the platform security layer, when an AI assist input is received, I want it screened by the moderation classifier before the provider call so that malicious prompts and injection attacks are blocked before they reach a model.

Acceptance criteria (Gherkin):

Given an input classified as block by the moderation classifier, when assist is requested, then 422 AI_MODERATION_BLOCKED is returned and ai_gateway.moderation.flagged.v1 is emitted.
Given an input classified as flag, when assist proceeds, then ModerationFinding row is written with verdict=flag and the flag is visible in ai_provenance.moderationInput.
Given the moderation service is unavailable, when assist is requested for a PHI-touching feature, then fail-closed: 503 AI_MODERATION_UNAVAILABLE.

Technical notes:

ModerationClient port → LocalClassifierAdapter (primary) + ProviderModerationAdapter (fallback).
PHI minimisation step precedes moderation call when feature config enables it.
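
The primary/fallback classifier arrangement and the resulting gate for PHI-touching features can be sketched as below. The Classifier signature (returning null when an adapter is unavailable), the synchronous style, and the function names are assumptions; behaviour for non-PHI features when moderation is unavailable is not specified in the story and is not modelled.

```typescript
type ClassifierVerdict = "allow" | "flag" | "block";
type Classifier = (text: string) => ClassifierVerdict | null; // null = adapter unavailable

// Try the primary classifier first, then the fallback.
function classifyWithFallback(
  text: string,
  primary: Classifier,
  fallback: Classifier,
): ClassifierVerdict | "unavailable" {
  return primary(text) ?? fallback(text) ?? "unavailable";
}

type InputGate =
  | { proceed: true; moderationInput: "allow" | "flag" }
  | { proceed: false; status: 422; code: "AI_MODERATION_BLOCKED" }
  | { proceed: false; status: 503; code: "AI_MODERATION_UNAVAILABLE" };

// Gate for PHI-touching features: block stops the request, unavailable fails closed,
// and flag proceeds with the verdict recorded for provenance.
function gatePhiInput(verdict: ClassifierVerdict | "unavailable"): InputGate {
  if (verdict === "block") {
    return { proceed: false, status: 422, code: "AI_MODERATION_BLOCKED" };
  }
  if (verdict === "unavailable") {
    return { proceed: false, status: 503, code: "AI_MODERATION_UNAVAILABLE" };
  }
  return { proceed: true, moderationInput: verdict };
}
```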

Definition of Done:

Adversarial test suite (20 injection prompts) all blocked.
False-positive rate measured < 1 % on clinical text test set.
ai_gateway.moderation.flagged.v1 schema conformance test.

AIGW-US-012 — Post-inference output moderation

Field	Value
Issue type	Story
Summary	Moderation classifier screens model output before returning to caller
Epic link	AIGW-EPIC-04
Status	To Do
Priority	Must
Story points	3
Labels	service:ai-gateway-service, type:backend, slice:S1
Components	ai-gateway-service
FR references	FR-AIGW-010
Legacy FR refs	FR-AI-005
Dependencies	AIGW-US-011

User story: As the platform safety layer, when a model returns output, I want it screened for unsafe content before it is returned to the caller so that harmful AI-generated text cannot reach a clinical record.

Acceptance criteria (Gherkin):

Given model output classified as block, when the caller receives the response, then it is 200 { draftText: null, reason: "OUTPUT_BLOCKED", decisionId } and ai_gateway.moderation.flagged.v1 is emitted.
Given model output classified as allow, when caller receives response, then draftText contains the output and ai_provenance.moderationOutput=allow.
Given output moderation fails, when this occurs for a PHI feature, then draftText=null with reason=MODERATION_ERROR; never return unscreened output.

Technical notes:

Post-moderation runs after ai_provenance is written (moderation outcome back-patched).
ModerationFinding row created with stage=output.
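
The response shaping in the acceptance criteria can be sketched as a single function for PHI features. OutputVerdict, ScreenedResponse, and screenOutput are hypothetical names; the key property shown is that model output is returned only on an explicit allow verdict.

```typescript
type OutputVerdict = "allow" | "block" | "error"; // error = output moderation itself failed

interface ScreenedResponse {
  draftText: string | null;
  decisionId: string;
  reason?: "OUTPUT_BLOCKED" | "MODERATION_ERROR";
}

function screenOutput(
  modelOutput: string,
  verdict: OutputVerdict,
  decisionId: string,
): ScreenedResponse {
  if (verdict === "allow") {
    return { draftText: modelOutput, decisionId };
  }
  // Anything other than allow suppresses the draft, so unscreened text is never returned.
  return {
    draftText: null,
    decisionId,
    reason: verdict === "block" ? "OUTPUT_BLOCKED" : "MODERATION_ERROR",
  };
}

console.log(screenOutput("draft note", "error", "dec_123"));
```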

Definition of Done:

Output moderation tested with known unsafe content samples.
ai_provenance.moderationOutput verified in integration test.

AIGW-US-013 — Rolling quota enforcement per tenant

Field	Value
Issue type	Story
Summary	Per-tenant rolling window quota enforced via Redis
Epic link	AIGW-EPIC-05
Status	To Do
Priority	Should
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S1
Components	ai-gateway-service
FR references	FR-AIGW-011
Legacy FR refs	FR-AI-006
Dependencies	AIGW-US-001

User story: As a platform SRE, when a tenant exceeds their configured AI assist quota for a rolling window, I want requests to be rejected with a clear 429 response so that runaway AI usage does not cause unexpected provider costs.

Acceptance criteria (Gherkin):

Given a tenant has used 100 % of their quota window, when a new assist request arrives, then 429 AI_QUOTA_EXCEEDED with Retry-After header is returned.
Given a new quota window starts, when a request arrives, then the counter resets and the request is processed.
Given a provider error occurs after quota consume, when compensating decrement runs, then the quota counter is restored (best-effort).

Technical notes:

QuotaStore port → RedisQuotaAdapter with INCR + TTL.
Quota is per (tenantId, featureKey) bucket.
ai_gateway.quota.exceeded.v1 emitted on first exceed in a window.
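
The INCR + TTL approach can be illustrated with an in-memory stand-in for Redis. Everything named here (InMemoryQuotaStore, quotaKey, consumeQuota, the key format, the emitExceeded flag) is hypothetical, and the clock is passed in to keep the example deterministic. Note that in this sketch the window starts at the first request and expires as a whole, which is one simple reading of "rolling window"; the story does not specify the exact windowing algorithm.

```typescript
class InMemoryQuotaStore {
  private counters = new Map<string, { count: number; expiresAtMs: number }>();

  // Mirrors INCR followed by setting a TTL when the key is first created.
  incr(key: string, windowSec: number, nowMs: number): { count: number; expiresAtMs: number } {
    const existing = this.counters.get(key);
    if (existing === undefined || existing.expiresAtMs <= nowMs) {
      const fresh = { count: 1, expiresAtMs: nowMs + windowSec * 1000 };
      this.counters.set(key, fresh);
      return fresh;
    }
    existing.count += 1;
    return existing;
  }

  // Best-effort compensating decrement after a provider error.
  decr(key: string, nowMs: number): void {
    const existing = this.counters.get(key);
    if (existing !== undefined && existing.expiresAtMs > nowMs && existing.count > 0) {
      existing.count -= 1;
    }
  }
}

type QuotaOutcome =
  | { allowed: true }
  | { allowed: false; status: 429; code: "AI_QUOTA_EXCEEDED"; retryAfterSec: number; emitExceeded: boolean };

function quotaKey(tenantId: string, featureKey: string): string {
  return `quota:${tenantId}:${featureKey}`;
}

function consumeQuota(
  store: InMemoryQuotaStore,
  tenantId: string,
  featureKey: string,
  limit: number,
  windowSec: number,
  nowMs: number,
): QuotaOutcome {
  const { count, expiresAtMs } = store.incr(quotaKey(tenantId, featureKey), windowSec, nowMs);
  if (count <= limit) {
    return { allowed: true };
  }
  return {
    allowed: false,
    status: 429,
    code: "AI_QUOTA_EXCEEDED",
    retryAfterSec: Math.ceil((expiresAtMs - nowMs) / 1000),
    emitExceeded: count === limit + 1, // only the first exceed in a window emits the event
  };
}
```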

Definition of Done:

Load test: concurrent requests beyond limit all get 429.
Compensating decrement test.
Schema conformance: ai_gateway.quota.exceeded.v1.

AIGW-US-014 — Quota status dashboard for tenant admin

Field	Value
Issue type	Story
Summary	Tenant admin views current AI quota utilisation via API
Epic link	AIGW-EPIC-05
Status	To Do
Priority	Should
Story points	2
Labels	service:ai-gateway-service, type:api, slice:S1
Components	ai-gateway-service
FR references	FR-AIGW-012
Legacy FR refs	—
Dependencies	AIGW-US-013

User story: As a tenant admin, when I check AI usage, I want to see current quota utilisation per feature key so that I can plan capacity and avoid unexpected service interruptions.

Acceptance criteria (Gherkin):

Given I am a tenant admin, when I call GET /v1/ai/quota, then I receive { featureKey, windowStart, windowSec, limit, used, remaining } per configured feature.
Given quota is at 80 %, when I view the dashboard, then the spend alert is visible in the UI and a Prometheus metric reflects the utilisation.

Technical notes:

GetTenantQuotaQuery reads from QuotaWindow aggregate.
RLS ensures tenant sees only their own quota.
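
The response shape in the acceptance criteria and the utilisation figure behind the metric can be sketched as follows. The function names, the ISO-string type for windowStart, and the clamping of remaining at zero are assumptions for illustration.

```typescript
interface QuotaStatus {
  featureKey: string;
  windowStart: string; // ISO 8601 timestamp (assumed representation)
  windowSec: number;
  limit: number;
  used: number;
  remaining: number;
}

function toQuotaStatus(
  featureKey: string,
  windowStart: string,
  windowSec: number,
  limit: number,
  used: number,
): QuotaStatus {
  return { featureKey, windowStart, windowSec, limit, used, remaining: Math.max(0, limit - used) };
}

// Percentage of the window's limit already consumed, capped at 100.
function utilisationPct(q: QuotaStatus): number {
  return q.limit === 0 ? 0 : Math.min(100, (q.used * 100) / q.limit);
}

const demoQuota = toQuotaStatus("note.summary", "2026-04-18T00:00:00Z", 3600, 200, 160);
console.log(demoQuota.remaining, utilisationPct(demoQuota)); // 40 80
```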

Definition of Done:

API spec updated; Pact consumer test from tenant admin UI.
ai_gateway_quota_utilisation_pct metric is published.

AIGW-US-015 — Register and version prompt templates

Field	Value
Issue type	Story
Summary	Platform admin registers semver-versioned prompt templates
Epic link	AIGW-EPIC-06
Status	To Do
Priority	Should
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service
FR references	FR-AIGW-013
Legacy FR refs	—
Dependencies	AIGW-US-001

User story: As a platform engineer, when I update a clinical AI prompt template, I want to register it with a new semver version so that the gateway can pin specific features to specific template versions and changes are auditable.

Acceptance criteria (Gherkin):

Given a platform admin POSTs to /v1/admin/prompt-templates, when the template is valid, then a PromptTemplate row is created with status=draft.
Given a template is in draft, when admin publishes it, then status=published and the template hash is logged.
Given a template is referenced by active AIDecision rows, when admin attempts to deprecate it, then 409 TEMPLATE_IN_USE is returned.

Technical notes:

Raw template body stored only in secure registry; DB stores template_hash.
PromptTemplateRef { key, version } pinned in ProviderRoutingRule.
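
The template lifecycle in the acceptance criteria (draft, published, deprecated, with a conflict when the template is still in use) can be sketched as below. PromptTemplateRow, the function names, the 200 status on success, and passing the number of active AIDecision references as a parameter are assumptions; in the service that count would come from a query.

```typescript
type TemplateStatus = "draft" | "published" | "deprecated";

interface PromptTemplateRow {
  key: string;
  version: string;      // semver string, e.g. "1.2.0"
  status: TemplateStatus;
  templateHash: string; // the DB stores the hash, not the template body
}

type DeprecateOutcome =
  | { status: 200; template: PromptTemplateRow }
  | { status: 409; code: "TEMPLATE_IN_USE" };

// Only a draft can be published.
function publishTemplate(t: PromptTemplateRow): PromptTemplateRow {
  if (t.status !== "draft") {
    throw new Error(`cannot publish a ${t.status} template`);
  }
  return { ...t, status: "published" };
}

// Deprecation is refused while active decisions still reference the template.
function deprecateTemplate(t: PromptTemplateRow, activeDecisionCount: number): DeprecateOutcome {
  if (activeDecisionCount > 0) {
    return { status: 409, code: "TEMPLATE_IN_USE" };
  }
  return { status: 200, template: { ...t, status: "deprecated" } };
}
```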

Definition of Done:

Hash verification test (tampered body detected).
Deprecation conflict test.
API spec for admin endpoints.

AIGW-US-016 — Tenant-scoped prompt template override

Field	Value
Issue type	Story
Summary	Tenant admin registers tenant-scoped prompt template override
Epic link	AIGW-EPIC-06
Status	To Do
Priority	Could
Story points	3
Labels	service:ai-gateway-service, type:api, slice:S2
Components	ai-gateway-service
FR references	FR-AIGW-013
Legacy FR refs	—
Dependencies	AIGW-US-015

User story: As a tenant admin (large hospital network), when my clinical informatics team needs a customised prompt for a feature, I want to register a tenant-scoped template override so that my tenant's AI interactions reflect our clinical terminology without impacting other tenants.

Acceptance criteria (Gherkin):

Given a tenant-scoped template for the same key/version exists, when a request from that tenant arrives, then the tenant template takes precedence over the global one.
Given the tenant template is deprecated, when a request arrives, then the global template is used as fallback.

Technical notes:

PromptTemplate.tenantId non-null selects tenant scope; null = global.
Resolution order: tenant-scoped → global.
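
The resolution order can be sketched as a lookup that prefers a published tenant-scoped template and otherwise falls back to the published global one. ScopedTemplate and resolveTemplate are hypothetical names, and treating only published templates as usable is an assumption drawn from the deprecation criterion.

```typescript
interface ScopedTemplate {
  key: string;
  version: string;
  tenantId: string | null; // null = global
  status: "draft" | "published" | "deprecated";
}

function resolveTemplate(
  templates: ScopedTemplate[],
  tenantId: string,
  key: string,
  version: string,
): ScopedTemplate | undefined {
  const usable = templates.filter(
    (t) => t.key === key && t.version === version && t.status === "published",
  );
  // Tenant-scoped first, then global. Other tenants' templates are never considered.
  return usable.find((t) => t.tenantId === tenantId) ?? usable.find((t) => t.tenantId === null);
}
```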

Definition of Done:

Resolution order integration test.
Tenant isolation test (tenant A template not visible to tenant B).

AIGW-US-017 — OpenTelemetry instrumentation for all AI calls

Field	Value
Issue type	Story
Summary	OTEL spans, metrics, and structured logs for every assist and HITL event
Epic link	AIGW-EPIC-07
Status	To Do
Priority	Must
Story points	5
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service, observability
FR references	FR-AIGW-014
Legacy FR refs	FR-NFR-018
Dependencies	AIGW-US-001

User story: As an SRE, when I investigate a clinical AI issue, I want full distributed traces for every assist request — including policy evaluation, moderation, provider call, and HITL transitions — so that I can pinpoint latency and errors without reading raw logs.

Acceptance criteria (Gherkin):

Given an assist request is made, when I query Grafana Tempo, then I see a trace with spans: ai.gateway.policy_check, ai.gateway.quota_consume, ai.gateway.pre_moderate, ai.gateway.provider_call, ai.gateway.post_moderate, ai.gateway.provenance_write.
Given a provider call completes, when I check Prometheus, then the ai_gateway_assist_duration_ms histogram has a new observation labelled with featureKey and provider.
Given raw prompt text, when it appears in a log line, then CI lint catches the violation (PHI-safe log check).

Technical notes:

@ghasi/telemetry initialized before NestFactory in main.ts.
Span attributes: tenantId, featureKey, provider, decisionId — no promptText.
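
One way to keep prompt text out of spans is an allow-list applied before attributes are attached. The sketch below does not use the OpenTelemetry API or @ghasi/telemetry; ALLOWED_SPAN_ATTRIBUTES and safeSpanAttributes are hypothetical names, and only the four attribute keys come from the notes above.

```typescript
const ALLOWED_SPAN_ATTRIBUTES = ["tenantId", "featureKey", "provider", "decisionId"] as const;
type SpanAttributeKey = (typeof ALLOWED_SPAN_ATTRIBUTES)[number];

// Copies only allow-listed string attributes; anything else, including
// promptText, is dropped.
function safeSpanAttributes(
  raw: Record<string, unknown>,
): Partial<Record<SpanAttributeKey, string>> {
  const out: Partial<Record<SpanAttributeKey, string>> = {};
  for (const key of ALLOWED_SPAN_ATTRIBUTES) {
    const value = raw[key];
    if (typeof value === "string") {
      out[key] = value;
    }
  }
  return out;
}

console.log(
  safeSpanAttributes({ tenantId: "tenant-a", featureKey: "note.summary", promptText: "secret" }),
);
```

An allow-list is used rather than a deny-list so that a newly added sensitive field is excluded by default.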

Definition of Done:

Traces visible in Tempo staging.
ai_gateway_assist_duration_ms histogram is visible in Prometheus.
PHI-safe log test in CI.

AIGW-US-018 — Audit event emission for all AI state transitions

Field	Value
Issue type	Story
Summary	All ai_gateway.* domain events reach audit-service for tamper-evident trail
Epic link	AIGW-EPIC-07
Status	To Do
Priority	Must
Story points	3
Labels	service:ai-gateway-service, type:backend, slice:S0
Components	ai-gateway-service, audit-service
FR references	FR-AIGW-015
Legacy FR refs	FR-NFR-018
Dependencies	AIGW-US-001, cross-service: AUDIT-US-001

User story: As a compliance officer, when I audit AI-assisted clinical activity, I want every AI decision state transition to appear in the tamper-evident audit log so that I can produce a complete accounting of AI usage for any regulatory inquiry.

Acceptance criteria (Gherkin):

Given an assist completes, when I query audit-service, then AuditEntry exists with eventType=AI_ASSIST_COMPLETED, resourceId=decisionId, provenanceId in metadata.
Given a HITL decision is accepted, when I query audit-service, then AuditEntry with eventType=AI_DECISION_ACCEPTED exists within 5 s of the event.
Given ai_gateway.* events are published to NATS, when the audit-service consumer processes them, then no raw promptText field appears in the metadata column.

Technical notes:

Outbox pattern ensures NATS publish is transactional with DB write.
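Audit-service wildcard consumer subscribes to ai_gateway.> (in NATS, * matches a single subject token, so ai_gateway.* would not match a subject such as ai_gateway.assist.completed.v1).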
correlationId and provenanceId are required fields in all ai_gateway.* event envelopes.
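
The envelope requirements can be checked with a small validator of the kind a schema conformance test might use. AiEventEnvelope, validateEnvelope, and the event-name pattern are assumptions for this sketch; the required fields and the ban on promptText in metadata come from the story.

```typescript
interface AiEventEnvelope {
  type: string;
  correlationId: string;
  provenanceId: string;
  metadata: Record<string, unknown>;
}

// Returns a list of problems; an empty list means the envelope passes.
function validateEnvelope(e: Partial<AiEventEnvelope>): string[] {
  const errors: string[] = [];
  if (!e.type || !/^ai_gateway\.[a-z_.]+\.v\d+$/.test(e.type)) {
    errors.push("type must look like ai_gateway.<name>.v<N>");
  }
  if (!e.correlationId) {
    errors.push("correlationId is required");
  }
  if (!e.provenanceId) {
    errors.push("provenanceId is required");
  }
  if (e.metadata !== undefined && "promptText" in e.metadata) {
    errors.push("metadata must not contain promptText");
  }
  return errors;
}

console.log(
  validateEnvelope({
    type: "ai_gateway.decision.accepted.v1",
    correlationId: "corr-1",
    provenanceId: "prv_1",
    metadata: {},
  }),
); // []
```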

Definition of Done:

Integration test: verify audit-service receives and stores all 10 ai_gateway.* event types.
PHI field absence verified in AuditEntry.metadata test.
Event schema conformance test for every ai_gateway.*.v1 event.
