AI Gateway Service — User Stories
Service: ai-gateway-service
Story prefix: AIGW-US
Last updated: 2026-04-18
Stories
AIGW-US-001 — Submit AI assist request via gateway
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | POST /v1/ai/assist routes request to provider and returns draft with provenance |
| Epic link | AIGW-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 8 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-001 |
| Legacy FR refs | FR-AI-002 |
| Dependencies | AIGW-US-002, AIGW-US-003 |
User story:
As a clinical service (patient-chart, medication, virtual-care), when I need AI-assisted content generation, I want to POST to /v1/ai/assist with a feature key and payload so that I receive a draft text plus a decisionId and provenanceId without ever touching a raw model API key.
Acceptance criteria (Gherkin):
- Given a valid JWT with scope `ai.assist` and an entitled feature key, when I POST to `/v1/ai/assist`, then I receive `200 { draftText, decisionId, provenanceId, isDraft: true }`.
- Given my JWT is invalid or expired, when I POST to `/v1/ai/assist`, then I receive `401 UNAUTHORIZED`.
- Given the feature key is not entitled for my tenant, when I POST, then I receive `403 AI_POLICY_DENY` and `ai_gateway.assist.failed.v1` is emitted.
- Given the assist completes, when I query `GET /v1/ai/decisions/:id`, then I receive the decision with `state=draft` and a linked `AIProvenance` row.
Technical notes:
- `AIDecision` and `AIProvenance` written in a single DB transaction with outbox row.
- Tenant extracted from JWT `tenantId` claim; never from caller body.
- Raw prompt text must not appear in NATS event payload or default logs.
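A minimal sketch of the tenant rule in the notes above, assuming token signature and scope checks have already run upstream. `JwtClaims`, `AssistRequestBody`, `handleAssist`, and the in-memory `entitlements` map are hypothetical names used only for illustration; the status codes and response fields are the ones listed in the acceptance criteria.

```typescript
// Hypothetical shapes for illustration; not the service's real types.
interface JwtClaims {
  tenantId: string;
  exp: number; // expiry, seconds since epoch
}

interface AssistRequestBody {
  featureKey: string;
  payload: unknown;
  tenantId?: string; // deliberately ignored, even if a caller sends it
}

type AssistResult =
  | { status: 200; body: { draftText: string; decisionId: string; provenanceId: string; isDraft: true } }
  | { status: 401; code: "UNAUTHORIZED" }
  | { status: 403; code: "AI_POLICY_DENY" };

function handleAssist(
  claims: JwtClaims | null, // null stands for a token that failed verification
  body: AssistRequestBody,
  entitlements: Map<string, Set<string>>, // tenantId -> entitled feature keys (assumed store)
  nowSec: number,
): AssistResult {
  if (claims === null || claims.exp <= nowSec) {
    return { status: 401, code: "UNAUTHORIZED" };
  }
  // The tenant comes from the verified claim only; body.tenantId is never read.
  const tenantId = claims.tenantId;
  const entitled = entitlements.get(tenantId)?.has(body.featureKey) ?? false;
  if (!entitled) {
    return { status: 403, code: "AI_POLICY_DENY" };
  }
  // Placeholder values; a real handler would call the provider and persist rows.
  return {
    status: 200,
    body: { draftText: "(draft)", decisionId: "dec_example", provenanceId: "prv_example", isDraft: true },
  };
}
```

A caller in tenant `t2` that puts `tenantId: "t1"` in the body is still evaluated against `t2` entitlements, because the body field never enters the lookup.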
Definition of Done:
- Unit + integration tests added; coverage ≥ 80 %.
- OpenAPI contract updated; Pact consumer tests green.
- `ai_gateway.assist.requested.v1` and `ai_gateway.assist.completed.v1` schemas registered.
- Telemetry spans and `ai_gateway_assist_duration_ms` histogram added.
- MIGRATION_PLAN Phase 1 references this story.
AIGW-US-002 — Pre-assist policy evaluation (fail-closed)
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Policy check evaluates entitlement, flag, consent before inference |
| Epic link | AIGW-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service, identity-service |
| FR references | FR-AIGW-002 |
| Legacy FR refs | FR-AI-003 |
| Dependencies | cross-service: IDENT-US-001 |
User story: As the platform, when any AI assist is requested, I want the gateway to evaluate access policy, module entitlement, feature flag, and PHI consent before making any provider call so that AI features are gated by the same policy controls as every other platform capability.
Acceptance criteria (Gherkin):
- Given the policy service returns deny, when assist is requested, then `403 AI_POLICY_DENY` is returned and no provider call is made.
- Given the policy service is unreachable (timeout > 2 s), when assist is requested, then the gateway fails closed: `403 AI_POLICY_DENY` with `reason=POLICY_TIMEOUT`.
- Given the feature key requires PHI consent and consent record is absent, when assist is requested, then `403 AI_CONSENT_REQUIRED` is returned.
- Given the feature flag for the feature key is disabled, when assist is requested, then `403 AI_POLICY_DENY` with `reason=FEATURE_DISABLED`.
Technical notes:
- Policy evaluation via `PolicyClient` port → `AccessPolicyHttpAdapter`.
- Feature flag from config-service; cached with 30 s TTL.
- Fail-closed is enforced in `RequestAssistUseCase` before any quota decrement.
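The deny branches above can be pictured as one pure function. This is a sketch, not the `RequestAssistUseCase` itself: `PolicyOutcome`, `PreAssistInputs`, and `evaluatePreAssist` are hypothetical names, and the order in which the checks run is an assumption that the story does not fix.

```typescript
// "timeout" stands for a policy service that is unreachable or slower than 2 s.
type PolicyOutcome = "allow" | "deny" | "timeout";

interface PreAssistInputs {
  policy: PolicyOutcome;
  featureFlagEnabled: boolean;
  consentRequired: boolean;
  consentPresent: boolean;
}

type Gate =
  | { allowed: true }
  | { allowed: false; status: 403; code: "AI_POLICY_DENY" | "AI_CONSENT_REQUIRED"; reason?: string };

function evaluatePreAssist(input: PreAssistInputs): Gate {
  // Fail closed: no answer from the policy service is treated as a deny.
  if (input.policy === "timeout") {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY", reason: "POLICY_TIMEOUT" };
  }
  if (input.policy === "deny") {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY" };
  }
  if (!input.featureFlagEnabled) {
    return { allowed: false, status: 403, code: "AI_POLICY_DENY", reason: "FEATURE_DISABLED" };
  }
  if (input.consentRequired && !input.consentPresent) {
    return { allowed: false, status: 403, code: "AI_CONSENT_REQUIRED" };
  }
  // Only now may the caller consume quota and contact a provider.
  return { allowed: true };
}
```

Because every failure path returns before the final line, quota consumption and the provider call can sit behind a single `allowed` check.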
Definition of Done:
- Unit tests for all deny branches; integration test with mock policy service.
- Fail-closed behaviour tested with 100 % coverage on timeout path.
- Docs updated in `APPLICATION_LOGIC.md` §6.
AIGW-US-003 — AIProvenance immutable record
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Every assist completion writes an immutable AIProvenance row |
| Epic link | AIGW-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-003 |
| Legacy FR refs | FR-AI-006 |
| Dependencies | AIGW-US-001 |
User story: As a compliance officer, when I audit an AI-assisted clinical record, I want to trace exactly which model, version, prompt template, provider, and moderation outcome produced the content so that I can satisfy regulatory inquiries about AI provenance.
Acceptance criteria (Gherkin):
- Given an assist completes, when I query `GET /v1/ai/provenance/:provenanceId`, then I receive all fields: provider, modelVersion, promptTemplateKey, promptTemplateVersion, promptTemplateHash, moderationInput, latencyMs, requestedAt.
- Given a provenance record is created, when any process attempts UPDATE or DELETE on the row as DB role `ai_gateway_app`, then the statement fails with a permission denied error.
- Given an assist fails mid-stream, when the transaction rolls back, then no orphaned AIProvenance row exists.
Technical notes:
- `ai_provenance` table: `REVOKE UPDATE, DELETE ON ai_provenance FROM ai_gateway_app`.
- `AIProvenance.id` uses `prv_` ULID prefix.
- `GetProvenanceQuery` is callable by audit-service and owning clinical services.
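For the identifier format, a small validator is sketched here, assuming the ID is the literal prefix `prv_` followed by a standard 26-character ULID. `PROVENANCE_ID` and `isProvenanceId` are hypothetical names; the character class follows the ULID alphabet (Crockford Base32, which omits I, L, O, and U).

```typescript
// A ULID encodes 128 bits in 26 Crockford Base32 characters, so the first
// character can be at most "7".
const PROVENANCE_ID = /^prv_[0-7][0-9A-HJKMNP-TV-Z]{25}$/;

// Returns true when the value looks like a prefixed provenance ID.
function isProvenanceId(value: string): boolean {
  return PROVENANCE_ID.test(value);
}
```

A check like this could run at the API boundary of `GET /v1/ai/provenance/:provenanceId` so malformed IDs are rejected before any query is issued.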
Definition of Done:
- DB migration includes REVOKE statement.
- Integration test verifies UPDATE attempt fails with permission error.
- Schema conformance test for `ai_gateway.assist.completed.v1` (carries `provenanceId`).
AIGW-US-004 — Consumer service cutover from direct provider
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | patient-chart-service migrated from direct Anthropic to AI gateway |
| Epic link | AIGW-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S1 |
| Components | patient-chart-service, ai-gateway-service |
| FR references | FR-AIGW-001 |
| Legacy FR refs | FR-AI-002 |
| Dependencies | AIGW-US-001, cross-service: CHART-US-012 |
User story: As a platform engineer, when patient-chart-service needs AI-assist for clinical notes, I want it to call the AI gateway instead of Anthropic directly so that all clinical AI calls are gated by central policy and leave an auditable provenance trail.
Acceptance criteria (Gherkin):
- Given patient-chart-service is deployed with `AIGatewayHttpAdapter`, when a note AI assist is requested, then the gateway is called and the returned `provenanceId` is stored in `NoteAIProvenance`.
- Given the gateway returns a draft, when the clinician accepts the AI chunk, then `patient_chart.note.ai_accepted.v1` includes `provenanceId`.
- Given the direct Anthropic key is revoked for patient-chart-service, when the service starts, then it does not throw a missing key error.
Technical notes:
- `AIGatewayHttpAdapter` implements `AIGatewayPort` in patient-chart-service.
- `NoteAIProvenance` row written by `AcceptAIChunkUseCase`.
- Migration phase: MIGRATION_PLAN Phase 1.
Definition of Done:
- Direct Anthropic SDK dependency removed from patient-chart-service `package.json`.
- Contract test: patient-chart-service consumer Pact against ai-gateway-service provider.
- Anthropic key removed from patient-chart-service vault secrets.
AIGW-US-005 — Provider routing rules configuration
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Admin configures provider routing rules per tenant and feature key |
| Epic link | AIGW-EPIC-02 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service, config-service |
| FR references | FR-AIGW-004 |
| Legacy FR refs | — |
| Dependencies | AIGW-US-001 |
User story:
As a platform admin, when I need to route specific features to specific AI providers based on tenant data-residency requirements, I want to configure ProviderRoutingRule entries so that Afghanistan-residency tenants use the on-prem vLLM while other tenants use Anthropic.
Acceptance criteria (Gherkin):
- Given a routing rule with `residency=AF` mapped to `onprem_vllm`, when a tenant with `residency=AF` requests assist, then the vLLM adapter is used.
- Given no tenant-specific rule exists, when a request arrives, then the global routing rule is applied.
- Given I POST a new routing rule, when the active rule is superseded, then the previous rule transitions to `deprecated` and the new one to `active` atomically.
Technical notes:
- `POST /v1/admin/routing-rules`: `platform_admin` scope.
- Tenant-specific override: `tenant_admin` scope with tenant constraint.
- Rules loaded via `ConfigClient`; hot-reload on version bump.
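One way to read the resolution and supersede rules is sketched here. The `ProviderRoutingRule` fields, the "most specific rule wins" scoring, and the function names are assumptions for illustration; in the service the supersede step would be two row changes inside one DB transaction rather than an array copy.

```typescript
type RuleStatus = "active" | "deprecated";

// Hypothetical entity shape; null tenantId marks a global rule, null residency matches any residency.
interface ProviderRoutingRule {
  id: string;
  tenantId: string | null;
  featureKey: string;
  residency: string | null;
  provider: string;
  status: RuleStatus;
}

// Tenant-specific rules outrank global ones; a residency match breaks ties.
function specificity(rule: ProviderRoutingRule): number {
  return (rule.tenantId !== null ? 2 : 0) + (rule.residency !== null ? 1 : 0);
}

function resolveRule(
  rules: ProviderRoutingRule[],
  tenantId: string,
  featureKey: string,
  residency: string,
): ProviderRoutingRule | undefined {
  let best: ProviderRoutingRule | undefined;
  for (const rule of rules) {
    if (rule.status !== "active" || rule.featureKey !== featureKey) continue;
    if (rule.tenantId !== null && rule.tenantId !== tenantId) continue;
    if (rule.residency !== null && rule.residency !== residency) continue;
    if (best === undefined || specificity(rule) > specificity(best)) best = rule;
  }
  return best;
}

// Deprecates the active rule with the same scope and activates the new one in a single step.
function supersede(rules: ProviderRoutingRule[], next: ProviderRoutingRule): ProviderRoutingRule[] {
  const updated = rules.map((rule) =>
    rule.status === "active" &&
    rule.tenantId === next.tenantId &&
    rule.featureKey === next.featureKey &&
    rule.residency === next.residency
      ? { ...rule, status: "deprecated" as const }
      : rule,
  );
  return [...updated, { ...next, status: "active" as const }];
}
```

With one global rule for Anthropic and one global `residency=AF` rule for `onprem_vllm`, an AF tenant resolves to `onprem_vllm` and every other tenant falls through to the global rule.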
Definition of Done:
- Integration test covers AF residency routing.
- Old rule deprecated atomically (single DB transaction).
- Schema conformance test for routing rule entity.
AIGW-US-006 — Circuit breaker per provider
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Circuit breaker opens on consecutive provider failures and routes to fallback |
| Epic link | AIGW-EPIC-02 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-005 |
| Legacy FR refs | FR-NFR-015 |
| Dependencies | AIGW-US-005 |
User story: As the platform, when a primary AI provider experiences an outage, I want circuit breakers to automatically route requests to the fallback provider so that clinical AI features remain available without manual intervention.
Acceptance criteria (Gherkin):
- Given the primary provider fails 5 consecutive times within 60 s, when the next assist request arrives, then the circuit opens and the fallback provider is used.
- Given the circuit is open, when a request arrives, then `ai_gateway.provider.degraded.v1` is emitted once per circuit-open transition.
- Given all providers in the routing chain are unavailable, when a request arrives, then `503 AI_PROVIDER_UNAVAILABLE` is returned.
Technical notes:
- Per-provider bulkhead with in-flight cap.
- Circuit state stored in Redis per `(tenantId, featureKey, providerId)`.
- Threshold configurable via config-service.
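The open-on-consecutive-failures behaviour can be sketched in memory as follows. This is an illustration under stated assumptions, not the service's breaker: state lives in a plain object instead of Redis, half-open recovery is left out, and `CircuitBreaker`, `BreakerConfig`, and `pickProvider` are hypothetical names.

```typescript
interface BreakerConfig {
  failureThreshold: number; // e.g. 5
  windowMs: number; // e.g. 60_000
}

class CircuitBreaker {
  private readonly cfg: BreakerConfig;
  private consecutiveFailures = 0;
  private firstFailureAt = 0;
  private open = false;

  constructor(cfg: BreakerConfig) {
    this.cfg = cfg;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  // Returns true only on the closed -> open transition, so the caller can
  // emit the degraded event once per transition.
  recordFailure(nowMs: number): boolean {
    const windowExpired = nowMs - this.firstFailureAt > this.cfg.windowMs;
    if (this.consecutiveFailures === 0 || windowExpired) {
      this.consecutiveFailures = 0;
      this.firstFailureAt = nowMs;
    }
    this.consecutiveFailures += 1;
    if (!this.open && this.consecutiveFailures >= this.cfg.failureThreshold) {
      this.open = true;
      return true;
    }
    return false;
  }

  isOpen(): boolean {
    return this.open;
  }
}

// Walks the routing chain and returns the first provider whose circuit is closed.
// null corresponds to 503 AI_PROVIDER_UNAVAILABLE.
function pickProvider(chain: string[], breakers: Map<string, CircuitBreaker>): string | null {
  for (const provider of chain) {
    const isOpen = breakers.get(provider)?.isOpen() ?? false;
    if (!isOpen) return provider;
  }
  return null;
}
```

Keeping the transition signal in the return value of `recordFailure` means event emission does not need a second read of the breaker state.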
Definition of Done:
- Chaos test: kill primary provider container; verify fallback within 500 ms.
- `ai_gateway.provider.degraded.v1` schema conformance test.
AIGW-US-007 — On-prem vLLM adapter for offline clinics
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | vLLM adapter routes AF-residency assist to on-prem model |
| Epic link | AIGW-EPIC-02 |
| Status | To Do |
| Priority | Should |
| Story points | 8 |
| Labels | service:ai-gateway-service, type:backend, slice:S3 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-004 |
| Legacy FR refs | FR-NFR-017 |
| Dependencies | AIGW-US-005 |
User story:
As an Afghan MoPH data-residency officer, when AI assist is used in a clinic with residency=AF, I want all inference to happen on the on-prem vLLM deployment so that patient data never leaves Afghanistan.
Acceptance criteria (Gherkin):
- Given an AF-residency routing rule, when assist is requested, then `ProviderAttempt.provider=onprem_vllm`.
- Given the vLLM server is unreachable, when assist is requested, then `503` is returned (no fallback to cloud provider for AF-residency features).
- Given the vLLM model returns a response, when provenance is written, then `residency=AF` is stored in `ai_provenance.residency`.
Technical notes:
- `VLLMAdapter` implements `ModelProvider` port with OpenAI-compatible API.
- AF-residency fallback to cloud is explicitly blocked by routing rule config.
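To make "OpenAI-compatible" concrete, the sketch below only builds the HTTP request such an adapter might send to a vLLM server's chat completions route; it performs no network call. `buildVllmRequest`, `HttpRequestSpec`, the host name, and the model name are hypothetical, and the real `VLLMAdapter` may shape its request differently.

```typescript
// Subset of the OpenAI-style chat completions payload used in this sketch.
interface ChatCompletionRequest {
  model: string;
  messages: { role: "system" | "user"; content: string }[];
  max_tokens: number;
}

interface HttpRequestSpec {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildVllmRequest(
  baseUrl: string, // on-prem address, e.g. "http://vllm.internal:8000" (assumed)
  model: string,
  systemPrompt: string,
  userInput: string,
  maxTokens: number,
): HttpRequestSpec {
  const payload: ChatCompletionRequest = {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userInput },
    ],
    max_tokens: maxTokens,
  };
  return {
    method: "POST",
    // Trailing slashes are trimmed so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

Because the base URL is the only provider-specific input, pointing it at an in-country host keeps the request path identical to a cloud provider's while the data stays on-prem.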
Definition of Done:
- Integration test with vLLM mock server.
- `residency=AF` verified in provenance row.
- DPIA pre-condition check: story blocked until AF DPIA signed.
AIGW-US-008 — HITL queue for clinical-decision features
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | HITLPolicy=required queues decision for reviewer before clinical use |
| Epic link | AIGW-EPIC-03 |
| Status | To Do |
| Priority | Must |
| Story points | 8 |
| Labels | service:ai-gateway-service, type:backend, slice:S1 |
| Components | ai-gateway-service, communication-service |
| FR references | FR-AIGW-006 |
| Legacy FR refs | FR-AI-004 |
| Dependencies | AIGW-US-001, cross-service: COMMS-US-001 |
User story: As a clinical informatics lead, when AI assist produces a draft for a clinical-decision feature, I want the draft to be held in a HITL review queue so that no AI output reaches a patient chart without explicit clinician approval.
Acceptance criteria (Gherkin):
- Given a feature key with `HITLPolicy=required`, when assist completes, then `AIDecision.state=under_review` and `ai_gateway.decision.hitl_queued.v1` is emitted.
- Given a decision is `under_review`, when the configured auto-reject timeout elapses, then `AIDecision.state=rejected` and `ai_gateway.decision.rejected.v1` is emitted.
- Given a feature key with `HITLPolicy=none`, when assist completes, then `AIDecision.state` transitions directly to `accepted` and no queue notification is sent.
Technical notes:
- Reviewer notification via `ReviewerNotifier` port → `CommunicationServiceAdapter`.
- Auto-reject timeout configured per feature key in config-service.
- `decision.hitl_queued.v1` carries `featureKey`, `decisionId`, `assignedFacilityId` (for routing to reviewer).
Definition of Done:
- E2E test: note AI assist → HITL queued → reviewer accepts → `ai_gateway.decision.accepted.v1` received by patient-chart-service.
- Auto-reject timer integration test.
- Schema conformance: `ai_gateway.decision.hitl_queued.v1`.
AIGW-US-009 — Reviewer submit accept/reject verdict
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Reviewer accepts or rejects queued AI decision via API |
| Epic link | AIGW-EPIC-03 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:api, slice:S1 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-007 |
| Legacy FR refs | FR-AI-004 |
| Dependencies | AIGW-US-008 |
User story: As a clinical reviewer, when I access the HITL review queue, I want to accept or reject a queued AI decision so that clinical AI content is either approved for chart use or discarded, with my action recorded in the audit trail.
Acceptance criteria (Gherkin):
- Given a decision in `under_review` state, when I POST `/v1/ai/decisions/:id/review` with `verdict=accepted`, then `state=accepted` and `ai_gateway.decision.accepted.v1` is emitted with `provenanceId`.
- Given a decision in `under_review`, when I POST with `verdict=rejected`, then `state=rejected` and `ai_gateway.decision.rejected.v1` is emitted.
- Given I do not have the `reviewer` role for the decision's facility, when I attempt review, then `403 FORBIDDEN` is returned.
- Given an accepted decision, when review is submitted again, then `409 CONFLICT` is returned (state machine terminal).
Technical notes:
- Optimistic locking on `ai_decision.version`.
- `DecisionReviewEvent` written in same transaction as state change.
- Reviewer scope: `ai.review` Keycloak scope, facility-scoped.
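The review rules amount to a small state machine with a version check. The sketch below is a pure-function approximation: `AIDecision` is reduced to the fields needed here, `review` is a hypothetical name, and mapping a stale version to `409 CONFLICT` is an assumption (the story states only that optimistic locking is used).

```typescript
type DecisionState = "draft" | "under_review" | "accepted" | "rejected";
type Verdict = "accepted" | "rejected";

interface AIDecision {
  id: string;
  facilityId: string;
  state: DecisionState;
  version: number; // optimistic lock counter
}

type ReviewResult =
  | { status: 200; decision: AIDecision; event: string }
  | { status: 403; code: "FORBIDDEN" }
  | { status: 409; code: "CONFLICT" };

function review(
  decision: AIDecision,
  verdict: Verdict,
  reviewerFacilities: string[], // facilities where the caller holds the reviewer role
  expectedVersion: number,
): ReviewResult {
  if (reviewerFacilities.indexOf(decision.facilityId) === -1) {
    return { status: 403, code: "FORBIDDEN" };
  }
  // accepted and rejected are terminal; a stale version means someone else won the race.
  if (decision.state !== "under_review" || decision.version !== expectedVersion) {
    return { status: 409, code: "CONFLICT" };
  }
  return {
    status: 200,
    decision: { ...decision, state: verdict, version: decision.version + 1 },
    event: `ai_gateway.decision.${verdict}.v1`,
  };
}
```

Submitting a second verdict against the returned decision yields `409`, since its state is no longer `under_review`.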
Definition of Done:
- Integration test for accept, reject, and unauthorized reviewer.
- Optimistic lock conflict test.
- Audit entry written by audit-service on `ai_gateway.decision.accepted.v1`.
AIGW-US-010 — List HITL review queue for assigned reviewer
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Reviewer sees paginated list of decisions awaiting review |
| Epic link | AIGW-EPIC-03 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:ai-gateway-service, type:api, slice:S1 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-008 |
| Legacy FR refs | — |
| Dependencies | AIGW-US-008 |
User story: As a clinical reviewer, when I open the HITL review interface, I want to see a paginated list of AI decisions pending my review filtered by my assigned facility and feature keys so that I can efficiently process the queue.
Acceptance criteria (Gherkin):
- Given I have the `ai.review` scope, when I call `GET /v1/ai/decisions/review-queue`, then I receive decisions in `under_review` for my assigned facilities, paginated, oldest first.
- Given no decisions await review, when I call the endpoint, then I receive `{ data: [], total: 0 }`.
- Given a super admin, when they call the endpoint with `facilityId` filter, then they see decisions for that facility.
Technical notes:
- Cursor pagination on `created_at`.
- RLS-scoped by `tenant_id`; facility filter applied in query.
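Cursor pagination over `created_at` can be sketched against an in-memory list. Two assumptions are made here that the notes do not state: the cursor combines `createdAt` with `id` so that rows sharing a timestamp are not skipped, and the response carries a `nextCursor` field next to `data` and `total`. `QueueItem` and `pageReviewQueue` are hypothetical names.

```typescript
interface QueueItem {
  id: string;
  facilityId: string;
  createdAt: string; // ISO-8601 UTC, fixed width, so string order equals time order
}

interface Page {
  data: QueueItem[];
  total: number;
  nextCursor: string | null;
}

// Composite sort key: timestamp first, id as tie-breaker.
function cursorKey(item: QueueItem): string {
  return `${item.createdAt}|${item.id}`;
}

function pageReviewQueue(
  items: QueueItem[],
  facilityIds: string[],
  limit: number,
  cursor: string | null,
): Page {
  const size = Math.max(1, limit);
  const matching = items
    .filter((item) => facilityIds.indexOf(item.facilityId) !== -1)
    .sort((a, b) => (cursorKey(a) < cursorKey(b) ? -1 : cursorKey(a) > cursorKey(b) ? 1 : 0));
  // Oldest first: resume strictly after the last key the caller has seen.
  const remaining = cursor === null ? matching : matching.filter((item) => cursorKey(item) > cursor);
  const data = remaining.slice(0, size);
  const nextCursor = remaining.length > size ? cursorKey(data[data.length - 1]) : null;
  return { data, total: matching.length, nextCursor };
}
```

In SQL the same idea becomes a `WHERE (created_at, id) > (:createdAt, :id)` predicate with `ORDER BY created_at, id`, which avoids the cost of deep `OFFSET` scans.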
Definition of Done:
- API spec updated in API_CONTRACTS.md.
- Pact consumer test from reviewer UI.
AIGW-US-011 — Pre-inference input moderation
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Moderation classifier screens prompt input before provider call |
| Epic link | AIGW-EPIC-04 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S1 |
| Components | ai-gateway-service, moderation-classifier |
| FR references | FR-AIGW-009 |
| Legacy FR refs | FR-AI-005 |
| Dependencies | AIGW-US-001 |
User story: As the platform security layer, when an AI assist input is received, I want it screened by the moderation classifier before the provider call so that malicious prompts and injection attacks are blocked before they reach a model.
Acceptance criteria (Gherkin):
- Given an input classified as `block` by the moderation classifier, when assist is requested, then `422 AI_MODERATION_BLOCKED` is returned and `ai_gateway.moderation.flagged.v1` is emitted.
- Given an input classified as `flag`, when assist proceeds, then `ModerationFinding` row is written with `verdict=flag` and the flag is visible in `ai_provenance.moderationInput`.
- Given the moderation service is unavailable, when assist is requested for a PHI-touching feature, then fail-closed: `503 AI_MODERATION_UNAVAILABLE`.
Technical notes:
- `ModerationClient` port → `LocalClassifierAdapter` (primary) + `ProviderModerationAdapter` (fallback).
- PHI minimisation step precedes moderation call when feature config enables it.
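The primary/fallback arrangement and the verdict mapping can be sketched as below. The classifiers are modelled as synchronous functions that throw when unavailable, which is a simplification; `Classifier`, `classifyWithFallback`, and `gateInput` are hypothetical names. What happens for a non-PHI feature when no classifier answers is not specified in the story, so the `"unscreened"` branch is purely an assumption.

```typescript
type ModerationVerdict = "allow" | "flag" | "block";

// Throws when the underlying classifier cannot be reached.
type Classifier = (input: string) => ModerationVerdict;

type InputGate =
  | { proceed: true; moderationInput: ModerationVerdict | "unscreened" }
  | { proceed: false; status: 422; code: "AI_MODERATION_BLOCKED" }
  | { proceed: false; status: 503; code: "AI_MODERATION_UNAVAILABLE" };

// Tries each classifier in order (primary first) and returns null if all fail.
function classifyWithFallback(input: string, classifiers: Classifier[]): ModerationVerdict | null {
  for (const classify of classifiers) {
    try {
      return classify(input);
    } catch {
      // This adapter is unavailable; try the next one.
    }
  }
  return null;
}

function gateInput(input: string, classifiers: Classifier[], phiTouching: boolean): InputGate {
  const verdict = classifyWithFallback(input, classifiers);
  if (verdict === null) {
    if (phiTouching) {
      return { proceed: false, status: 503, code: "AI_MODERATION_UNAVAILABLE" }; // fail closed
    }
    return { proceed: true, moderationInput: "unscreened" }; // assumption, see lead-in
  }
  if (verdict === "block") {
    return { proceed: false, status: 422, code: "AI_MODERATION_BLOCKED" };
  }
  // "flag" still proceeds; the verdict is carried forward for the provenance record.
  return { proceed: true, moderationInput: verdict };
}
```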
Definition of Done:
- Adversarial test suite (20 injection prompts) all blocked.
- False-positive rate measured < 1 % on clinical text test set.
- `ai_gateway.moderation.flagged.v1` schema conformance test.
AIGW-US-012 — Post-inference output moderation
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Moderation classifier screens model output before returning to caller |
| Epic link | AIGW-EPIC-04 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:ai-gateway-service, type:backend, slice:S1 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-010 |
| Legacy FR refs | FR-AI-005 |
| Dependencies | AIGW-US-011 |
User story: As the platform safety layer, when a model returns output, I want it screened for unsafe content before it is returned to the caller so that harmful AI-generated text cannot reach a clinical record.
Acceptance criteria (Gherkin):
- Given model output classified as `block`, when caller receives response, then `200 { draftText: null, reason: "OUTPUT_BLOCKED", decisionId }` is returned and `ai_gateway.moderation.flagged.v1` is emitted.
- Given model output classified as `allow`, when caller receives response, then `draftText` contains the output and `ai_provenance.moderationOutput=allow`.
- Given output moderation fails, when this occurs for a PHI feature, then `draftText=null` with `reason=MODERATION_ERROR`; never return unscreened output.
Technical notes:
- Post-moderation runs after `ai_provenance` is written (moderation outcome back-patched).
- `ModerationFinding` row created with `stage=output`.
Definition of Done:
- Output moderation tested with known unsafe content samples.
- `ai_provenance.moderationOutput` verified in integration test.
AIGW-US-013 — Rolling quota enforcement per tenant
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Per-tenant rolling window quota enforced via Redis |
| Epic link | AIGW-EPIC-05 |
| Status | To Do |
| Priority | Should |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S1 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-011 |
| Legacy FR refs | FR-AI-006 |
| Dependencies | AIGW-US-001 |
User story: As a platform SRE, when a tenant exceeds their configured AI assist quota for a rolling window, I want requests to be rejected with a clear 429 response so that runaway AI usage does not cause unexpected provider costs.
Acceptance criteria (Gherkin):
- Given a tenant has used 100 % of their quota window, when a new assist request arrives, then `429 AI_QUOTA_EXCEEDED` with `Retry-After` header is returned.
- Given a new quota window starts, when a request arrives, then the counter resets and the request is processed.
- Given a provider error occurs after quota consume, when compensating decrement runs, then the quota counter is restored (best-effort).
Technical notes:
- `QuotaStore` port → `RedisQuotaAdapter` with `INCR` + TTL.
- Quota is per `(tenantId, featureKey)` bucket.
- `ai_gateway.quota.exceeded.v1` emitted on first exceed in a window.
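The `INCR` + TTL pattern is sketched here with an in-memory stand-in for Redis, so the counting logic can be read without a server. `InMemoryQuotaStore`, `consume`, and the key layout are hypothetical. Note that a counter with a TTL set on first increment behaves as a fixed window that starts at the first request, which only approximates a true rolling window.

```typescript
interface CounterEntry {
  count: number;
  expiresAtMs: number;
}

// Mimics INCR followed by EXPIRE on the first increment of a key.
class InMemoryQuotaStore {
  private counters = new Map<string, CounterEntry>();

  incr(key: string, windowSec: number, nowMs: number): { count: number; ttlSec: number } {
    const existing = this.counters.get(key);
    const entry =
      existing !== undefined && existing.expiresAtMs > nowMs
        ? existing
        : { count: 0, expiresAtMs: nowMs + windowSec * 1000 };
    entry.count += 1;
    this.counters.set(key, entry);
    return { count: entry.count, ttlSec: Math.ceil((entry.expiresAtMs - nowMs) / 1000) };
  }

  // Best-effort compensating decrement after a provider error.
  decr(key: string, nowMs: number): void {
    const entry = this.counters.get(key);
    if (entry !== undefined && entry.expiresAtMs > nowMs && entry.count > 0) entry.count -= 1;
  }
}

type QuotaResult =
  | { allowed: true }
  | { allowed: false; status: 429; code: "AI_QUOTA_EXCEEDED"; retryAfterSec: number; firstExceed: boolean };

function quotaKey(tenantId: string, featureKey: string): string {
  return `quota:${tenantId}:${featureKey}`;
}

function consume(
  store: InMemoryQuotaStore,
  tenantId: string,
  featureKey: string,
  limit: number,
  windowSec: number,
  nowMs: number,
): QuotaResult {
  const { count, ttlSec } = store.incr(quotaKey(tenantId, featureKey), windowSec, nowMs);
  if (count <= limit) return { allowed: true };
  // firstExceed lets the caller emit the exceeded event once per window.
  return { allowed: false, status: 429, code: "AI_QUOTA_EXCEEDED", retryAfterSec: ttlSec, firstExceed: count === limit + 1 };
}
```

`retryAfterSec` is taken from the remaining TTL, so the `Retry-After` header tells the caller when the counter will reset.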
Definition of Done:
- Load test: concurrent requests beyond limit all get 429.
- Compensating decrement test.
- Schema conformance: `ai_gateway.quota.exceeded.v1`.
AIGW-US-014 — Quota status dashboard for tenant admin
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Tenant admin views current AI quota utilisation via API |
| Epic link | AIGW-EPIC-05 |
| Status | To Do |
| Priority | Should |
| Story points | 2 |
| Labels | service:ai-gateway-service, type:api, slice:S1 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-012 |
| Legacy FR refs | — |
| Dependencies | AIGW-US-013 |
User story: As a tenant admin, when I check AI usage, I want to see current quota utilisation per feature key so that I can plan capacity and avoid unexpected service interruptions.
Acceptance criteria (Gherkin):
- Given I am a tenant admin, when I call `GET /v1/ai/quota`, then I receive `{ featureKey, windowStart, windowSec, limit, used, remaining }` per configured feature.
- Given quota is at 80 %, when I view the dashboard, then the spend alert is visible in the UI and a Prometheus metric reflects the utilisation.
Technical notes:
- `GetTenantQuotaQuery` reads from `QuotaWindow` aggregate.
- RLS ensures tenant sees only their own quota.
Definition of Done:
- API spec updated; Pact consumer test from tenant admin UI.
- `ai_gateway_quota_utilisation_pct` metric publishing.
AIGW-US-015 — Register and version prompt templates
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Platform admin registers semver-versioned prompt templates |
| Epic link | AIGW-EPIC-06 |
| Status | To Do |
| Priority | Should |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-013 |
| Legacy FR refs | — |
| Dependencies | AIGW-US-001 |
User story: As a platform engineer, when I update a clinical AI prompt template, I want to register it with a new semver version so that the gateway can pin specific features to specific template versions and changes are auditable.
Acceptance criteria (Gherkin):
- Given a platform admin POSTs to `/v1/admin/prompt-templates`, when the template is valid, then a `PromptTemplate` row is created with `status=draft`.
- Given a template is in `draft`, when admin publishes it, then `status=published` and the template hash is logged.
- Given a template is referenced by active `AIDecision` rows, when admin attempts to deprecate it, then `409 TEMPLATE_IN_USE` is returned.
Technical notes:
- Raw template body stored only in secure registry; DB stores `template_hash`.
- `PromptTemplateRef { key, version }` pinned in `ProviderRoutingRule`.
Definition of Done:
- Hash verification test (tampered body detected).
- Deprecation conflict test.
- API spec for admin endpoints.
AIGW-US-016 — Tenant-scoped prompt template override
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Tenant admin registers tenant-scoped prompt template override |
| Epic link | AIGW-EPIC-06 |
| Status | To Do |
| Priority | Could |
| Story points | 3 |
| Labels | service:ai-gateway-service, type:api, slice:S2 |
| Components | ai-gateway-service |
| FR references | FR-AIGW-013 |
| Legacy FR refs | — |
| Dependencies | AIGW-US-015 |
User story: As a tenant admin (large hospital network), when my clinical informatics team needs a customised prompt for a feature, I want to register a tenant-scoped template override so that my tenant's AI interactions reflect our clinical terminology without impacting other tenants.
Acceptance criteria (Gherkin):
- Given a tenant-scoped template for the same key/version exists, when a request from that tenant arrives, then the tenant template takes precedence over the global one.
- Given the tenant template is deprecated, when a request arrives, then the global template is used as fallback.
Technical notes:
- `PromptTemplate.tenantId` non-null selects tenant scope; `null` = global.
- Resolution order: tenant-scoped → global.
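The resolution order can be sketched as a lookup over candidate rows. The `PromptTemplate` shape below is reduced to the fields the rule needs, `resolveTemplate` is a hypothetical name, and treating only `published` rows as usable is an assumption drawn from the status values in AIGW-US-015.

```typescript
type TemplateStatus = "draft" | "published" | "deprecated";

interface PromptTemplate {
  key: string;
  version: string;
  tenantId: string | null; // null = global
  status: TemplateStatus;
  templateHash: string;
}

// Tenant-scoped template first, global template as fallback.
function resolveTemplate(
  templates: PromptTemplate[],
  key: string,
  version: string,
  tenantId: string,
): PromptTemplate | undefined {
  const usable = templates.filter(
    (t) => t.key === key && t.version === version && t.status === "published",
  );
  return usable.find((t) => t.tenantId === tenantId) ?? usable.find((t) => t.tenantId === null);
}
```

Because the tenant match is an equality check on `tenantId`, another tenant's override can never be returned; the only shared row is the global one.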
Definition of Done:
- Resolution order integration test.
- Tenant isolation test (tenant A template not visible to tenant B).
AIGW-US-017 — OpenTelemetry instrumentation for all AI calls
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | OTEL spans, metrics, and structured logs for every assist and HITL event |
| Epic link | AIGW-EPIC-07 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service, observability |
| FR references | FR-AIGW-014 |
| Legacy FR refs | FR-NFR-018 |
| Dependencies | AIGW-US-001 |
User story: As an SRE, when I investigate a clinical AI issue, I want full distributed traces for every assist request — including policy evaluation, moderation, provider call, and HITL transitions — so that I can pinpoint latency and errors without reading raw logs.
Acceptance criteria (Gherkin):
- Given an assist request is made, when I query Grafana Tempo, then I see a trace with spans: `ai.gateway.policy_check`, `ai.gateway.quota_consume`, `ai.gateway.pre_moderate`, `ai.gateway.provider_call`, `ai.gateway.post_moderate`, `ai.gateway.provenance_write`.
- Given a provider call completes, when I check Prometheus, then `ai_gateway_assist_duration_ms` histogram has a new observation bucketed by `featureKey` and `provider`.
- Given raw prompt text, when it appears in a log line, then CI lint catches the violation (PHI-safe log check).
Technical notes:
- `@ghasi/telemetry` initialized before `NestFactory` in `main.ts`.
- Span attributes: `tenantId`, `featureKey`, `provider`, `decisionId`; no `promptText`.
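One way to keep `promptText` out of spans is an allowlist rather than a blocklist, sketched below without any telemetry library. `SPAN_ATTRIBUTE_KEYS` and `toSpanAttributes` are hypothetical names; how `@ghasi/telemetry` actually attaches attributes is not described in this document.

```typescript
// Only these keys may ever be copied onto a span.
const SPAN_ATTRIBUTE_KEYS = ["tenantId", "featureKey", "provider", "decisionId"] as const;

type SpanAttributeKey = (typeof SPAN_ATTRIBUTE_KEYS)[number];
type SpanAttributes = Partial<Record<SpanAttributeKey, string>>;

// Copies allowlisted string fields and silently drops everything else,
// including promptText and any field added to the context later.
function toSpanAttributes(context: Record<string, unknown>): SpanAttributes {
  const attributes: SpanAttributes = {};
  for (const key of SPAN_ATTRIBUTE_KEYS) {
    const value = context[key];
    if (typeof value === "string") attributes[key] = value;
  }
  return attributes;
}
```

An allowlist fails safe: a new PHI-bearing field on the request context stays out of traces until someone deliberately adds it to the list.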
Definition of Done:
- Traces visible in Tempo staging.
- `ai_gateway_assist_duration_ms` histogram publishing in Prometheus.
- PHI-safe log test in CI.
AIGW-US-018 — Audit event emission for all AI state transitions
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | All ai.* domain events reach audit-service for tamper-evident trail |
| Epic link | AIGW-EPIC-07 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:ai-gateway-service, type:backend, slice:S0 |
| Components | ai-gateway-service, audit-service |
| FR references | FR-AIGW-015 |
| Legacy FR refs | FR-NFR-018 |
| Dependencies | AIGW-US-001, cross-service: AUDIT-US-001 |
User story: As a compliance officer, when I audit AI-assisted clinical activity, I want every AI decision state transition to appear in the tamper-evident audit log so that I can produce a complete accounting of AI usage for any regulatory inquiry.
Acceptance criteria (Gherkin):
- Given an assist completes, when I query audit-service, then `AuditEntry` exists with `eventType=AI_ASSIST_COMPLETED`, `resourceId=decisionId`, `provenanceId` in metadata.
- Given a HITL decision is accepted, when I query audit-service, then `AuditEntry` with `eventType=AI_DECISION_ACCEPTED` exists within 5 s of the event.
- Given `ai.*` events are published to NATS, when the audit-service consumer processes them, then no raw `promptText` field appears in the `metadata` column.
Technical notes:
- Outbox pattern ensures NATS publish is transactional with DB write.
- Audit-service wildcard consumer subscribes to `ai_gateway.*`.
- `correlationId` and `provenanceId` are required fields in all `ai.*` event envelopes.
Definition of Done:
- Integration test: verify audit-service receives and stores all 10 `ai.*` event types.
- PHI field absence verified in `AuditEntry.metadata` test.
- Event schema conformance test for every `ai_gateway.*.v1` event.