
AI Gateway Service — Application Logic

Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · 04 event-driven

1. Use cases

1.1 Commands

| Command | Handler | Preconditions | Postconditions |
|---|---|---|---|
| RequestAssist | RequestAssistUseCase | JWT valid, module entitled, quota available, feature flag on, consent recorded (for PHI features) | AIDecision(draft) persisted, AIProvenance stamped, events emitted |
| QueueForReview | QueueForReviewUseCase | Decision is draft and policy requires HITL | State → under_review; notification enqueued |
| SubmitReview | SubmitReviewUseCase | Decision is under_review, caller has reviewer role, tenant match | State → accepted or rejected; audit logged |
| AcceptDraft | AcceptDraftUseCase | Draft is accepted or draft with no HITL; caller is the creating service | Emits decision.accepted with provenanceId to owning module |
| OpenCircuit | OpenCircuitUseCase | Consecutive provider errors above threshold | Routing rule switches to fallback; provider.degraded emitted |
| RegisterPromptTemplate | Admin | platform_admin scope | New PromptTemplate row, semver bumped |
| UpdateRoutingRule | Admin | platform_admin scope; tenant override allowed for tenant_admin | New ProviderRoutingRule active version |
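The review-related preconditions in the table amount to a small state machine over the decision. The following is a minimal sketch of those transitions, not the service's actual code: the `Decision` shape, the `requiresHitl` flag, and the function names are assumptions made for illustration, and the role, tenant, and caller checks are left out.

```typescript
// Hypothetical types; only the state names and transitions come from the table above.
type DecisionState = "draft" | "under_review" | "accepted" | "rejected";

interface Decision {
  id: string;
  state: DecisionState;
  requiresHitl: boolean; // assumption: the policy outcome is available on the decision
}

class PreconditionError extends Error {}

// QueueForReview: the decision must be a draft and policy must require HITL.
function queueForReview(d: Decision): Decision {
  if (d.state !== "draft" || !d.requiresHitl) {
    throw new PreconditionError(`cannot queue decision ${d.id} in state ${d.state}`);
  }
  return { ...d, state: "under_review" };
}

// SubmitReview: the decision must be under_review (role and tenant checks omitted).
function submitReview(d: Decision, verdict: "accepted" | "rejected"): Decision {
  if (d.state !== "under_review") {
    throw new PreconditionError(`cannot review decision ${d.id} in state ${d.state}`);
  }
  return { ...d, state: verdict };
}

// AcceptDraft: allowed when accepted, or when still a draft that needs no HITL.
function canAcceptDraft(d: Decision): boolean {
  return d.state === "accepted" || (d.state === "draft" && !d.requiresHitl);
}
```

Returning a new object from each transition keeps the functions easy to test; persisting the new state and emitting events would happen in the use case that calls them.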

1.2 Queries

| Query | Handler | Returns |
|---|---|---|
| GetDecision | GetDecisionQuery | AIDecision + AIProvenance (reviewer/creator only) |
| ListReviewQueue | ListReviewQueueQuery | Paginated decisions in under_review scoped by reviewer's assigned facilities |
| GetProvenance | GetProvenanceQuery | Immutable provenance record for a provenanceId (called by audit + owning modules) |
| GetTenantQuotaStatus | GetTenantQuotaQuery | Current window counter per feature |
| GetRoutingMatrix | GetRoutingMatrixQuery | Active routing rules for a tenant (admin UI) |

2. Orchestration — assist flow

3. Orchestration — HITL review

4. Ports (application layer)

| Port | Kind | Adapter (infrastructure) |
|---|---|---|
| AIDecisionRepository | Persistence | PostgresAIDecisionAdapter (Drizzle) |
| AIProvenanceRepository | Persistence | PostgresAIProvenanceAdapter |
| ProviderRouter | Outbound | ConfigDrivenProviderRouter + per-provider adapters |
| ModelProvider (per provider) | Outbound | AnthropicAdapter, OpenAIAdapter, AzureOpenAIAdapter, BedrockAdapter, VLLMAdapter, OllamaAdapter, MockAdapter |
| ModerationClient | Outbound | LocalClassifierAdapter (fastText/roberta) + ProviderModerationAdapter |
| PolicyClient | Outbound | AccessPolicyHttpAdapter |
| ConfigClient | Outbound | ConfigServiceHttpAdapter |
| QuotaStore | Outbound | RedisQuotaAdapter (shared multi-instance) |
| EventPublisher | Outbound | NatsJetStreamAdapter |
| ReviewerNotifier | Outbound | CommunicationServiceAdapter |
| Clock | Outbound | SystemClockAdapter |
| Tracer | Outbound | OpenTelemetryAdapter |
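To illustrate the port/adapter split, here is a small sketch using the Clock port, the simplest row in the table. The `now()` method signature, the `FixedClock` test double, and the `stampProvenance` helper are assumptions for illustration; only the names `Clock` and `SystemClockAdapter` come from the table.

```typescript
// Port: the application layer depends on this interface only.
interface Clock {
  now(): Date; // assumed signature
}

// Infrastructure adapter backed by the system clock.
class SystemClockAdapter implements Clock {
  now(): Date {
    return new Date();
  }
}

// Hypothetical test double that always returns the same instant.
class FixedClock implements Clock {
  private readonly at: Date;
  constructor(at: Date) {
    this.at = at;
  }
  now(): Date {
    return new Date(this.at.getTime());
  }
}

// Hypothetical application-layer helper: it receives the port, never a concrete adapter.
function stampProvenance(clock: Clock, provenanceId: string): { provenanceId: string; stampedAt: string } {
  return { provenanceId, stampedAt: clock.now().toISOString() };
}
```

Because `stampProvenance` sees only the port, a test can pass `FixedClock` for deterministic timestamps while production wiring passes `SystemClockAdapter`. The other ports in the table would follow the same shape.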

5. Outbox / inbox patterns

| Pattern | Use |
|---|---|
| Outbox | Assist request → decision persisted + outbox row in same DB transaction → relay publishes to NATS. Guarantees at-least-once delivery. |
| Inbox | Consumers (owning modules) dedupe on eventId before finalising clinical artifact. |
| Saga (light) | HITL lifecycle is not a distributed saga: compensation is not required because drafts carry no side effects until accepted. |
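The interaction between the outbox and inbox rows can be sketched with in-memory stand-ins. Everything here is an assumption for illustration: `InMemoryDb` replaces Postgres, the `publish` callback replaces the NATS relay, and the `crashBeforeMark` flag simulates a relay that dies after publishing but before marking the row, which is the case that makes delivery at-least-once rather than exactly-once.

```typescript
interface OutboxRow {
  eventId: string;
  type: string;
  published: boolean;
}

// Hypothetical stand-in for the database.
class InMemoryDb {
  readonly decisions = new Map<string, string>(); // decisionId -> state
  readonly outbox: OutboxRow[] = [];

  // Stands in for one DB transaction: decision and outbox row are written together.
  persistDecisionWithEvent(decisionId: string, eventId: string, type: string): void {
    this.decisions.set(decisionId, "draft");
    this.outbox.push({ eventId, type, published: false });
  }
}

// Relay pass: publish every unpublished row, marking it only after publish returns.
function relay(db: InMemoryDb, publish: (row: OutboxRow) => void, crashBeforeMark = false): void {
  for (const row of db.outbox) {
    if (row.published) continue;
    publish(row);
    if (crashBeforeMark) return; // simulated crash: the row stays unpublished and is re-sent
    row.published = true;
  }
}

// Consumer-side inbox: finalise at most once per eventId.
class Inbox {
  private readonly seen = new Set<string>();

  handle(row: OutboxRow, finalise: (eventId: string) => void): boolean {
    if (this.seen.has(row.eventId)) return false; // duplicate delivery, skip
    finalise(row.eventId);
    this.seen.add(row.eventId); // recorded only after finalise succeeds
    return true;
  }
}
```

In a real consumer the seen-set would presumably be a table written in the same transaction as the clinical artifact, so that a crash between the two cannot lose or double-apply an event.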

6. Error handling

| Failure | Response | Event |
|---|---|---|
| Policy deny | 403 AI_POLICY_DENY | ai_gateway.assist.failed.v1 reason=POLICY_DENY |
| Quota exceeded | 429 AI_QUOTA_EXCEEDED | ai_gateway.quota.exceeded.v1 |
| Moderation block (input) | 422 AI_MODERATION_BLOCKED | ai_gateway.moderation.flagged.v1 |
| Moderation block (output) | 200 with draftText=null, reason=OUTPUT_BLOCKED | ai_gateway.moderation.flagged.v1 |
| Policy unavailable / timeout | 403 (fail closed) | ai_gateway.assist.failed.v1 reason=POLICY_TIMEOUT |
| Provider unavailable | 503 AI_PROVIDER_UNAVAILABLE (after fallback exhausted) | ai_gateway.provider.degraded.v1 |
| Consent missing (PHI feature) | 403 AI_CONSENT_REQUIRED | ai_gateway.assist.failed.v1 reason=CONSENT_MISSING |
| Cross-tenant ref in payload | 403 CROSS_TENANT | ai_gateway.assist.failed.v1 |
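One way to keep these mappings consistent is a single lookup keyed by failure kind. The sketch below copies the status codes, error codes, and event names from the table; the `FailureKind` keys, the `ErrorMapping` shape, and `failureForPolicy` are hypothetical names. The output-moderation row is left out because it returns 200 rather than an error, and the policy-timeout row has no error code because the table does not give one.

```typescript
// Hypothetical keys; values are taken from the table above.
type FailureKind =
  | "POLICY_DENY"
  | "QUOTA_EXCEEDED"
  | "MODERATION_INPUT"
  | "POLICY_TIMEOUT"
  | "PROVIDER_UNAVAILABLE"
  | "CONSENT_MISSING"
  | "CROSS_TENANT";

interface ErrorMapping {
  status: number;
  code?: string; // absent where the table lists only a status
  event: string;
  eventReason?: string;
}

const ERROR_MAP: Record<FailureKind, ErrorMapping> = {
  POLICY_DENY: { status: 403, code: "AI_POLICY_DENY", event: "ai_gateway.assist.failed.v1", eventReason: "POLICY_DENY" },
  QUOTA_EXCEEDED: { status: 429, code: "AI_QUOTA_EXCEEDED", event: "ai_gateway.quota.exceeded.v1" },
  MODERATION_INPUT: { status: 422, code: "AI_MODERATION_BLOCKED", event: "ai_gateway.moderation.flagged.v1" },
  POLICY_TIMEOUT: { status: 403, event: "ai_gateway.assist.failed.v1", eventReason: "POLICY_TIMEOUT" },
  PROVIDER_UNAVAILABLE: { status: 503, code: "AI_PROVIDER_UNAVAILABLE", event: "ai_gateway.provider.degraded.v1" },
  CONSENT_MISSING: { status: 403, code: "AI_CONSENT_REQUIRED", event: "ai_gateway.assist.failed.v1", eventReason: "CONSENT_MISSING" },
  CROSS_TENANT: { status: 403, code: "CROSS_TENANT", event: "ai_gateway.assist.failed.v1" },
};

// Fail closed: anything other than an explicit allow becomes a failure.
type PolicyResult = "allow" | "deny" | "unavailable";

function failureForPolicy(result: PolicyResult): FailureKind | null {
  if (result === "allow") return null;
  return result === "deny" ? "POLICY_DENY" : "POLICY_TIMEOUT";
}
```

Typing the map as `Record<FailureKind, ErrorMapping>` means the compiler flags a failure kind that has no mapping, which helps when a new row is added to the table.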

7. Transactional boundaries

  • Assist command: single DB transaction writes ai_decision + ai_provenance + outbox row.
  • Quota consumption uses Redis INCR with a window TTL; on provider error a compensating DECR is issued (best-effort).
  • Review submission: single transaction updates ai_decision.state, writes decision_review_event, outbox row.

8. Concurrency

  • Assist requests are stateless and horizontally scalable; the in-flight cap per provider is enforced via bulkheads.
  • Reviewer submissions use optimistic locking on ai_decision.version.
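Both concurrency controls can be sketched in a few lines. The code below is an in-memory illustration, not the service's implementation: `Bulkhead`, `DecisionTable`, and `VersionConflictError` are hypothetical names, and `updateState` stands in for a conditional SQL update of the form `UPDATE ... WHERE id = ? AND version = ?`.

```typescript
// Bulkhead: caps the number of in-flight calls to one provider.
class Bulkhead {
  private inFlight = 0;
  private readonly max: number;
  constructor(max: number) {
    this.max = max;
  }
  tryAcquire(): boolean {
    if (this.inFlight >= this.max) return false; // at capacity: reject instead of queueing
    this.inFlight++;
    return true;
  }
  release(): void {
    if (this.inFlight > 0) this.inFlight--;
  }
}

interface DecisionRow {
  id: string;
  state: string;
  version: number;
}

class VersionConflictError extends Error {}

// In-memory stand-in for the ai_decision table.
class DecisionTable {
  private readonly rows = new Map<string, DecisionRow>();

  insert(row: DecisionRow): void {
    this.rows.set(row.id, { ...row });
  }
  get(id: string): DecisionRow | undefined {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined;
  }
  // Succeeds only if the caller read the current version; bumps the version on success.
  updateState(id: string, expectedVersion: number, newState: string): DecisionRow {
    const current = this.rows.get(id);
    if (!current || current.version !== expectedVersion) {
      throw new VersionConflictError(`stale version for decision ${id}`);
    }
    const next = { ...current, state: newState, version: current.version + 1 };
    this.rows.set(id, next);
    return { ...next };
  }
}
```

If two reviewers load the same decision at version 1, the first submission moves it to version 2 and the second fails with a conflict, so the losing reviewer can reload rather than silently overwrite the first verdict.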

9. Open questions

  • Should HITL reviewers be grouped by specialty or facility? (default: facility + feature).