AI Gateway Service — Application Logic
Status: populated | Owner: TBD | Last updated: 2026-04-17 | Companion: Service Template · 04 event-driven
1. Use cases
1.1 Commands
| Command | Handler | Preconditions | Postconditions |
|---|---|---|---|
| RequestAssist | RequestAssistUseCase | JWT valid, module entitled, quota available, feature flag on, consent recorded (for PHI features) | AIDecision(draft) persisted, AIProvenance stamped, events emitted |
| QueueForReview | QueueForReviewUseCase | Decision is draft and policy requires HITL | State → under_review; notification enqueued |
| SubmitReview | SubmitReviewUseCase | Decision is under_review, caller has reviewer role, tenant match | State → accepted or rejected; audit logged |
| AcceptDraft | AcceptDraftUseCase | Decision is accepted, or is draft with no HITL required; caller is the creating service | Emits decision.accepted with provenanceId to owning module |
| OpenCircuit | OpenCircuitUseCase | Consecutive provider errors above threshold | Routing rule switches to fallback; provider.degraded emitted |
| RegisterPromptTemplate | Admin | platform_admin scope | New PromptTemplate row, semver bumped |
| UpdateRoutingRule | Admin | platform_admin scope; tenant override allowed for tenant_admin | New ProviderRoutingRule active version |
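
The state preconditions in the command table can be read as a small state machine. The sketch below is one hedged way to express them; the `Decision` shape, the `requiresHitl` flag, and the function names are assumptions made for illustration, and the role, tenant, and caller checks from the table are left out.

```typescript
// Hypothetical names throughout; only the state labels come from the command table.
type DecisionState = "draft" | "under_review" | "accepted" | "rejected";

interface Decision {
  id: string;
  state: DecisionState;
  requiresHitl: boolean; // assumed flag standing in for "policy requires HITL"
}

// QueueForReview: draft -> under_review, only when policy requires HITL.
function queueForReview(d: Decision): Decision {
  if (d.state !== "draft" || !d.requiresHitl) {
    throw new Error(`cannot queue decision ${d.id} from state ${d.state}`);
  }
  return { ...d, state: "under_review" };
}

// SubmitReview: under_review -> accepted | rejected.
function submitReview(d: Decision, outcome: "accepted" | "rejected"): Decision {
  if (d.state !== "under_review") {
    throw new Error(`cannot review decision ${d.id} from state ${d.state}`);
  }
  return { ...d, state: outcome };
}

// AcceptDraft precondition: already accepted, or a draft that needs no HITL.
function canAcceptDraft(d: Decision): boolean {
  return d.state === "accepted" || (d.state === "draft" && !d.requiresHitl);
}
```

Under these assumptions, a draft that requires HITL cannot be accepted until it has passed through `queueForReview` and `submitReview`.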
1.2 Queries
| Query | Handler | Returns |
|---|---|---|
| GetDecision | GetDecisionQuery | AIDecision + AIProvenance (reviewer/creator only) |
| ListReviewQueue | ListReviewQueueQuery | Paginated decisions in under_review scoped by reviewer's assigned facilities |
| GetProvenance | GetProvenanceQuery | Immutable provenance record for a provenanceId (called by audit + owning modules) |
| GetTenantQuotaStatus | GetTenantQuotaQuery | Current window counter per feature |
| GetRoutingMatrix | GetRoutingMatrixQuery | Active routing rules for a tenant (admin UI) |
2. Orchestration — assist flow
3. Orchestration — HITL review
4. Ports (application layer)
| Port | Kind | Adapter (infrastructure) |
|---|---|---|
| AIDecisionRepository | Persistence | PostgresAIDecisionAdapter (Drizzle) |
| AIProvenanceRepository | Persistence | PostgresAIProvenanceAdapter |
| ProviderRouter | Outbound | ConfigDrivenProviderRouter + per-provider adapters |
| ModelProvider (per provider) | Outbound | AnthropicAdapter, OpenAIAdapter, AzureOpenAIAdapter, BedrockAdapter, VLLMAdapter, OllamaAdapter, MockAdapter |
| ModerationClient | Outbound | LocalClassifierAdapter (fastText/roberta) + ProviderModerationAdapter |
| PolicyClient | Outbound | AccessPolicyHttpAdapter |
| ConfigClient | Outbound | ConfigServiceHttpAdapter |
| QuotaStore | Outbound | RedisQuotaAdapter (shared multi-instance) |
| EventPublisher | Outbound | NatsJetStreamAdapter |
| ReviewerNotifier | Outbound | CommunicationServiceAdapter |
| Clock | Outbound | SystemClockAdapter |
| Tracer | Outbound | OpenTelemetryAdapter |
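
To illustrate how a use case might depend on these ports rather than on concrete adapters, here is a minimal sketch. The method signatures, the `AcceptDraftSketch` class, and both test doubles are assumptions, not the service's actual interfaces; only the port names and the `decision.accepted` event come from this document.

```typescript
// Ports: interfaces owned by the application layer. Method names are assumptions.
interface Clock {
  now(): Date;
}

interface EventPublisher {
  // A real NATS-backed adapter would be asynchronous; kept synchronous for brevity.
  publish(subject: string, payload: Record<string, unknown>): void;
}

// The use case sees only the ports, never a concrete adapter.
class AcceptDraftSketch {
  private readonly clock: Clock;
  private readonly events: EventPublisher;

  constructor(clock: Clock, events: EventPublisher) {
    this.clock = clock;
    this.events = events;
  }

  execute(decisionId: string, provenanceId: string): void {
    this.events.publish("decision.accepted", {
      decisionId,
      provenanceId,
      acceptedAt: this.clock.now().toISOString(),
    });
  }
}

// Test doubles standing in for SystemClockAdapter and NatsJetStreamAdapter.
class FixedClock implements Clock {
  private readonly instant: Date;
  constructor(instant: Date) { this.instant = instant; }
  now(): Date { return this.instant; }
}

class RecordingPublisher implements EventPublisher {
  readonly published: Array<{ subject: string; payload: Record<string, unknown> }> = [];
  publish(subject: string, payload: Record<string, unknown>): void {
    this.published.push({ subject, payload });
  }
}
```

Because the clock and publisher are injected, the use case can be exercised in a unit test with `FixedClock` and `RecordingPublisher` and no network or wall-clock dependency.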
5. Outbox / inbox patterns
| Pattern | Use |
|---|---|
| Outbox | Assist request → decision persisted + outbox row in same DB transaction → relay publishes to NATS. Guarantees at-least-once delivery. |
| Inbox | Consumers (owning modules) dedupe on eventId before finalising clinical artifact. |
| Saga (light) | HITL lifecycle is not a distributed saga — compensation is not required because drafts carry no side effects until accepted. |
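
The interplay between the outbox relay and the consumer inbox can be sketched as follows. This is an in-memory illustration under stated assumptions: the `OutboxRow` shape, `relayOnce`, and the `Inbox` class are hypothetical, and a real relay would read rows from the database and publish to NATS.

```typescript
// Hypothetical shapes; only eventId-based dedupe and the relay step come from the table.
interface OutboxRow {
  eventId: string;
  subject: string;
  published: boolean;
}

// One relay pass: publish each unpublished row, then mark it. A crash between the
// two steps leaves the row unmarked, so the next pass publishes it again.
function relayOnce(outbox: OutboxRow[], publish: (row: OutboxRow) => void): void {
  for (const row of outbox) {
    if (row.published) continue;
    publish(row);
    row.published = true;
  }
}

// Consumer-side inbox: remember handled eventIds and ignore redeliveries.
class Inbox {
  private readonly seen = new Set<string>();
  readonly finalised: string[] = [];

  handle(row: OutboxRow): void {
    if (this.seen.has(row.eventId)) return; // duplicate delivery, already finalised
    this.seen.add(row.eventId);
    this.finalised.push(row.eventId);
  }
}
```

If the relay fails after publishing but before marking the row, the same event is delivered twice. The inbox check on `eventId` is what keeps the consumer from finalising the artifact twice.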
6. Error handling
| Failure | Response | Event |
|---|---|---|
| Policy deny | 403 AI_POLICY_DENY | ai_gateway.assist.failed.v1 reason=POLICY_DENY |
| Quota exceeded | 429 AI_QUOTA_EXCEEDED | ai_gateway.quota.exceeded.v1 |
| Moderation block (input) | 422 AI_MODERATION_BLOCKED | ai_gateway.moderation.flagged.v1 |
| Moderation block (output) | 200 with draftText=null, reason=OUTPUT_BLOCKED | ai_gateway.moderation.flagged.v1 |
| Policy unavailable / timeout | 403 (fail closed) | ai_gateway.assist.failed.v1 reason=POLICY_TIMEOUT |
| Provider unavailable | 503 AI_PROVIDER_UNAVAILABLE (after fallback exhausted) | ai_gateway.provider.degraded.v1 |
| Consent missing (PHI feature) | 403 AI_CONSENT_REQUIRED | ai_gateway.assist.failed.v1 reason=CONSENT_MISSING |
| Cross-tenant ref in payload | 403 CROSS_TENANT | ai_gateway.assist.failed.v1 |
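
One hedged way to keep this table in a single place in code is a lookup from failure kind to response and event. The `Failure` keys, the `FailureOutcome` shape, and `outcomeFor` are hypothetical; the statuses, codes, event names, and reasons are copied from the table, and fields the table does not specify are left unset.

```typescript
// Failure keys are hypothetical labels; statuses, codes, events, and reasons mirror the table.
type Failure =
  | "policyDeny" | "quotaExceeded" | "moderationInput" | "moderationOutput"
  | "policyTimeout" | "providerUnavailable" | "consentMissing" | "crossTenant";

interface FailureOutcome {
  status: number;
  code?: string;        // error code in the response, where the table names one
  event: string;
  eventReason?: string; // reason attribute on the event, where the table names one
}

const FAILURE_OUTCOMES: Record<Failure, FailureOutcome> = {
  policyDeny: { status: 403, code: "AI_POLICY_DENY", event: "ai_gateway.assist.failed.v1", eventReason: "POLICY_DENY" },
  quotaExceeded: { status: 429, code: "AI_QUOTA_EXCEEDED", event: "ai_gateway.quota.exceeded.v1" },
  moderationInput: { status: 422, code: "AI_MODERATION_BLOCKED", event: "ai_gateway.moderation.flagged.v1" },
  // Output blocks still return 200; the body carries draftText=null and reason=OUTPUT_BLOCKED.
  moderationOutput: { status: 200, event: "ai_gateway.moderation.flagged.v1" },
  // Fail closed: an unreachable policy service is treated as a deny.
  policyTimeout: { status: 403, event: "ai_gateway.assist.failed.v1", eventReason: "POLICY_TIMEOUT" },
  providerUnavailable: { status: 503, code: "AI_PROVIDER_UNAVAILABLE", event: "ai_gateway.provider.degraded.v1" },
  consentMissing: { status: 403, code: "AI_CONSENT_REQUIRED", event: "ai_gateway.assist.failed.v1", eventReason: "CONSENT_MISSING" },
  crossTenant: { status: 403, code: "CROSS_TENANT", event: "ai_gateway.assist.failed.v1" },
};

function outcomeFor(failure: Failure): FailureOutcome {
  return FAILURE_OUTCOMES[failure];
}
```

Typing the map as `Record<Failure, FailureOutcome>` makes the compiler reject a new failure kind that has no mapped response.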
7. Transactional boundaries
- Assist command: single DB transaction writes `ai_decision` + `ai_provenance` + `outbox` row.
- Quota consume uses Redis `INCR` with window TTL; compensating `DECR` on provider error (best-effort).
- Review submission: single transaction updates `ai_decision.state`, writes `decision_review_event`, outbox row.
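
The quota step can be sketched as below. This is a minimal illustration, not the `RedisQuotaAdapter` itself: `CounterStore`, `InMemoryCounterStore`, and `withQuota` are hypothetical names, the store is synchronous and in-memory so the example runs on its own, and returning the unit on a quota rejection is an assumption of this sketch.

```typescript
// Minimal counter port mirroring the Redis commands INCR, DECR, and EXPIRE.
// Synchronous for brevity; a real Redis adapter would return promises.
interface CounterStore {
  incr(key: string): number; // returns the value after incrementing, like Redis INCR
  decr(key: string): number;
  expire(key: string, seconds: number): void;
}

class InMemoryCounterStore implements CounterStore {
  readonly counts = new Map<string, number>();
  readonly ttls = new Map<string, number>();
  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
  decr(key: string): number {
    const next = (this.counts.get(key) ?? 0) - 1;
    this.counts.set(key, next);
    return next;
  }
  expire(key: string, seconds: number): void {
    this.ttls.set(key, seconds); // recorded only; no timers in this sketch
  }
}

// Consume one unit, call the provider, and give the unit back if the provider fails.
function withQuota<T>(
  store: CounterStore, key: string, limit: number, windowSeconds: number, callProvider: () => T,
): T {
  const used = store.incr(key);
  if (used === 1) store.expire(key, windowSeconds); // first hit opens the window
  if (used > limit) {
    store.decr(key); // assumption: rejected attempts do not count against the window
    throw new Error("AI_QUOTA_EXCEEDED");
  }
  try {
    return callProvider();
  } catch (err) {
    try { store.decr(key); } catch { /* best-effort: ignore compensation failure */ }
    throw err;
  }
}
```

One reason the compensation can only be best-effort: if the window key expires between the `INCR` and the `DECR`, Redis treats the missing key as 0 and the decrement leaves it at -1 with no TTL. A production adapter would likely need to guard against that case.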
8. Concurrency
- Assist requests are stateless and horizontally scalable; the in-flight cap per provider is enforced via bulkheads.
- Reviewer submissions use optimistic locking on `ai_decision.version`.
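
Optimistic locking on a version column can be sketched as a conditional update that only succeeds when the version the caller read is still current. The in-memory table below stands in for the database; `InMemoryDecisionTable`, `updateIfVersion`, and the `id` column are assumptions made for illustration.

```typescript
interface DecisionRow {
  id: string;
  state: string;
  version: number;
}

// In-memory stand-in for a conditional update such as:
//   UPDATE ai_decision SET state = $1, version = version + 1
//   WHERE id = $2 AND version = $3
class InMemoryDecisionTable {
  private readonly rows = new Map<string, DecisionRow>();

  insert(row: DecisionRow): void {
    this.rows.set(row.id, { ...row });
  }

  get(id: string): DecisionRow | undefined {
    const row = this.rows.get(id);
    return row ? { ...row } : undefined; // return a copy, like a fresh SELECT
  }

  // Returns true when the row matched, false on a missing row or version conflict.
  updateIfVersion(id: string, expectedVersion: number, state: string): boolean {
    const row = this.rows.get(id);
    if (!row || row.version !== expectedVersion) return false;
    this.rows.set(id, { id, state, version: expectedVersion + 1 });
    return true;
  }
}
```

If two reviewers both read version 1 and submit, the first update bumps the row to version 2 and the second matches no row. The second caller would then typically reload the decision and report a conflict instead of overwriting the first review.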
9. Open questions
- Should HITL reviewers be grouped by specialty or facility? (default: facility + feature).