AI Gateway Service — Migration Plan
Status: populated
Owner: TBD
Last updated: 2026-04-18
Companion: Service Template · 03 platform-services
1. Migration context
The AI Gateway Service replaces a lightweight "assist-only orchestrator" described in _sources/ai-orchestrator/. The legacy orchestrator had no persistent state, no HITL, no multi-provider routing, and no moderation. Migration involves:
- Replacing inline model-client code in consumer services with calls to the gateway.
- Backfilling AIProvenance records for AI-assisted content accepted before the gateway existed.
- Retiring direct model API keys held by consumer services.
2. Legacy state inventory
| Legacy artifact | Location | Migration action |
|---|---|---|
| Anthropic SDK calls in clinical-notes module | _sources/clinical-notes/ adapters | Replace with POST /v1/ai/assist call |
| Prompt strings inlined in service code | Various consumer services | Extract to PromptTemplate rows seeded at M0 |
| Direct API keys in consumer .env files | CI/CD secrets | Revoke after gateway goes live; gateway uses vault-sourced keys |
| Mock provider stubs in integration tests | Consumer test suites | Swap to gateway mock adapter (provider=mock) |
| Legacy FR-AI-002..006, FR-NFR-015..018 | _sources/ai-orchestrator/ specs | Preserved in legacy FR column; mapped to FR-AIGW-* |
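The first inventory row, replacing in-process SDK calls with a POST /v1/ai/assist call, might look roughly like this on the consumer side. This is a minimal sketch under stated assumptions: the request fields (`featureKey`, `input`), the `output` response field, the bearer-token header, and the injectable `transport` hook are all hypothetical. Only the path and the returned `provenanceId` come from this plan.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable

# Hypothetical hook: takes a prepared request, returns the raw response body.
Transport = Callable[[urllib.request.Request], bytes]


def _urlopen_transport(req: urllib.request.Request) -> bytes:
    """Default transport: a real HTTP call via the standard library."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


@dataclass
class AssistResult:
    text: str
    provenance_id: str


class AIGatewayHttpAdapter:
    """Consumer-side gateway client. Payload field names are assumptions."""

    def __init__(self, base_url: str, token: str,
                 transport: Transport = _urlopen_transport):
        self._url = base_url.rstrip("/") + "/v1/ai/assist"
        self._token = token
        self._transport = transport

    def assist(self, feature_key: str, input_text: str) -> AssistResult:
        body = json.dumps(
            {"featureKey": feature_key, "input": input_text}
        ).encode("utf-8")
        req = urllib.request.Request(
            self._url,
            data=body,
            method="POST",
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {self._token}",
            },
        )
        payload = json.loads(self._transport(req))
        # The caller is expected to persist provenance_id next to the content.
        return AssistResult(text=payload["output"],
                            provenance_id=payload["provenanceId"])
```

Keeping the transport injectable lets consumer test suites exercise the adapter without network access, which fits the plan's move from per-service provider stubs to a single mock path.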
3. Phase plan
4. Per-phase migration tasks
Phase 0 — Infrastructure (M0)
| Task | Owner | Done when |
|---|---|---|
| Deploy gateway service behind Kong (/v1/ai/*) | Platform SRE | Routes return 200 from health endpoint |
| Seed global ProviderRoutingRule (Anthropic primary, mock fallback) | Platform Eng | Rule visible in admin UI |
| Seed baseline PromptTemplate rows for patient_chart.note_summary, portal.triage, virtual_care.soap_scaffold | Platform Eng + Clinical Informatics | Templates published, hash logged |
| Configure Keycloak scope ai.assist on consumer service accounts | Identity team | Services can authenticate |
| Gateway observability live (OTEL traces, NATS events to audit-service) | SRE | Grafana dashboard green |
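The "Templates published, hash logged" exit criterion implies a stable content hash for each PromptTemplate row. The plan does not say how that hash is computed, so the following is only one possible approach: SHA-256 over a canonical JSON encoding. The row fields (`feature_key`, `version`, `body`), the logger name, and the choice of hash are assumptions.

```python
import hashlib
import json
import logging
from dataclasses import asdict, dataclass

logger = logging.getLogger("prompt_seed")  # hypothetical logger name


@dataclass(frozen=True)
class PromptTemplate:
    # Hypothetical row shape; the real schema is not given in this plan.
    feature_key: str
    version: int
    body: str


def template_hash(template: PromptTemplate) -> str:
    """SHA-256 of a canonical JSON form, so that key order and JSON
    formatting cannot change the hash for identical content."""
    canonical = json.dumps(
        asdict(template),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seed(templates: list[PromptTemplate]) -> dict[str, str]:
    """Return {feature_key: hash} and log each hash at seed time."""
    hashes: dict[str, str] = {}
    for t in templates:
        digest = template_hash(t)
        logger.info("seeded template %s v%d hash=%s",
                    t.feature_key, t.version, digest)
        hashes[t.feature_key] = digest
    return hashes
```

Logging the hash at seed time gives later audits a way to confirm which template text produced a given output, provided the same hash is also recorded with each request.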
Phase 1 — Consumer service cutover (M1)
| Task | Notes |
|---|---|
| patient-chart-service: replace AnthropicAdapter with AIGatewayHttpAdapter | Use POST /v1/ai/assist; store returned provenanceId in NoteAIProvenance row |
| medication-service: cut over med-reconciliation assist | Feature key medication.reconciliation_assist |
| virtual-care-service: cut over encounter-summary assist | Feature key virtual_care.soap_scaffold |
| Revoke consumer-service Anthropic keys in CI/CD vault | Coordinate with security team |
| Run parallel shadow verification for 1 week | Gate on P95 latency parity |
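The shadow-verification gate on P95 latency parity can be made concrete with a small check. This sketch uses the nearest-rank percentile and treats "parity" as the gateway P95 being no more than 10% above the legacy P95. Both the percentile method and the 10% tolerance are assumptions, since the plan does not define them.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("no latency samples")
    ordered = sorted(samples_ms)
    # 1-based rank = ceil(0.95 * n), computed in integers to avoid float drift.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_parity(legacy_ms: list[float], gateway_ms: list[float],
                   tolerance: float = 0.10) -> bool:
    """True when the gateway P95 is within `tolerance` of the legacy P95.

    `tolerance` is a hypothetical threshold, not a value from the plan.
    """
    return p95(gateway_ms) <= p95(legacy_ms) * (1.0 + tolerance)
```

Samples for both paths should come from the same shadowed requests over the one-week window, otherwise differences in traffic mix can mask or exaggerate the gateway's overhead.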
Phase 2 — HITL + Moderation (M2)
| Task | Notes |
|---|---|
| Deploy moderation classifier (FastText/RoBERTa container) | Required for patient_chart.note_summary |
| Set HITLPolicy=required for patient_chart.note_summary feature key | Clinical Informatics approval required |
| Configure reviewer queue assignments (facility + specialty) | provider-directory-service integration |
| Backfill AIProvenance stubs for pre-gateway AI-assist content | One-time migration script; inserts with legacy_backfill=true flag |
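The one-time backfill could be structured as an idempotent script, so that a partial run can be repeated safely. The sketch below uses SQLite only to stay self-contained. The table and column names (`notes`, `ai_assisted`, `provenance_id`, `ai_provenance`) are hypothetical stand-ins for the real schema; only the `legacy_backfill` flag comes from the plan.

```python
import sqlite3
import uuid

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE notes (
    id TEXT PRIMARY KEY,
    ai_assisted INTEGER NOT NULL,
    provenance_id TEXT
);
CREATE TABLE ai_provenance (
    id TEXT PRIMARY KEY,
    note_id TEXT NOT NULL,
    legacy_backfill INTEGER NOT NULL DEFAULT 0
);
"""


def backfill_provenance(conn: sqlite3.Connection) -> int:
    """Insert one stub per AI-assisted note that has no provenance yet.

    Returns the number of stubs created. Re-running is a no-op because
    each processed note gets its provenance_id set.
    """
    rows = conn.execute(
        "SELECT id FROM notes "
        "WHERE ai_assisted = 1 AND provenance_id IS NULL"
    ).fetchall()
    with conn:  # single transaction: commit on success, roll back on error
        for (note_id,) in rows:
            prov_id = str(uuid.uuid4())
            conn.execute(
                "INSERT INTO ai_provenance (id, note_id, legacy_backfill) "
                "VALUES (?, ?, 1)",
                (prov_id, note_id),
            )
            conn.execute(
                "UPDATE notes SET provenance_id = ? WHERE id = ?",
                (prov_id, note_id),
            )
    return len(rows)
```

Running the whole batch in one transaction is reasonable at the estimated scale of a few thousand notes; a larger backfill would likely need chunked commits.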
Phase 3 — Multi-provider + Quota (M3)
| Task | Notes |
|---|---|
| Add AzureOpenAIAdapter with AF-residency routing | Data-residency DPIA must be signed first |
| Add BedrockAdapter for radiology pre-read feature | Radiology team approval required |
| Add VLLMAdapter for on-prem deployment (offline clinics) | Requires on-prem inference server; tracked separately |
| Enable per-tenant quota enforcement via Redis | Config: quota.enabled=true per tenant |
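Per-tenant quota enforcement via Redis is commonly built on a fixed-window counter using INCR and EXPIRE. The sketch below assumes that approach; the plan does not specify the algorithm. The key format, window length, and limit are hypothetical, and the `enabled` argument stands in for the per-tenant quota.enabled setting. The store is typed as a small protocol so the example runs without a Redis server; a redis-py client provides `incr` and `expire` methods that can be called the same way.

```python
import time
from typing import Optional, Protocol


class CounterStore(Protocol):
    """The two operations needed from Redis (INCR and EXPIRE)."""

    def incr(self, name: str, /) -> int: ...
    def expire(self, name: str, seconds: int, /) -> object: ...


def allow_request(store: CounterStore, tenant_id: str, limit: int,
                  window_s: int = 60, enabled: bool = True,
                  now: Optional[float] = None) -> bool:
    """Fixed-window quota check: at most `limit` requests per window."""
    if not enabled:  # quota.enabled=false for this tenant
        return True
    current = time.time() if now is None else now
    window = int(current // window_s)
    key = f"aigw:quota:{tenant_id}:{window}"  # hypothetical key format
    count = store.incr(key)
    if count == 1:
        # First hit in this window: let the key expire once it is stale.
        store.expire(key, window_s * 2)
    return count <= limit
```

A fixed window allows short bursts of up to twice the limit across a window boundary; if that matters for provider cost control, a sliding window or token bucket would be the usual alternative.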
5. Rollback plan
| Phase | Rollback action | Risk |
|---|---|---|
| Phase 1 | Re-enable direct Anthropic key in consumer service env; deploy previous version | Low — keys retained in vault for 30 days post-cutover |
| Phase 2 | Disable moderation / HITL via feature flag; gateway passes through without review | Medium — compliance window opens; log and alert |
| Phase 3 | Route back to primary Anthropic provider; disable multi-provider | Low — routing rules versioned |
6. Tenant migration notes
- Existing tenants receive global routing rules at Phase 0; tenant-specific overrides can be added via POST /v1/admin/routing-rules.
- No patient-facing data migration is required; the gateway is a runtime pass-through.
- Tenants with DPIA for on-prem AI must be manually mapped to the ON_PREM residency routing rule.
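The interaction between the global rule, tenant overrides, and the manual ON_PREM mapping can be sketched as a resolution function. Everything here is illustrative: the `RoutingRule` fields, the provider identifiers, and the decision to fail closed for an unmapped on-prem tenant are assumptions, not the gateway's documented behavior. Only the Anthropic-primary, mock-fallback global rule and the ON_PREM residency label come from this plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoutingRule:
    # Hypothetical shape of a routing rule.
    primary: str
    fallback: Optional[str]
    residency: Optional[str] = None


# Global rule seeded at Phase 0 (Anthropic primary, mock fallback).
GLOBAL_RULE = RoutingRule(primary="anthropic", fallback="mock")


def resolve_rule(tenant_id: str,
                 overrides: dict[str, RoutingRule],
                 on_prem_tenants: set[str]) -> RoutingRule:
    """Pick the tenant override if present, else the global rule.

    Tenants with an on-prem DPIA never fall back to the global rule:
    if they have not been mapped to an ON_PREM rule, resolution fails.
    """
    rule = overrides.get(tenant_id)
    if tenant_id in on_prem_tenants:
        if rule is None or rule.residency != "ON_PREM":
            raise LookupError(
                f"tenant {tenant_id} requires an ON_PREM routing rule"
            )
        return rule
    return rule if rule is not None else GLOBAL_RULE
```

Failing closed means a missed manual mapping surfaces as an error rather than as traffic silently leaving the premises.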
7. Open questions
- Confirm backfill scope for pre-gateway AI-assisted notes (~5,000 notes across 3 pilot tenants estimated).
- DPIA approval timeline for Azure OpenAI (AF data-residency path).