Terminology Service — Migration Plan
Status: populated | Owner: TBD | Last updated: 2026-04-18 | Companion: Service Template · 03 platform-services
1. Migration Context
The terminology-service is a new shared platform service. Migrations cover:
- Initial licensed terminology data load (all tenants share global data).
- Tenant-scoped custom concept migration from legacy coded-value lists.
- Drug interaction / CDS data migration from any prior clinical decision support component.
2. Initial Licensed Terminology Load (ETL)
All new deployments must run the terminology ETL pipeline before the service is ready to accept traffic. The ETL runs as a Kubernetes Job.
| Phase | Action | Tool |
|---|---|---|
| 1. Data procurement | Obtain licensed data files (LOINC CSV, SNOMED RF2, RxNorm RRF) from respective sources | Manual / license portal |
| 2. Data staging | Upload to secured object storage bucket (terminology-etl-input/{env}/) | AWS S3 / compatible |
| 3. Schema migration | Run Drizzle migrations: pnpm db:migrate | CI/CD pipeline |
| 4. ETL import | Execute Kubernetes ETL jobs in order: LOINC → SNOMED → RxNorm → ICD-10 | kubectl apply -f kubernetes/jobs/terminology-etl/ |
| 5. Drug data import | Load drug interaction, drug class, and contraindication data from licensed clinical knowledge source | ETL job: rxnorm-cds-import.job.yaml |
| 6. Value set seeding | Seed standard FHIR value sets (obs-interpretation, condition-category, etc.) | pnpm db:seed:valuesets |
| 7. Readiness verification | Confirm that GET /health returns terminology_data: loaded, then spot-check $lookup for known codes | Manual / smoke test |
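The readiness check in phase 7 could be scripted along the following lines. This is a minimal sketch, not the service's actual smoke test: the `HealthBody` and `LookupSpotCheck` shapes and both function names are hypothetical, and only the `terminology_data: loaded` value comes from the plan above.

```typescript
// Assumed shape of the GET /health response body. Only the terminology_data
// field and its "loaded" value come from the plan; the rest is hypothetical.
interface HealthBody {
  terminology_data?: unknown;
}

// True only when the health payload reports the terminology data as loaded.
function isTerminologyLoaded(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  return (body as HealthBody).terminology_data === "loaded";
}

// Hypothetical record of one $lookup spot check: the code that was requested,
// the display we expect, and the display the service returned (if any).
interface LookupSpotCheck {
  system: string;
  code: string;
  expectedDisplay: string;
  actualDisplay?: string;
}

// Returns the spot checks whose returned display does not match the expectation.
function failedSpotChecks(checks: LookupSpotCheck[]): LookupSpotCheck[] {
  return checks.filter((c) => c.actualDisplay !== c.expectedDisplay);
}
```

A deployment gate would fetch /health and a handful of $lookup responses, feed them to these two functions, and refuse to route traffic unless the first returns true and the second returns an empty list.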
3. Tenant Custom Concept Migration
For tenants with existing facility-specific coded value lists:
| Step | Action |
|---|---|
| Export | Export legacy custom codes to CSV: id, system, code, display, definition |
| Validate | Run scripts/migration/terminology/validate-custom-concepts.ts — checks for duplicates and invalid system URIs |
| Import | POST /internal/terminology/import with system = urn:ghasi:tenant:{tenantId} |
| Verify | Spot-check GET /v1/terminology/search?system=urn:ghasi:tenant:{tenantId}&query=... |
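The Validate step checks for duplicates and invalid system URIs. The contents of validate-custom-concepts.ts are not shown in this plan, so the following is only a sketch of what such a check might look like. It assumes duplicates are detected on (system, code) and that a system URI is valid when it has a scheme and a non-empty remainder. The type and function names are hypothetical.

```typescript
// One row of the legacy export; columns follow the CSV header in the table above.
interface CustomConceptRow {
  id: string;
  system: string;
  code: string;
  display: string;
  definition: string;
}

interface ValidationIssue {
  rowId: string;
  reason: "duplicate" | "invalid-system-uri";
}

// Assumption: "valid" means a URI scheme followed by a non-empty remainder
// (for example urn:... or http://...). The real script may be stricter.
const URI_WITH_SCHEME = /^[A-Za-z][A-Za-z0-9+.-]*:\S+$/;

function validateCustomConcepts(rows: CustomConceptRow[]): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const seen = new Set<string>();
  for (const row of rows) {
    if (!URI_WITH_SCHEME.test(row.system)) {
      issues.push({ rowId: row.id, reason: "invalid-system-uri" });
    }
    // Assumption: a duplicate is a repeated (system, code) pair; the first
    // occurrence is kept and every later one is reported.
    const key = JSON.stringify([row.system, row.code]);
    if (seen.has(key)) {
      issues.push({ rowId: row.id, reason: "duplicate" });
    }
    seen.add(key);
  }
  return issues;
}
```

Running the import only when the returned list is empty keeps bad rows out of the POST /internal/terminology/import call.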
Migration script: scripts/migration/terminology/import-tenant-concepts.ts
The import is idempotent (upsert on tenant_id + system + code). Pre-existing global concepts with the same code are not overwritten.
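To illustrate why re-running the import is safe, here is a small in-memory sketch of an upsert keyed on tenant_id + system + code. It stands in for the real database upsert, which this plan does not show. The `Concept` shape, the use of `null` for global concepts, and the function names are assumptions made for the example.

```typescript
// Hypothetical concept record. Assumption: global concepts carry a null tenantId.
interface Concept {
  tenantId: string | null;
  system: string;
  code: string;
  display: string;
}

// In-memory stand-in for the concept table, keyed by the upsert key.
type ConceptStore = Map<string, Concept>;

// Builds the composite key (tenant_id, system, code) used for the upsert.
function conceptKey(c: Pick<Concept, "tenantId" | "system" | "code">): string {
  return JSON.stringify([c.tenantId, c.system, c.code]);
}

// Inserts or replaces tenant concepts under the tenant's own system URI.
function upsertTenantConcepts(
  store: ConceptStore,
  tenantId: string,
  rows: Array<{ code: string; display: string }>,
): void {
  const system = `urn:ghasi:tenant:${tenantId}`;
  for (const row of rows) {
    const concept: Concept = { tenantId, system, code: row.code, display: row.display };
    // Same key on a re-run means the existing entry is replaced, not duplicated.
    store.set(conceptKey(concept), concept);
  }
}
```

Because the tenant ID and the tenant-specific system URI are part of the key, a tenant code that happens to match a global code lands in a separate entry and the global concept is left untouched.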
4. Drug Interaction Data Migration
If a prior clinical system had drug interaction data in a proprietary schema, migrate it as follows:
- Export interactions to CSV: drug1_code, drug2_code, severity, description
- Normalize: ensure drug1_code <= drug2_code (canonical pair ordering).
- Import via ETL: pnpm db:seed:interactions --file interactions.csv
- Validate severity coverage for CONTRAINDICATED and HIGH pairs with a known test set.
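The normalization and coverage steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's ETL code: the function names are hypothetical, and the code comparison uses plain JavaScript string ordering, which must match whatever ordering the database applies to drug1_code <= drug2_code.

```typescript
// One row of the interactions CSV; columns follow the export step above.
interface InteractionRow {
  drug1_code: string;
  drug2_code: string;
  severity: string;
  description: string;
}

// Returns the row with its codes in canonical order (drug1_code <= drug2_code).
// Assumption: codes compare as strings, so "10" sorts before "9".
function normalizePair(row: InteractionRow): InteractionRow {
  if (row.drug1_code <= row.drug2_code) return row;
  return { ...row, drug1_code: row.drug2_code, drug2_code: row.drug1_code };
}

function pairKey(row: InteractionRow): string {
  return JSON.stringify([row.drug1_code, row.drug2_code]);
}

// Returns the CONTRAINDICATED and HIGH pairs from the known test set that are
// missing from the imported data or were imported with a different severity.
function missingSevereCoverage(
  imported: InteractionRow[],
  expected: InteractionRow[],
): InteractionRow[] {
  const severityByPair = new Map<string, string>();
  for (const row of imported.map(normalizePair)) {
    severityByPair.set(pairKey(row), row.severity);
  }
  return expected
    .map(normalizePair)
    .filter((e) => e.severity === "CONTRAINDICATED" || e.severity === "HIGH")
    .filter((e) => severityByPair.get(pairKey(e)) !== e.severity);
}
```

Normalizing both the imported rows and the test set before comparing means an A/B pair in one file matches a B/A pair in the other.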
5. Rolling Terminology Updates
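After initial deployment, terminology datasets receive periodic updates (LOINC is typically released twice a year; SNOMED CT cadence depends on the edition, ranging from monthly for the International Edition to twice yearly for some national editions). The update process: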
- Upload new licensed data files to staging bucket.
- Re-run the relevant ETL Kubernetes Job.
- ETL performs upsert (no concept deletions — deactivation only).
- TERMINOLOGY.dataset.updated event is published; downstream consumers may invalidate cached concept data.
- Monitor concept count metrics post-import; verify no unexpected drops.
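Since the ETL only upserts and deactivates, the total number of concept rows per code system should never decrease across an update. A post-import check along these lines could flag a drop. This is a sketch under assumptions: the per-system count maps are gathered elsewhere (for example from the metrics mentioned above), and the names are hypothetical.

```typescript
// Concept row counts keyed by code system URI, captured before and after an import.
type ConceptCounts = Record<string, number>;

interface CountDrop {
  system: string;
  before: number;
  after: number;
}

// Returns every code system whose concept count went down. A system that is
// missing from the "after" snapshot is treated as a drop to zero.
function findCountDrops(before: ConceptCounts, after: ConceptCounts): CountDrop[] {
  const drops: CountDrop[] = [];
  for (const [system, beforeCount] of Object.entries(before)) {
    const afterCount = after[system] ?? 0;
    if (afterCount < beforeCount) {
      drops.push({ system, before: beforeCount, after: afterCount });
    }
  }
  return drops;
}
```

If the metric counts only active concepts, some decrease is expected whenever a release deactivates codes, so the comparison would need a tolerance rather than a strict check.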
6. Rollback Plan
| Scenario | Rollback action |
|---|---|
| ETL import corrupts concept data | Restore PostgreSQL from pre-import snapshot; re-run corrected ETL |
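| Schema migration failure | Run pnpm db:migrate:down to return to the previous schema version. Drizzle Kit generates forward-only migrations, so confirm that a hand-written reverse migration exists for the failed version before relying on this script |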
| New version causes lookup regressions | Roll back Kubernetes deployment to previous image; ETL data is independent of service version |
ETL jobs run in a single transaction where possible (PostgreSQL COPY, with ROLLBACK on error). For those jobs, an import that aborts partway leaves no orphaned records.