Terminology Service — Migration Plan

Status: populated | Owner: TBD | Last updated: 2026-04-18 | Companion: Service Template · 03 platform-services

1. Migration Context

The terminology-service is a new shared platform service. Migrations cover:

- Initial licensed terminology data load (all tenants share global data).
- Tenant-scoped custom concept migration from legacy coded-value lists.
- Drug interaction / CDS data migration from any prior clinical decision support component.

Every new deployment must run the terminology ETL pipeline before the service is ready to accept traffic. The ETL runs as Kubernetes Jobs.

Phase	Action	Tool
1. Data procurement	Obtain licensed data files (LOINC CSV, SNOMED RF2, RxNorm RRF) from respective sources	Manual / license portal
2. Data staging	Upload to secured object storage bucket (`terminology-etl-input/{env}/`)	AWS S3 / compatible
3. Schema migration	Run Drizzle migrations: `pnpm db:migrate`	CI/CD pipeline
4. ETL import	Execute Kubernetes ETL jobs in order: LOINC → SNOMED → RxNorm → ICD-10	`kubectl apply -f kubernetes/jobs/terminology-etl/`
5. Drug data import	Load drug interaction, drug class, and contraindication data from licensed clinical knowledge source	ETL job: `rxnorm-cds-import.job.yaml`
6. Value set seeding	Seed standard FHIR value sets (obs-interpretation, condition-category, etc.)	`pnpm db:seed:valuesets`
7. Readiness verification	Check `GET /health` returns `terminology_data: loaded`; spot-check `$lookup` for known codes	Manual / smoke test
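Phase 4 depends on job order, but `kubectl apply -f` on a directory submits every manifest in it without waiting for any Job to finish. If the manifests do not already encode the ordering, one option is to apply each manifest in turn and block on `kubectl wait --for=condition=complete` before the next. The sketch below only builds that command list and adds a small check for phase 7. The Job names, the manifest file names, the timeout, and the JSON shape of the `/health` body are all assumptions made for illustration.

```typescript
// Hypothetical Job descriptors; the real manifests live under kubernetes/jobs/terminology-etl/.
interface EtlJob {
  name: string;     // Kubernetes Job name (assumed)
  manifest: string; // manifest path (assumed file name)
}

const ETL_DIR = "kubernetes/jobs/terminology-etl";

// Order taken from the plan: LOINC, then SNOMED, then RxNorm, then ICD-10.
const ETL_JOBS: EtlJob[] = [
  { name: "loinc-import", manifest: `${ETL_DIR}/loinc-import.job.yaml` },
  { name: "snomed-import", manifest: `${ETL_DIR}/snomed-import.job.yaml` },
  { name: "rxnorm-import", manifest: `${ETL_DIR}/rxnorm-import.job.yaml` },
  { name: "icd10-import", manifest: `${ETL_DIR}/icd10-import.job.yaml` },
];

// Emit an "apply, then wait" pair per Job so each one completes before the next starts.
function buildEtlCommands(jobs: EtlJob[], timeout: string = "2h"): string[] {
  const commands: string[] = [];
  for (const job of jobs) {
    commands.push(`kubectl apply -f ${job.manifest}`);
    commands.push(
      `kubectl wait --for=condition=complete job/${job.name} --timeout=${timeout}`
    );
  }
  return commands;
}

// Phase 7: interpret an already-parsed /health body (shape is an assumption).
function isTerminologyReady(health: { terminology_data?: string }): boolean {
  return health.terminology_data === "loaded";
}

console.log(buildEtlCommands(ETL_JOBS).join("\n"));
```

Generating the commands rather than executing them keeps the sketch free of side effects; a CI step could run the list line by line and stop at the first failure.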

For tenants with existing facility-specific coded value lists:

Step	Action
Export	Export legacy custom codes to CSV: `id, system, code, display, definition`
Validate	Run `scripts/migration/terminology/validate-custom-concepts.ts` — checks for duplicates and invalid system URIs
Import	`POST /internal/terminology/import` with `system = urn:ghasi:tenant:{tenantId}`
Verify	Spot-check `GET /v1/terminology/search?system=urn:ghasi:tenant:{tenantId}&query=...`
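To make the Validate step concrete, here is a minimal sketch of the two checks it names: duplicate detection and system URI validation. It is not the content of `validate-custom-concepts.ts`. The duplicate rule (same `system` and `code`) and the deliberately loose URI pattern (a scheme followed by non-whitespace characters) are assumptions.

```typescript
// One row of the legacy export: id, system, code, display, definition.
interface CustomConceptRow {
  id: string;
  system: string;
  code: string;
  display: string;
  definition: string;
}

interface ValidationIssue {
  rowId: string;
  problem: "duplicate" | "invalid-system-uri";
}

// Loose check (assumption): a URI scheme, a colon, then at least one non-space character.
const URI_PATTERN = /^[A-Za-z][A-Za-z0-9+.-]*:\S+$/;

function validateCustomConcepts(rows: CustomConceptRow[]): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const seen = new Set<string>();
  for (const row of rows) {
    if (!URI_PATTERN.test(row.system)) {
      issues.push({ rowId: row.id, problem: "invalid-system-uri" });
    }
    // Treat rows with the same system and code as duplicates (assumed rule).
    const key = JSON.stringify([row.system, row.code]);
    if (seen.has(key)) {
      issues.push({ rowId: row.id, problem: "duplicate" });
    }
    seen.add(key);
  }
  return issues;
}
```

Returning a list of issues instead of throwing on the first one lets the operator fix the whole export in a single pass.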

Migration script: `scripts/migration/terminology/import-tenant-concepts.ts`

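The import is idempotent: it upserts on `tenant_id + system + code`, so re-running it with the same file changes nothing. Pre-existing global concepts that share a code with a tenant concept are not overwritten, because the upsert key also includes the tenant and the system.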
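The upsert behaviour can be modelled in memory to show why a re-run is harmless and why global concepts are left alone. This is an illustrative model, not the service's persistence code. Representing a global concept with a `null` tenant id is an assumption, and the codes and display strings are placeholders.

```typescript
interface Concept {
  tenantId: string | null; // null marks a global concept in this model (assumption)
  system: string;
  code: string;
  display: string;
}

// Composite key mirroring the upsert target: tenant_id + system + code.
function conceptKey(c: Pick<Concept, "tenantId" | "system" | "code">): string {
  return JSON.stringify([c.tenantId, c.system, c.code]);
}

// Insert new concepts and replace existing ones that have the same key.
function upsertConcepts(store: Map<string, Concept>, incoming: Concept[]): void {
  for (const concept of incoming) {
    store.set(conceptKey(concept), { ...concept });
  }
}

// A global concept and a tenant concept that happen to share the code "X1".
const store = new Map<string, Concept>();
upsertConcepts(store, [
  { tenantId: null, system: "http://loinc.org", code: "X1", display: "Global X1" },
]);
const tenantBatch: Concept[] = [
  { tenantId: "t1", system: "urn:ghasi:tenant:t1", code: "X1", display: "Tenant X1" },
];
upsertConcepts(store, tenantBatch);
upsertConcepts(store, tenantBatch); // second run: same keys, no new rows
console.log(store.size);
```

Both runs of the tenant batch write to the same key, and that key differs from the global concept's key, so the store ends with exactly two entries.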

If a prior clinical system had drug interaction data in a proprietary schema, migrate it as follows:

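1. Export interactions to CSV with the columns `drug1_code, drug2_code, severity, description`.
2. Normalize each row so that `drug1_code <= drug2_code` (canonical pair ordering), which stores every pair once regardless of the order in which the legacy system recorded it.
3. Import via ETL: `pnpm db:seed:interactions --file interactions.csv`
4. Validate severity coverage by checking the imported `CONTRAINDICATED` and `HIGH` pairs against a known test set.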
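The normalization and coverage steps can be sketched as two small functions. Plain string comparison for the `<=` ordering is an assumption, as are the function names. Whatever ordering is chosen has to match the one the service applies at lookup time. The drug codes in the usage line are placeholders, not real identifiers.

```typescript
interface Interaction {
  drug1_code: string;
  drug2_code: string;
  severity: string;
  description: string;
}

type ExpectedPair = Pick<Interaction, "drug1_code" | "drug2_code" | "severity">;

// Swap the codes when needed so that drug1_code <= drug2_code (string comparison).
function normalizePair(row: Interaction): Interaction {
  return row.drug1_code <= row.drug2_code
    ? row
    : { ...row, drug1_code: row.drug2_code, drug2_code: row.drug1_code };
}

// Order-independent key for a pair of codes.
function pairKey(a: string, b: string): string {
  return a <= b ? `${a}|${b}` : `${b}|${a}`;
}

// Return the keys of expected pairs that are missing or carry a different severity.
function findCoverageGaps(imported: Interaction[], expected: ExpectedPair[]): string[] {
  const severityByPair = new Map<string, string>();
  for (const row of imported) {
    severityByPair.set(pairKey(row.drug1_code, row.drug2_code), row.severity);
  }
  return expected
    .filter((e) => severityByPair.get(pairKey(e.drug1_code, e.drug2_code)) !== e.severity)
    .map((e) => pairKey(e.drug1_code, e.drug2_code));
}

const normalized = normalizePair({
  drug1_code: "B", drug2_code: "A", severity: "HIGH", description: "placeholder",
});
console.log(normalized.drug1_code, normalized.drug2_code);
```

An empty result from `findCoverageGaps` means every pair in the test set was imported with the expected severity.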

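After initial deployment, terminology datasets receive periodic updates. LOINC is typically released twice a year; the SNOMED CT cadence depends on the edition (the International Edition has been monthly since 2022, while some national editions, such as the US Edition, are released twice a year). The update process: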

1. Upload the new licensed data files to the staging bucket.
2. Re-run the relevant ETL Kubernetes Job.
3. The ETL performs an upsert; concepts are never deleted, only deactivated.
4. A `TERMINOLOGY.dataset.updated` event is published, and downstream consumers can use it to invalidate cached concept data.
5. Monitor the concept count metrics after the import and verify that there are no unexpected drops.
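Because the ETL never deletes concepts, the total row count per code system (active plus deactivated) should not go down after an update. A minimal sketch of that check follows. Where the counts come from (a metrics query or a SQL `COUNT`) is left open, and the numbers in the usage line are illustrative only.

```typescript
// Total concept rows per code system, captured before and after an ETL run.
type ConceptCounts = Record<string, number>;

interface CountDrop {
  system: string;
  before: number;
  after: number;
}

// Report every code system whose total count decreased or disappeared.
function findCountDrops(before: ConceptCounts, after: ConceptCounts): CountDrop[] {
  const drops: CountDrop[] = [];
  for (const system of Object.keys(before)) {
    const previous = before[system];
    const current = after[system] ?? 0; // a missing system counts as zero
    if (current < previous) {
      drops.push({ system, before: previous, after: current });
    }
  }
  return drops;
}

// Illustrative numbers only.
const drops = findCountDrops(
  { loinc: 1000, snomed: 5000 },
  { loinc: 1040, snomed: 4990 }
);
console.log(drops);
```

If the metric tracks only active concepts, a small decrease can be legitimate (deactivations), so the comparison would need a tolerance instead of a strict check.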

Scenario	Rollback action
ETL import corrupts concept data	Restore PostgreSQL from pre-import snapshot; re-run corrected ETL
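Schema migration failure	Run `pnpm db:migrate:down` to return to the previous version. Drizzle Kit does not ship a built-in down/rollback command (verify against the version in use), so this script must be backed by a hand-written reverse migration; if none exists, restore from the pre-migration snapshot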
New version causes lookup regressions	Roll back Kubernetes deployment to previous image; ETL data is independent of service version

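ETL jobs run inside a single transaction where possible (PostgreSQL `COPY`, with `ROLLBACK` on error), so an import that aborts mid-run leaves no partial or orphaned records. Steps that cannot run in one transaction rely on the snapshot restore listed in the rollback table.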
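The rollback-on-error pattern can be sketched as a small wrapper. It is written against a minimal `SqlClient` interface so the example runs without a database; a node-postgres client exposes a compatible `query(text)` method. The wrapper name, the recording stand-in, and the `COPY concepts FROM STDIN` statement are placeholders, not the ETL's actual code.

```typescript
// Minimal client shape (assumption); real drivers return richer result objects.
interface SqlClient {
  query(sql: string): Promise<unknown>;
}

// Run `work` between BEGIN and COMMIT; on any error issue ROLLBACK and rethrow.
async function runInTransaction<T>(
  client: SqlClient,
  work: (c: SqlClient) => Promise<T>
): Promise<T> {
  await client.query("BEGIN");
  try {
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}

// In-memory stand-in that records statements, used only to show the call order.
class RecordingClient implements SqlClient {
  readonly statements: string[] = [];
  async query(sql: string): Promise<unknown> {
    this.statements.push(sql);
    return undefined;
  }
}

const demoClient = new RecordingClient();
runInTransaction(demoClient, async (c) => {
  await c.query("COPY concepts FROM STDIN"); // placeholder statement
  throw new Error("malformed row");          // simulate a failed import
}).catch(() => {
  console.log(demoClient.statements.join(" -> "));
});
```

Rethrowing after `ROLLBACK` matters: the Kubernetes Job should still exit with a failure so the aborted import is visible to operators.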