MIGRATION_PLAN — pricing-service
Sibling: DATA_MODEL · DEPLOYMENT_TOPOLOGY · SERVICE_READINESS
This document covers two distinct kinds of migration:
- Greenfield bring-up — initial production cutover of pricing-service into the platform.
- Schema/contract evolution — ongoing migrations of database schema, event subjects, REST contracts, and AI capabilities once the service is live.
It does NOT cover migrating from a third-party PMS pricing engine — that scope belongs to the per-tenant onboarding playbook in docs/operational/tenant-onboarding.md.
1. Greenfield bring-up (initial cutover)
Phase 0 — Pre-cutover (T-30 to T-0 days)
| Item | Owner | Done when |
|---|---|---|
| Terraform module merged for pricing-public, pricing-admin, pricing-workers | Platform SRE | terraform plan is no-op |
| Cloud SQL primary + 2 read replicas + DR replica provisioned in me-central2 / europe-west4 | Platform SRE | replica lag < 1 s steady |
| Memorystore HA instance provisioned | Platform SRE | health green |
| Pub/Sub topics + DLQs created with CMEK + per-topic IAM bindings | Platform SRE | gcloud pubsub topics list shows melmastoon.pricing.* |
| API Gateway routes registered (/v1/pricing/*, /v1/admin/pricing/*) | API platform | smoke GET /v1/healthz returns 200 |
| Image v1.0.0 built, signed, scanned, promoted to me-central2 registry | Service owner | Binary Authorization passes |
| All 17 service docs (this bundle) reviewed; ORR sign-off complete | Service owner | SERVICE_READINESS green |
| Synthetic checks configured | SRE | uptime probes passing |
| Initial seed: empty (production tenants onboard via tenant-service) | Service owner | DB schema + RLS verified |
Phase 1 — Dark launch (T-0)
- Deploy `pricing-public` and `pricing-admin` with `min-instances=2`.
- Deploy `pricing-workers` to GKE (outbox-relay, inbox-consumer, crons).
- API Gateway routes are reachable but not wired to any tenant's BFF.
- Synthetic check tenant `tnt_PROD_SMOKE` quotes against a single property to validate the path; runs every 60 s for 24 h.
Phase 2 — Tenant cutover (rolling)
For each tenant onboarded:
- `tenant-service` provisions config (currency, jurisdiction, locale).
- `property-service` registers properties + room types.
- Revenue Ops authors the tenant's first rate plans + rules through the admin API.
- Reservation BFF flips the feature flag `bff.use_pricing_service=true` for that tenant.
- Monitor `pricing_quotes_created_total{tenant_id=…}` and `quote_latency_p99` for 48 h.
Rollback: feature flag flip back to legacy pricing path; pricing-service keeps running idle.
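The per-tenant flag flip and its rollback can be pictured with a minimal sketch. Only the flag name `bff.use_pricing_service` comes from this plan; the types, the `resolveQuoteSource` helper, and the tenant ids are hypothetical, and the real BFF flag client may look quite different.

```typescript
// Hypothetical shapes: the BFF's real flag store is not described in this plan.
type TenantFlags = Record<string, boolean>;
type QuoteSource = "pricing-service" | "legacy";

const USE_PRICING_FLAG = "bff.use_pricing_service";

// Decide which pricing path serves a tenant's quote requests.
// A missing tenant or missing flag falls back to the legacy path,
// so a tenant that has not been cut over is never routed early.
function resolveQuoteSource(
  flagsByTenant: Map<string, TenantFlags>,
  tenantId: string,
): QuoteSource {
  const flags = flagsByTenant.get(tenantId);
  return flags?.[USE_PRICING_FLAG] === true ? "pricing-service" : "legacy";
}

// Illustrative tenants: tnt_A has been cut over, tnt_B has not.
const flagsByTenant = new Map<string, TenantFlags>([
  ["tnt_A", { [USE_PRICING_FLAG]: true }],
  ["tnt_B", { [USE_PRICING_FLAG]: false }],
]);
```

Under this sketch, rollback is just setting the flag back to `false` for the affected tenant; no redeploy of pricing-service is involved.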
Phase 3 — Steady state
- Drop the legacy pricing path from BFFs once all tenants have been on `pricing-service` for ≥ 30 days with green SLOs.
- Archive the legacy code repo.
2. Schema migrations (ongoing)
Principles
- Forward-only DDL; rollbacks ship as new forward migrations.
- Migrations are idempotent (`IF NOT EXISTS` / `IF EXISTS` where Postgres allows).
- Migrations that take long locks (`ALTER TABLE ... ADD COLUMN NOT NULL`) are split into the standard 3-step expand-contract dance:
  - ADD column nullable.
  - Backfill in batches with throttling.
  - Set NOT NULL + drop old column in a later release.
- Destructive migrations (drop column/table) require an ADR, a 30-day grace window, and a paired data-migration plan.
- All migrations are reviewed by SRE before merge.
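The "backfill in batches with throttling" step of expand-contract could look roughly like the sketch below. Everything here is an assumption for illustration: the id-range batching strategy, the `RunBatch` callback (which would wrap an `UPDATE ... WHERE id BETWEEN ...` statement), and the option names. The plan does not prescribe a particular backfill mechanism.

```typescript
// A closed id range [fromId, toId] handled by one batch.
interface BatchRange {
  fromId: number;
  toId: number;
}

// Split [minId, maxId] into consecutive ranges of at most batchSize ids.
function planBatches(minId: number, maxId: number, batchSize: number): BatchRange[] {
  if (batchSize <= 0) throw new Error("batchSize must be positive");
  const ranges: BatchRange[] = [];
  for (let fromId = minId; fromId <= maxId; fromId += batchSize) {
    ranges.push({ fromId, toId: Math.min(fromId + batchSize - 1, maxId) });
  }
  return ranges;
}

// Hypothetical callback: backfills one range and resolves to the rows updated.
type RunBatch = (range: BatchRange) => Promise<number>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run batches one at a time, pausing between them to limit lock and replica pressure.
async function backfillInBatches(
  runBatch: RunBatch,
  ranges: BatchRange[],
  pauseMs: number,
): Promise<number> {
  let totalRows = 0;
  for (const [index, range] of ranges.entries()) {
    totalRows += await runBatch(range);
    if (index < ranges.length - 1) await sleep(pauseMs);
  }
  return totalRows;
}
```

Keeping the range planning separate from execution makes the batch boundaries easy to review in the runbook before anything touches the database.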
Tooling
- `drizzle-kit` produces migrations into `services/pricing-service/migrations/`.
- File naming: `YYYYMMDDHHmmss__<short>.sql`.
- CI runs migrations against a fresh ephemeral database; tests must pass on the post-migration schema.
- Migration log is recorded in the `pricing.__drizzle_migrations` table; the service refuses to start if its expected version doesn't match the database version.
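To make the naming convention and the startup check concrete, here is a minimal sketch. It assumes `<short>` is lowercase alphanumerics and underscores, and it treats the 14-digit timestamp as the "version" being compared. Both are assumptions: the plan does not say how the service derives its expected version, and the contents of `pricing.__drizzle_migrations` may be keyed differently.

```typescript
// YYYYMMDDHHmmss__<short>.sql; the character set allowed in <short> is assumed.
const MIGRATION_FILE = /^(\d{14})__([a-z0-9_]+)\.sql$/;

// Parse a migration file name, or return null if it breaks the convention.
function parseMigrationFile(name: string): { version: string; short: string } | null {
  const match = MIGRATION_FILE.exec(name);
  if (match === null) return null;
  const [, version = "", short = ""] = match;
  return { version, short };
}

// Hypothetical startup guard: fail fast when the newest applied version
// differs from the version this binary was built against.
function assertSchemaVersion(expected: string, appliedVersions: string[]): void {
  // Fixed-width timestamps sort chronologically under plain string ordering.
  const sorted = [...appliedVersions].sort();
  const latest = sorted.length > 0 ? sorted[sorted.length - 1] : undefined;
  if (latest !== expected) {
    throw new Error(
      `schema version mismatch: expected ${expected}, database at ${latest ?? "none"}`,
    );
  }
}
```

A CI lint step could run `parseMigrationFile` over the migrations directory and reject any file for which it returns `null`.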
Deploy sequencing
- PR: new migration committed + tests pass.
- Dev / staging deploy: auto on merge to `main`; migrations run before serving traffic via a one-shot Cloud Run Job.
- Production deploy: manual approval; migrations run via the same one-shot Job; only after success does the new revision receive traffic.
- Rollback: Cloud Run revision rollback restores the previous binary; if the new migration is incompatible with the previous binary, the rollback playbook documents the data-fixup steps.
3. Event contract evolution
| Change | Procedure |
|---|---|
| Add optional field | ship in current vN subject; consumers MUST ignore unknown fields |
| Add required field | ship as vN+1 subject in parallel with vN; deprecate vN after all consumers migrate; remove only after a 90-day quiet period in production |
| Remove a field | ship vN+1 without it; downstream consumers update; remove vN after 90 days |
| Rename a field | model as remove + add (above); never repurpose semantics |
| Subject deprecation | Add deprecation notice in EVENT_SCHEMAS; update OpenAPI; add Slack #event-platform announcement; mark in @melmastoon/contracts package |
A breaking event change requires an ADR in docs/architecture/.
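The rule that consumers MUST ignore unknown fields is the classic tolerant-reader pattern. A minimal sketch follows. The payload shape (`quoteId`, `tenantId`, optional `currency`) is purely illustrative and not taken from EVENT_SCHEMAS.

```typescript
// Hypothetical vN payload; field names are illustrative only.
interface QuoteCreatedEvent {
  quoteId: string;
  tenantId: string;
  currency?: string; // stands in for an optional field added later within vN
}

// Tolerant reader: validate the fields this consumer needs, copy the optional
// ones it understands, and silently drop everything else.
function readQuoteCreated(raw: unknown): QuoteCreatedEvent {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be an object");
  }
  const payload = raw as Record<string, unknown>;
  const quoteId = payload["quoteId"];
  const tenantId = payload["tenantId"];
  const currency = payload["currency"];
  if (typeof quoteId !== "string" || typeof tenantId !== "string") {
    throw new Error("missing required field");
  }
  const event: QuoteCreatedEvent = { quoteId, tenantId };
  if (typeof currency === "string") event.currency = currency;
  return event;
}
```

A consumer written this way keeps working when the producer adds an optional field to the current subject, which is what allows that change to ship without a vN+1.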
4. REST API evolution
- Backwards-compatible changes (new optional fields, new endpoints, new optional query params) ship under `/v1`.
- Breaking changes ship under `/v2`. Both versions run in parallel until BFFs migrate (typically 90 days). Deprecation warnings are injected as `Deprecation:` and `Sunset:` headers on `/v1` responses.
- OpenAPI spec is regenerated on every change; the `@melmastoon/clients/pricing` package is bumped. Consumers are auto-notified via the contracts dashboard.
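As an illustration of the deprecation headers on `/v1` responses, the sketch below builds both values from two dates. It assumes the `Deprecation` value follows the RFC 9745 form (an `@`-prefixed Unix timestamp) and that `Sunset` is an HTTP-date as in RFC 8594. The plan does not state which header formats the gateway actually emits, and `deprecationHeaders` is a hypothetical helper.

```typescript
// Build deprecation headers for a response served by a deprecated API version.
function deprecationHeaders(deprecatedAt: Date, sunsetAt: Date): Record<string, string> {
  return {
    // Assumed RFC 9745 form: seconds since the Unix epoch, prefixed with "@".
    Deprecation: `@${Math.floor(deprecatedAt.getTime() / 1000)}`,
    // Assumed RFC 8594 form: an HTTP-date, which toUTCString() produces.
    Sunset: sunsetAt.toUTCString(),
  };
}

// Illustrative dates only: deprecate on 1 Jan 2026, sunset roughly 90 days later.
const v1Headers = deprecationHeaders(
  new Date(Date.UTC(2026, 0, 1)),
  new Date(Date.UTC(2026, 3, 1)),
);
```

Clients that log or alert on these headers get an explicit date to migrate by, rather than relying on the "typically 90 days" convention.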
5. AI capability evolution
- Prompt version bumps (`pricing/dynamic.v3` → `v4`) are coordinated via `ai-orchestrator-service` releases. Pricing-service references the capability id, not the prompt; a new prompt rolls out after a shadow comparison run shows acceptable distribution shift.
- Model id bumps (e.g. Gemini-1.5-Pro → Gemini-2.0-Pro) require a 7-day shadow run with output stored under `dynamic_suggestions.shadow_results` (off-by-default schema), then a coordinated cutover.
- Adding a new AI capability (e.g. `pricing.promo_suggestion`) follows the standard new-feature flow: ADR → schema → use case → contract test → release.
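The plan does not define "acceptable distribution shift", so the following is only one possible reading: compare the live and shadow suggestions for the same inputs, and gate the cutover on the mean absolute relative difference. The metric, the `ShadowPair` shape, and the threshold are all assumptions made for illustration.

```typescript
// One suggestion from the current prompt/model and its shadow counterpart
// for the same input. Hypothetical shape; shadow_results may store more.
interface ShadowPair {
  baseline: number; // price suggested by the live prompt or model
  shadow: number;   // price suggested by the candidate prompt or model
}

// Mean absolute relative difference between shadow and baseline suggestions.
function meanRelativeShift(pairs: ShadowPair[]): number {
  if (pairs.length === 0) throw new Error("no shadow pairs to compare");
  let sum = 0;
  for (const pair of pairs) {
    if (pair.baseline === 0) throw new Error("baseline price must be non-zero");
    sum += Math.abs(pair.shadow - pair.baseline) / Math.abs(pair.baseline);
  }
  return sum / pairs.length;
}

// Gate for the cutover decision; maxShift is a team-chosen threshold (assumed).
function isShiftAcceptable(pairs: ShadowPair[], maxShift: number): boolean {
  return meanRelativeShift(pairs) <= maxShift;
}
```

A single mean hides tail behavior, so a real comparison would likely also examine percentiles or per-segment shifts before approving a cutover.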
6. Open migrations / roadmap
| Item | Target | Notes |
|---|---|---|
| Add secondary FX provider integration | Q2 2026 | Reduces R-03 from §SERVICE_RISK_REGISTER |
| Add pricing.promo_suggestion AI capability | Q3 2026 | Requires elasticity data warehouse |
| Migrate price_quotes partitioning to weekly (from daily) | Q3 2026 | After 6 months of production traffic data |
| Multi-region active-active for pricing-public | 2027 | Pending platform-wide architecture decision |
7. Migration runbooks
Per-migration runbooks live next to the migration in services/pricing-service/migrations/<file>.runbook.md for any migration tagged risk: medium or higher. Templates available at docs/operational/MIGRATION_RUNBOOK_TEMPLATE.md.