MIGRATION_PLAN — pricing-service

Sibling: DATA_MODEL · DEPLOYMENT_TOPOLOGY · SERVICE_READINESS

This document covers two distinct kinds of migration:

  1. Greenfield bring-up — initial production cutover of pricing-service into the platform.
  2. Schema/contract evolution — ongoing migrations of database schema, event subjects, REST contracts, and AI capabilities once the service is live.

It does NOT cover migrating from a third-party PMS pricing engine — that scope belongs to the per-tenant onboarding playbook in docs/operational/tenant-onboarding.md.


1. Greenfield bring-up (initial cutover)

Phase 0 — Pre-cutover (T-30 to T-0 days)

Item | Owner | Done when
--- | --- | ---
Terraform module merged for pricing-public, pricing-admin, pricing-workers | Platform SRE | terraform plan is no-op
Cloud SQL primary + 2 read replicas + DR replica provisioned in me-central2 / europe-west4 | Platform SRE | replica lag < 1 s steady
Memorystore HA instance provisioned | Platform SRE | health green
Pub/Sub topics + DLQs created with CMEK + per-topic IAM bindings | Platform SRE | gcloud pubsub topics list shows melmastoon.pricing.*
API Gateway routes registered (/v1/pricing/*, /v1/admin/pricing/*) | API platform | smoke GET /v1/healthz returns 200
Image v1.0.0 built, signed, scanned, promoted to me-central2 registry | Service owner | Binary Authorization passes
All 17 service docs (this bundle) reviewed; ORR sign-off complete | Service owner | SERVICE_READINESS green
Synthetic checks configured | SRE | uptime probes passing
Initial seed: empty (production tenants onboard via tenant-service) | Service owner | DB schema + RLS verified

Phase 1 — Dark launch (T-0)

  • Deploy pricing-public and pricing-admin with min-instances=2.
  • Deploy pricing-workers to GKE (outbox-relay, inbox-consumer, crons).
  • API Gateway routes are reachable but not wired to any tenant's BFF.
  • Synthetic check tenant tnt_PROD_SMOKE quotes against a single property to validate the path; runs every 60 s for 24 h.

Phase 2 — Tenant cutover (rolling)

For each tenant onboarded:

  1. tenant-service provisions config (currency, jurisdiction, locale).
  2. property-service registers properties + room types.
  3. Revenue Ops authors the tenant's first rate plans + rules through the admin API.
  4. Reservation BFF flips the feature flag bff.use_pricing_service=true for that tenant.
  5. Monitor pricing_quotes_created_total{tenant_id=…} and quote_latency_p99 for 48 h.

Rollback: flip bff.use_pricing_service back to false for the affected tenant to restore the legacy pricing path; pricing-service keeps running idle.
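
The per-tenant flag flip in step 4 and the rollback path can be pictured as a lookup that defaults to the legacy path. This is a minimal sketch, assuming a hypothetical in-memory flag store; only the flag name bff.use_pricing_service comes from this plan, and the types and function names are illustrative rather than the BFF's actual code.

```typescript
// Hypothetical flag store: tenant id -> flag name -> value.
type FlagStore = Map<string, Map<string, boolean>>;

type PricingPath = "pricing-service" | "legacy";

// Flag name taken from Phase 2, step 4.
const PRICING_FLAG = "bff.use_pricing_service";

// Resolve which pricing path a tenant's BFF should call.
// Unknown tenants and unset flags fall back to the legacy path.
function resolvePricingPath(flags: FlagStore, tenantId: string): PricingPath {
  const enabled = flags.get(tenantId)?.get(PRICING_FLAG) ?? false;
  return enabled ? "pricing-service" : "legacy";
}

// Cut a tenant over (enabled = true) or roll it back (enabled = false).
function setPricingFlag(flags: FlagStore, tenantId: string, enabled: boolean): void {
  const tenantFlags = flags.get(tenantId) ?? new Map<string, boolean>();
  tenantFlags.set(PRICING_FLAG, enabled);
  flags.set(tenantId, tenantFlags);
}
```

Defaulting to the legacy path means a tenant with no flag entry is never routed to pricing-service by accident, which matches the rolling, opt-in nature of the cutover.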

Phase 3 — Steady state

  • Drop the legacy pricing path from BFFs once all tenants have been on pricing-service for ≥ 30 days with green SLOs.
  • Archive the legacy code repo.

2. Schema migrations (ongoing)

Principles

  • Forward-only DDL; rollbacks ship as new forward migrations.
  • Migrations are idempotent (IF NOT EXISTS / IF EXISTS where Postgres allows).
  • Migrations that take long locks (ALTER TABLE ... ADD COLUMN NOT NULL) are split into the standard 3-step expand-contract dance:
    1. ADD column nullable.
    2. Backfill in batches with throttling.
    3. Set NOT NULL + drop old column in a later release.
  • Destructive migrations (drop column/table) require an ADR, a 30-day grace window, and a paired data-migration plan.
  • All migrations are reviewed by SRE before merge.
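
Step 2 of the expand-contract sequence (batched backfill) could be planned along these lines. This is a sketch under stated assumptions: the table pricing.example_table and the columns id, old_col, and new_col are hypothetical, and the caller is assumed to execute one statement per transaction with a pause between batches for throttling.

```typescript
// One inclusive primary-key range to backfill in a single transaction.
interface BatchRange {
  fromId: number;
  toId: number;
}

// Split [minId, maxId] into consecutive inclusive ranges of at most batchSize ids.
function planBackfillBatches(minId: number, maxId: number, batchSize: number): BatchRange[] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const ranges: BatchRange[] = [];
  for (let fromId = minId; fromId <= maxId; fromId += batchSize) {
    ranges.push({ fromId, toId: Math.min(fromId + batchSize - 1, maxId) });
  }
  return ranges;
}

// Render the UPDATE for one range. Table and column names are hypothetical.
// The IS NULL guard makes re-running a batch a no-op, so the backfill is idempotent.
function renderBackfillSql(range: BatchRange): string {
  return (
    "UPDATE pricing.example_table SET new_col = old_col " +
    `WHERE id BETWEEN ${range.fromId} AND ${range.toId} AND new_col IS NULL;`
  );
}
```

Keeping each range small bounds the row locks held by any single statement; the NOT NULL constraint in step 3 is only safe once every range has been applied.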

Tooling

  • drizzle-kit produces migrations into services/pricing-service/migrations/.
  • File naming: YYYYMMDDHHmmss__<short>.sql.
  • CI runs migrations against a fresh ephemeral database; tests must pass on the post-migration schema.
  • Migration log is recorded in the pricing.__drizzle_migrations table; the service refuses to start if its expected version doesn't match the database version.
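
The naming convention and the startup version guard might look roughly like the following. This is an illustrative sketch, not the service's code: the character set allowed in <short> is an assumption, and how applied versions are read from pricing.__drizzle_migrations is deliberately left out, so the guard simply receives them as a list of strings.

```typescript
// Parsed form of a migration file name: YYYYMMDDHHmmss__<short>.sql
interface MigrationName {
  version: string; // 14-digit timestamp
  short: string;
}

// Assumption: <short> is lowercase letters, digits, and underscores.
const MIGRATION_FILE = /^(\d{14})__([a-z0-9_]+)\.sql$/;

// Return the parsed name, or null if the file does not follow the convention.
function parseMigrationName(fileName: string): MigrationName | null {
  const match = MIGRATION_FILE.exec(fileName);
  if (match === null) return null;
  const [, version, short] = match;
  if (version === undefined || short === undefined) return null;
  return { version, short };
}

// Startup guard: throw unless the latest applied version equals the expected one.
// 14-digit timestamps sort correctly as plain strings.
function assertSchemaVersion(expected: string, applied: string[]): void {
  const latest = [...applied].sort().pop();
  if (latest !== expected) {
    throw new Error(
      `schema version mismatch: expected ${expected}, database has ${latest ?? "none"}`,
    );
  }
}
```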

Deploy sequencing

  1. PR — new migration committed + tests pass.
  2. Dev / staging deploy — auto on merge to main; migrations run before serving traffic via a one-shot Cloud Run Job.
  3. Production deploy — manual approval; migrations run via the same one-shot Job; only after success does the new revision receive traffic.
  4. Rollback — Cloud Run revision rollback restores the previous binary; if the new migration is incompatible with the previous binary, the rollback playbook documents the data-fixup steps.

3. Event contract evolution

Change | Procedure
--- | ---
Add optional field | ship in current vN subject; consumers MUST ignore unknown fields
Add required field | ship as vN+1 subject in parallel with vN; deprecate vN after all consumers migrate; remove only after a 90-day quiet period in production
Remove a field | ship vN+1 without it; downstream consumers update; remove vN after 90 days
Rename a field | model as remove + add (above); never repurpose semantics
Subject deprecation | Add deprecation notice in EVENT_SCHEMAS; update OpenAPI; add Slack #event-platform announcement; mark in @melmastoon/contracts package

A breaking event change requires an ADR in docs/architecture/.
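
The "consumers MUST ignore unknown fields" rule amounts to a tolerant reader on the consumer side. A minimal sketch follows, assuming a hypothetical payload shape; the real fields are defined in EVENT_SCHEMAS, and the names quoteId, tenantId, and currency are invented for illustration.

```typescript
// Hypothetical vN payload; the real definition lives in EVENT_SCHEMAS.
interface QuoteCreatedEvent {
  quoteId: string;
  tenantId: string;
  currency?: string; // example of an optional field added later within vN
}

// Tolerant reader: copy only the known fields, reject a payload that lacks a
// required one, and silently drop anything the consumer does not recognise.
function parseQuoteCreated(raw: unknown): QuoteCreatedEvent {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be an object");
  }
  const data = raw as Record<string, unknown>;
  const quoteId = data["quoteId"];
  const tenantId = data["tenantId"];
  const currency = data["currency"];
  if (typeof quoteId !== "string" || typeof tenantId !== "string") {
    throw new Error("missing required field");
  }
  const event: QuoteCreatedEvent = { quoteId, tenantId };
  if (typeof currency === "string") {
    event.currency = currency;
  }
  return event;
}
```

Because the reader never enumerates the incoming keys, a producer can add an optional field to the current subject without breaking consumers that have not been redeployed.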


4. REST API evolution

  • Backwards-compatible changes (new optional fields, new endpoints, new optional query params) ship under /v1.
  • Breaking changes ship under /v2. Both versions run in parallel until BFFs migrate (typically 90 days). Deprecation warnings are injected as Deprecation: and Sunset: headers on /v1 responses.
  • OpenAPI spec is regenerated on every change; @melmastoon/clients/pricing package is bumped. Consumers are auto-notified via the contracts dashboard.
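
The two headers on /v1 responses could be built as sketched below. This assumes the RFC 9745 form for Deprecation (an "@" followed by Unix seconds) and the RFC 8594 form for Sunset (an HTTP-date); whether the gateway uses exactly these forms is not stated in this plan, and the function name is hypothetical.

```typescript
interface DeprecationHeaders {
  Deprecation: string;
  Sunset: string;
}

// Build the header values for a deprecated API version.
// deprecatedAt: when /v1 was (or will be) marked deprecated.
// sunsetAt: when /v1 is expected to stop responding.
function buildDeprecationHeaders(deprecatedAt: Date, sunsetAt: Date): DeprecationHeaders {
  if (sunsetAt.getTime() < deprecatedAt.getTime()) {
    throw new Error("sunset must not precede deprecation");
  }
  return {
    // Assumed RFC 9745 form: "@" followed by seconds since the Unix epoch.
    Deprecation: `@${Math.floor(deprecatedAt.getTime() / 1000)}`,
    // Assumed RFC 8594 form: HTTP-date, which Date.toUTCString() produces.
    Sunset: sunsetAt.toUTCString(),
  };
}
```

A caller would typically pass the /v2 release date as deprecatedAt and a date about 90 days later as sunsetAt, in line with the parallel-run window described above.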

5. AI capability evolution

  • Prompt version bumps (pricing/dynamic.v3 → v4) are coordinated via ai-orchestrator-service releases. Pricing-service references the capability id, not the prompt; a new prompt rolls out only after a shadow comparison run shows acceptable distribution shift.
  • Model id bumps (e.g. Gemini-1.5-Pro → Gemini-2.0-Pro) require a 7-day shadow run with output stored under dynamic_suggestions.shadow_results (off-by-default schema), then a coordinated cutover.
  • Adding a new AI capability (e.g. pricing.promo_suggestion) follows the standard new-feature flow: ADR → schema → use case → contract test → release.
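
One way to make "acceptable distribution shift" concrete for a shadow run is to compare paired outputs from the current and candidate versions. The sketch below uses mean absolute relative difference as the metric; both the metric and the threshold are assumptions for illustration, since this plan does not define how shift is measured.

```typescript
// One shadow observation: the price suggested by the live version and by the
// candidate version for the same input. Field names are hypothetical.
interface ShadowPair {
  baseline: number;
  candidate: number;
}

// Mean absolute relative difference between candidate and baseline suggestions.
function meanRelativeShift(pairs: ShadowPair[]): number {
  if (pairs.length === 0) {
    throw new Error("no shadow pairs to compare");
  }
  let total = 0;
  for (const { baseline, candidate } of pairs) {
    if (baseline <= 0) {
      throw new Error("baseline price must be positive");
    }
    total += Math.abs(candidate - baseline) / baseline;
  }
  return total / pairs.length;
}

// Gate for the cutover decision; maxShift is an assumed, team-chosen threshold.
function isShiftAcceptable(pairs: ShadowPair[], maxShift: number): boolean {
  return meanRelativeShift(pairs) <= maxShift;
}
```

A single averaged number hides tail behaviour, so a real comparison would likely also inspect percentiles or per-segment results before cutover.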

6. Open migrations / roadmap

Item | Target | Notes
--- | --- | ---
Add secondary FX provider integration | Q2 2026 | Reduces R-03 from §SERVICE_RISK_REGISTER
Add pricing.promo_suggestion AI capability | Q3 2026 | Requires elasticity data warehouse
Migrate price_quotes partitioning to weekly (from daily) | Q3 2026 | After 6 months of production traffic data
Multi-region active-active for pricing-public | 2027 | Pending platform-wide architecture decision

7. Migration runbooks

Any migration tagged risk: medium or higher has a per-migration runbook that lives next to it at services/pricing-service/migrations/<file>.runbook.md. Templates are available at docs/operational/MIGRATION_RUNBOOK_TEMPLATE.md.