Skip to main content

housekeeping-service — MIGRATION_PLAN

Schema migrations: node-pg-migrate, expand → contract in two PRs. Data migrations: Cloud Run Jobs with throttling. Onboarding importer: optional CSV ingest for legacy housekeeping logs from incumbent PMS.


1. Schema migration policy

  • One PR per logical change. Numbered files in migrations/YYYYMMDDHHMM_<slug>.sql — with both up and down blocks.
  • Expand → Contract in two PRs for any breaking change:
    1. Expand: add new column/table/index in a backward-compatible way. Code reads from old, dual-writes new. Deploy.
    2. Backfill: run a Cloud Run Job to populate the new shape.
    3. Contract: flip code to read from new; remove old after one deploy with no fall-back.
  • Migrations run as a Cloud Run Job before traffic split during release. Job uses a dedicated service account with cloudsql.client only (no BYPASSRLS).
  • Migrations must be idempotent (IF NOT EXISTS, ON CONFLICT DO NOTHING) so partial failure replays cleanly.
  • Every migration is exercised in CI on a fresh DB and in a roll-back test (up then down then up).

2. Partition management

  • housekeeping_tasks is partitioned by created_at monthly via pg_partman.
  • Maintenance Cloud Run Job runs weekly and:
    • Creates partitions for the next 3 months if missing.
    • Detaches partitions older than 18 months, exports to GCS Parquet via the standard archive job, drops them.
  • Index changes on the partitioned table apply globally; expand-contract still required.

3. Index changes

  • New indexes created with CONCURRENTLY to avoid table locks.
  • partition-pruning.spec.ts baseline updated whenever a hot query changes; CI rejects regressions.

4. Outbox / inbox migrations

  • outbox and inbox schemas are append-only-friendly; migrations only add fields.
  • During payload-schema bumps (vN → vN+1):
    • Producer dual-publishes for ≥ 30 days (configurable).
    • EVENT_SCHEMAS.md §5 policy applies.

5. Data migrations and backfills

Standard pattern:

  1. New column added (expand step).
  2. Cloud Run Job housekeeping-backfill-<slug> reads in batches of 1000 ordered by (tenant_id, id), throttled to 1k rows/s default.
  3. Job is resumable (writes a checkpoint row to housekeeping.maintenance_state).
  4. Verification query asserts 0 rows with NULL new column before contract step.

6. Onboarding importer (optional)

For tenants migrating from an existing PMS with usable housekeeping history:

  • CSV format: task_id_legacy,date,room_number,kind,assignee_legacy,started_at,completed_at,outcome,note.
  • Importer Cloud Run Job housekeeping-onboarding-import:
    1. Reads CSV from GCS bucket gs://melmastoon-onboarding/<tenantId>/housekeeping/.
    2. Resolves room_numberroom_id via property-service; rejects unknown rows to a _rejects.csv.
    3. Resolves assignee_legacystaff_id via staff-service mapping.
    4. Inserts housekeeping_tasks rows with source='import', created_at = original date, status final-only (completed/failed/cancelled).
    5. Skips outbox emission (historical, not live).
    6. Records audit row.
  • Limit: 50k rows/tenant per onboarding to avoid skewing dashboards; older history archived to GCS as Parquet only.