
MIGRATION_PLAN — reporting-service

Sibling: DATA_MODEL · DEPLOYMENT_TOPOLOGY · SERVICE_READINESS

This document covers two horizons: (a) the initial green-field rollout of reporting-service, and (b) the policy for ongoing schema/event evolution.


1. Green-field rollout (Phase 1)

reporting-service is a new service, so there is no legacy data to import. Rollout proceeds region by region and residency by residency, behind the per-tenant feature flag feature.reporting_v1.

1.1 Phasing

| Phase | Scope | Exit criteria |
|---|---|---|
| 0. Foundation | Cloud SQL schema, RLS, GCS buckets, Pub/Sub topics, OIDC bindings | All SERVICE_READINESS §1-3 items green in dev |
| 1. Internal canary | One internal "tenant" runs the full set of operational templates daily | 7 d clean in dev + staging |
| 2. Pilot tenants | 3 pilot hotels (1 per residency: AF, IN, KSA) | Per-tenant SLO target met for 2 weeks |
| 3. Regulatory adapters opt-in | Enable AF police submission for AF pilots, KSA VAT for KSA pilots | 30 d zero missed-cutoff |
| 4. General availability | All tenants flag-flipped per residency cohort | Region SLO + budget targets sustained |

Rollback for any phase is a flag flip plus a traffic drain to the prior revision (which retains the previous schema state).

1.2 Feature flag wiring

  • Flag owner: tenant-service per-tenant settings.
  • Effective check at request entry: if (!features.reportingV1) return 403 MELMASTOON.IAM.AUTHZ_DENIED.
  • Migrations run regardless of flag state; the flag gates traffic, not schema.
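The request-entry check above can be sketched as a small pure function. This is a minimal illustration, not the service's actual middleware: the TenantFeatures and GateResult shapes and the function name checkReportingGate are hypothetical, and treating a missing feature record as "flag off" is an assumption.

```typescript
// Hypothetical shape of the per-tenant settings read from tenant-service.
interface TenantFeatures {
  reportingV1: boolean;
}

// Hypothetical result type; only the status and error code come from this document.
type GateResult =
  | { allowed: true }
  | { allowed: false; status: 403; code: "MELMASTOON.IAM.AUTHZ_DENIED" };

// Deny by default: a missing feature record is treated as "flag off" (assumption).
function checkReportingGate(features: TenantFeatures | undefined): GateResult {
  if (!features || !features.reportingV1) {
    return { allowed: false, status: 403, code: "MELMASTOON.IAM.AUTHZ_DENIED" };
  }
  return { allowed: true };
}
```

Because the gate only inspects tenant settings, it stays independent of schema state, which matches the rule that the flag gates traffic rather than migrations.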

2. Schema evolution policy

Postgres migrations live under db/migrations/<NNNN>_<name>.sql and are applied via pnpm db:migrate. Rules:

  • Forward-only, additive by default.
  • Two-phase for renames or type changes:
    1. Add new column / table / index, keep old; backfill; dual-write; switch reads to new.
    2. After the next release window, drop old in a follow-up migration.
  • Never rename a column in a single migration in production.
  • Always provide a CHECK constraint or RLS policy in the same migration as a new tenant-scoped table (CI gate).
  • Backfills of existing tenants are scripted in db/backfills/ and run as one-shot Cloud Run Jobs with progress logging.
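As an illustration of how some of these rules could be checked mechanically, the sketch below lints a single migration file. It is not the project's actual CI gate: the function lintMigration, the lowercase snake_case restriction on <name>, and the heuristic that a table is tenant-scoped when it declares a tenant_id column are all assumptions made for the example.

```typescript
interface MigrationFile {
  filename: string;
  sql: string;
}

// Returns a list of rule violations; an empty list means the file passes this sketch.
function lintMigration(file: MigrationFile): string[] {
  const problems: string[] = [];

  // <NNNN>_<name>.sql; restricting <name> to lowercase snake_case is an assumption.
  if (!/^\d{4}_[a-z0-9_]+\.sql$/.test(file.filename)) {
    problems.push("filename must match <NNNN>_<name>.sql");
  }

  // A rename in one step breaks the two-phase rule.
  if (/\bRENAME\s+COLUMN\b/i.test(file.sql)) {
    problems.push("single-migration column rename; use the two-phase pattern");
  }

  // Heuristic (assumption): a CREATE TABLE followed by a tenant_id column is tenant-scoped.
  const createsTenantTable = /\bCREATE\s+TABLE\b[\s\S]*?\btenant_id\b/i.test(file.sql);
  const hasGuard =
    /\bCREATE\s+POLICY\b/i.test(file.sql) || /\bCHECK\s*\(/i.test(file.sql);
  if (createsTenantTable && !hasGuard) {
    problems.push(
      "new tenant-scoped table needs a CHECK constraint or RLS policy in the same migration",
    );
  }

  return problems;
}
```

A regex lint like this only catches the obvious cases; a real gate would more likely parse the SQL, but the shape of the check is the same.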

3. Event evolution policy

  • Subjects are versioned (v1, v2, …) per EVENT_SCHEMAS §11 versioning.
  • Additive changes within v1 allowed; breaking changes require a new v2 topic.
  • During transitions, both v1 and v2 are published in parallel for at least one quarter to give consumers time to upgrade.
  • Schemas are validated in CI; consumers run validateOrThrow() at the inbox boundary.
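The transition rule can be sketched as a producer that emits both versions and a consumer that validates at the inbox boundary. Everything specific here is hypothetical: the Publisher port, the v1 and v2 payload shapes for template.archived, and validateTemplateArchivedV1OrThrow, which only stands in for the validateOrThrow() helper named above and is not its real signature.

```typescript
// Hypothetical transport port; the real Pub/Sub client is not shown in this document.
interface Publisher {
  publish(subject: string, payload: unknown): void;
}

// Hypothetical payloads: v2 replaces the flat templateId with a nested reference,
// which is a breaking change and therefore needs a new version.
interface TemplateArchivedV1 {
  tenantId: string;
  templateId: string;
}
interface TemplateArchivedV2 {
  tenantId: string;
  template: { id: string; version: number };
}

// During the transition window the producer publishes both versions in parallel.
function publishDuringTransition(pub: Publisher, event: TemplateArchivedV2): void {
  const v1: TemplateArchivedV1 = {
    tenantId: event.tenantId,
    templateId: event.template.id,
  };
  pub.publish("template.archived.v1", v1);
  pub.publish("template.archived.v2", event);
}

// Stand-in for an inbox-boundary validator: rejects anything that is not a v1 payload.
function validateTemplateArchivedV1OrThrow(payload: unknown): TemplateArchivedV1 {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("template.archived.v1: payload must be an object");
  }
  const record = payload as Record<string, unknown>;
  const tenantId = record["tenantId"];
  const templateId = record["templateId"];
  if (typeof tenantId !== "string" || typeof templateId !== "string") {
    throw new Error("template.archived.v1: tenantId and templateId must be strings");
  }
  return { tenantId, templateId };
}
```

Deriving the v1 payload from the v2 event keeps a single source of truth in the producer, so the two streams cannot drift apart while both are live.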

4. Coexistence with Ghasi-edTech legacy reporting

The Ghasi-edTech monorepo previously contained an early reporting prototype focused on the edtech vertical. For Melmastoon we are not importing that code or data. Lessons captured:

  • Template versioning must be part of v1 (the prototype lacked it; we adopted snapshot-on-run).
  • Renderer must be a port; the prototype hard-coded wkhtmltopdf and was hard to test.
  • Event naming must be aggregate-verb-tense (the prototype mixed report.create and created).

These lessons are encoded in DOMAIN_MODEL, APPLICATION_LOGIC, and EVENT_SCHEMAS.


5. Data backfill scenarios

| Scenario | Approach |
|---|---|
| Tenant onboarded mid-quarter with historical data needs | Run a one-shot batch using the standard RequestReportRunUseCase with runDate overrides; cap to last 90 d to avoid BQ cost spikes |
| Add a new platform-shared template | 0005_seed_platform_templates.sql extended; existing tenants pick it up automatically (template_id is platform-shared) |
| Re-render an existing run with a fixed template | Cancel + re-run (new run id); old artifacts retained per retention class |
| Migrate object lock retention defaults | Bucket-level change; existing locked objects keep their original lock until expiry |
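For the mid-quarter onboarding scenario, the 90-day cap can be sketched as a helper that produces the runDate overrides to feed into RequestReportRunUseCase. The helper name backfillRunDates, the YYYY-MM-DD UTC date format, and the reading of "last 90 d" as the 90 days ending yesterday are assumptions for illustration.

```typescript
const MAX_BACKFILL_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Truncates a Date to midnight UTC, returned as epoch milliseconds.
function utcMidnight(d: Date): number {
  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
}

// Returns run dates (YYYY-MM-DD, oldest first) from requestedStart up to the day
// before `today`, never reaching back more than MAX_BACKFILL_DAYS.
function backfillRunDates(requestedStart: Date, today: Date): string[] {
  const end = utcMidnight(today);
  const earliestAllowed = end - MAX_BACKFILL_DAYS * MS_PER_DAY;
  const start = Math.max(utcMidnight(requestedStart), earliestAllowed);

  const dates: string[] = [];
  // UTC has no daylight-saving shifts, so stepping by a fixed day length is safe.
  for (let t = start; t < end; t += MS_PER_DAY) {
    dates.push(new Date(t).toISOString().slice(0, 10));
  }
  return dates;
}
```

With today set to 2024-06-30 and a requested start of 2024-01-01, the helper returns 90 dates from 2024-04-01 through 2024-06-29; a start inside the window is honored as given.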

6. Region failover & residency moves

  • Region failover (same residency): Cloud SQL HA failover automatic; GCS multi-region default; reporting-api/worker autoscaled in standby.
  • Residency migration (e.g., tenant moves from AF to IN):
    1. Lock the tenant in source region (read-only).
    2. Export operational tables for the tenant via CSV → import into target region.
    3. Regulatory artifacts do not move — they remain in the source region's regulatory bucket per legal hold; an index entry in target region links by originalRegion.
    4. Re-emit tenant.region_changed.v1; reporting-service rewrites the tenant's stored region and resumes.
    5. Unlock in target.
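The ordering of the residency migration steps can be sketched as a thin orchestrator over ports. This is only an outline under stated assumptions: the ResidencyMigrationPorts interface and its method names are hypothetical, the ports are synchronous here for brevity (real ones would be asynchronous), and the runbook remains the authoritative procedure.

```typescript
// Hypothetical ports; each maps to one numbered step in the list above.
interface ResidencyMigrationPorts {
  lockReadOnly(tenantId: string, region: string): void;
  copyOperationalTables(tenantId: string, source: string, target: string): void;
  // Regulatory artifacts stay in the source region; only an index entry is written.
  linkRegulatoryArtifacts(tenantId: string, originalRegion: string, target: string): void;
  emitRegionChanged(tenantId: string, target: string): void;
  unlock(tenantId: string, region: string): void;
}

function migrateResidency(
  ports: ResidencyMigrationPorts,
  tenantId: string,
  source: string,
  target: string,
): void {
  if (source === target) {
    throw new Error("source and target regions must differ");
  }
  ports.lockReadOnly(tenantId, source); // step 1
  ports.copyOperationalTables(tenantId, source, target); // step 2
  ports.linkRegulatoryArtifacts(tenantId, source, target); // step 3
  ports.emitRegionChanged(tenantId, target); // step 4: tenant.region_changed.v1
  ports.unlock(tenantId, target); // step 5
}
```

In this sketch a failing step throws and leaves the tenant read-only in the source region, so nothing is written to a half-migrated tenant; whether that is the desired failure mode is a decision for the runbook.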

A residency migration runbook lives at docs/runbooks/platform/residency-migration.md.


7. Removal/deprecation playbook

If a template version becomes obsolete:

  1. POST /api/v1/reports/templates/{id}:archive (creates an audit record and emits template.archived.v1).
  2. Existing scheduled runs pinned to that version continue until the schedule is updated.
  3. After the operator updates the schedule, the archived version is excluded from new runs but retained as historical record.
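One way to read steps 2 and 3 is as a single eligibility rule: an archived version may still run only for a schedule that is pinned to it. The sketch below encodes that reading; the TemplateVersion and Schedule shapes and the function mayRunVersion are hypothetical and not taken from the service's domain model.

```typescript
interface TemplateVersion {
  templateId: string;
  version: number;
  archived: boolean;
}

interface Schedule {
  id: string;
  pinned: { templateId: string; version: number };
}

// schedule is null for ad hoc (unscheduled) run requests.
function mayRunVersion(v: TemplateVersion, schedule: Schedule | null): boolean {
  if (!v.archived) {
    return true;
  }
  // Archived: allowed only while an existing schedule is still pinned to this exact version.
  return (
    schedule !== null &&
    schedule.pinned.templateId === v.templateId &&
    schedule.pinned.version === v.version
  );
}
```

Once the operator repoints the schedule to a newer version, the pin no longer matches and the archived version drops out of new runs without any extra bookkeeping.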

8. Open migration items

| Item | Owner | Target |
|---|---|---|
| Bulk-import historical occupancy from pilot tenants | Reporting TL + Pilot success | Phase 2 |
| Add Pashto + Dari golden snapshots for top 8 templates | Frontend + Reporting | Phase 2 |
| Switch operational artifact CMEK to envelope-DEK pattern | Platform SRE | Phase 4 |
| Cutover from FakeRegulatoryAdapter to live AF police adapter | Compliance + Reporting | Phase 3 |

Cross-references: SERVICE_READINESS, SERVICE_RISK_REGISTER, DATA_MODEL §8, EVENT_SCHEMAS §11.