analytics-service

Bounded Context: Analytics (Supporting) · Owner: PMS-Data · Phase: 1 · Storage: BigQuery (raw + curated layers) + Cloud SQL Postgres (metric/projection metadata, dashboards) + Memorystore Redis (hot reads) · Bundle: services/analytics-service/

analytics-service owns the read-side analytics pipeline for Ghasi Melmastoon. It consumes every platform domain event via wildcard subscription, lands raw events in BigQuery (events_raw.*), runs scheduled aggregation jobs that produce per-domain curated fact and dimension tables (fact_reservation, fact_payment, fact_housekeeping_task, fact_lock_action, dim_property, dim_room_type, dim_tenant, dim_calendar), and exposes a Query API that powers dashboard widgets in bff-backoffice-service (Electron desktop) and Looker Studio workbooks for tenant-admin power users.

It is explicitly not a reporting service: no PDF/Excel/CSV rendering happens here. That is the job of reporting-service, which reads our curated tables. It is also explicitly not an AI inference service: model calls go through ai-orchestrator-service, but we are the source of the aggregated signals AI uses (occupancy curves, channel mix, demand forecasts) and the writeback target for forecast outputs.


Purpose

  • Single source of truth for read-side analytics across all bounded contexts.
  • Platform-wide metric definitions registry (occupancy, ADR, RevPAR, ALOS, cancellation rate, no-show rate, conversion meta→book, channel mix, AI-suggestion-acceptance rate).
  • Per-tenant strict data isolation in BigQuery via partition + authorized view RLS.
  • Tenant-admin dashboard authoring with widget composition.
  • Backfill from event archive when projections evolve or are reseeded.
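Several of the registry metrics listed above (occupancy, ADR, RevPAR, ALOS) have widely used hospitality definitions. The following is a minimal Python sketch of that arithmetic, assuming a hypothetical daily rollup; the DailyTotals type and its field names are illustrative and not taken from the curated schema, and the versioned metric registry remains the authoritative definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyTotals:
    """Hypothetical per-property rollup; field names are assumptions."""
    rooms_available: int   # sellable room-nights in the window
    rooms_sold: int        # occupied room-nights in the window
    room_revenue: float    # room revenue in the property's currency
    stays: int             # reservations contributing to rooms_sold


def occupancy(t: DailyTotals) -> float:
    """Share of available room-nights that were sold."""
    return t.rooms_sold / t.rooms_available if t.rooms_available else 0.0


def adr(t: DailyTotals) -> float:
    """Average daily rate: room revenue per sold room-night."""
    return t.room_revenue / t.rooms_sold if t.rooms_sold else 0.0


def revpar(t: DailyTotals) -> float:
    """Revenue per available room-night (equals ADR x occupancy)."""
    return t.room_revenue / t.rooms_available if t.rooms_available else 0.0


def alos(t: DailyTotals) -> float:
    """Average length of stay: sold room-nights per reservation."""
    return t.rooms_sold / t.stays if t.stays else 0.0
```

For example, with 100 room-nights available, 80 sold, 6400.0 in room revenue, and 40 stays, these functions give an occupancy of 0.8, an ADR of 80.0, a RevPAR of 64.0, and an ALOS of 2.0. Empty denominators return 0.0 here purely for simplicity; a real metric might prefer NULL.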

Key responsibilities

  • Pub/Sub → BigQuery raw event sink: managed Pub/Sub-to-BigQuery subscriptions for melmastoon.* topics (operational + regulated retention classes).
  • Curated layer ETL: scheduled jobs (Cloud Workflows + lightweight Cloud Run Jobs; Cloud Composer reserved for complex DAGs) producing curated tables and incremental MERGEs.
  • Metric definitions: versioned registry; each metric has SQL template, dimension keys, freshness SLO.
  • Query API: curated query endpoint, parameter binding, per-tenant authorized views.
  • Dashboards & widgets: tenant-admin CRUD; widget catalog (KPI tile, time-series, breakdown, funnel, heatmap).
  • Data quality checks: row-count drift, schema drift, freshness, null-rate, distinct-count guardrails.
  • AI signal pipeline: publishes aggregated signals (metric.computed.v1) consumed by ai-orchestrator-service; writes back forecast results into fact_demand_forecast.
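To make the metric definitions bullet concrete, here is a hedged Python sketch of a versioned registry entry carrying a SQL template, dimension keys, and a freshness SLO. The class names, the `$dims` placeholder, and the column names in the usage note are assumptions for illustration; the real model lives in DOMAIN_MODEL.md.

```python
from dataclasses import dataclass
from string import Template
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class MetricDefinition:
    key: str
    version: int
    sql_template: str                 # uses $dims for the grouping columns
    dimension_keys: Tuple[str, ...]   # dimensions callers may group by
    freshness_slo_seconds: int


class MetricRegistry:
    """In-memory stand-in for the versioned registry (illustrative only)."""

    def __init__(self) -> None:
        self._defs: Dict[Tuple[str, int], MetricDefinition] = {}

    def register(self, definition: MetricDefinition) -> None:
        ident = (definition.key, definition.version)
        if ident in self._defs:
            # Versions are immutable: a change means a new version number.
            raise ValueError(f"{ident} is already registered")
        self._defs[ident] = definition

    def latest(self, key: str) -> MetricDefinition:
        versions = [v for (k, v) in self._defs if k == key]
        if not versions:
            raise KeyError(key)
        return self._defs[(key, max(versions))]


def render_sql(definition: MetricDefinition, dimensions: Sequence[str]) -> str:
    """Fill in grouping columns after checking them against the allow-list."""
    if not dimensions:
        raise ValueError("at least one dimension is required")
    unknown = [d for d in dimensions if d not in definition.dimension_keys]
    if unknown:
        raise ValueError(f"unsupported dimensions: {unknown}")
    return Template(definition.sql_template).substitute(dims=", ".join(dimensions))
```

A definition might use a template such as `SELECT $dims, SAFE_DIVIDE(SUM(rooms_sold), SUM(rooms_available)) AS occupancy FROM fact_reservation WHERE stay_date BETWEEN @start AND @end GROUP BY $dims`. Only allow-listed identifiers are substituted into the text; values such as the window bounds stay as named query parameters, and tenant scoping is left to the authorized view rather than to the template.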

Owned aggregates (high-level)

AnalyticsEvent (raw landing in BigQuery, immutable), Projection (curated table definition), MetricDefinition, Dashboard, Widget, Query (saved query), ETLJob, DataQualityCheck. Detailed model: services/analytics-service/DOMAIN_MODEL.md.


Public APIs (selection)

POST /api/v1/analytics/queries:run # ad-hoc query against curated layer
POST /api/v1/analytics/metrics/{key}:compute # compute one metric for a window
GET /api/v1/analytics/metrics # list metric definitions
POST /api/v1/analytics/dashboards # create dashboard (tenant admin)
PUT /api/v1/analytics/dashboards/{id} # update
GET /api/v1/analytics/dashboards/{id} # read
POST /api/v1/analytics/dashboards/{id}/widgets # add widget
GET /api/v1/analytics/widgets/{id}/data # render widget data
POST /api/v1/analytics/projections/{key}:refresh # manual refresh trigger (admin)
GET /api/v1/analytics/projections # list projections + freshness
GET /api/v1/analytics/data-quality # latest DQ results
GET /api/v1/analytics/etl/jobs/{id} # ETL job status
GET /internal/sync/pull # KPI snapshot pull (sync-service)
POST /internal/scheduler/etl # Cloud Scheduler / Workflows trigger

Full contracts in services/analytics-service/API_CONTRACTS.md.


Top events published

melmastoon.analytics.projection.refreshed.v1
melmastoon.analytics.projection.failed.v1
melmastoon.analytics.metric.computed.v1
melmastoon.analytics.dashboard.created.v1
melmastoon.analytics.dashboard.updated.v1
melmastoon.analytics.dashboard.shared.v1
melmastoon.analytics.query.executed.v1
melmastoon.analytics.etl.started.v1
melmastoon.analytics.etl.completed.v1
melmastoon.analytics.etl.failed.v1
melmastoon.analytics.data_quality.alert.v1

Top events consumed

  • All platform events via wildcard subscription melmastoon.* (managed Pub/Sub-to-BigQuery sink for raw landing, plus a thin worker for inbox dedupe metrics).
  • melmastoon.tenant.deleted.v1 — cascade purge from BigQuery (operational classes) + anonymization (regulated classes).
  • melmastoon.tenant.region_changed.v1 — adjust per-tenant authorized view binding.
  • melmastoon.ai.forecast.produced.v1 — write forecast results back into fact_demand_forecast.
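The event names listed above share a visible shape: a `melmastoon.` prefix, a context segment, one or more name segments, and a `.vN` version suffix. As a rough Python sketch, a thin worker could parse that shape and pick out the three events that need special handling. The regular expression is inferred only from the names shown in this document, and the handler labels are hypothetical.

```python
import re
from typing import NamedTuple

# Pattern inferred from the event names in this document (an assumption).
EVENT_TYPE = re.compile(
    r"^melmastoon\.(?P<context>[a-z_]+)"
    r"\.(?P<name>[a-z_]+(?:\.[a-z_]+)*)"
    r"\.v(?P<version>\d+)$"
)


class EventType(NamedTuple):
    context: str
    name: str
    version: int


def parse_event_type(raw: str) -> EventType:
    match = EVENT_TYPE.match(raw)
    if match is None:
        raise ValueError(f"not a platform event type: {raw!r}")
    return EventType(match["context"], match["name"], int(match["version"]))


# Hypothetical handler labels for the events that need more than raw landing.
SPECIAL_HANDLERS = {
    ("tenant", "deleted"): "purge_or_anonymize_tenant",
    ("tenant", "region_changed"): "rebind_authorized_view",
    ("ai", "forecast.produced"): "write_back_forecast",
}


def route(raw: str) -> str:
    """Return the handler label; everything else only lands in the raw sink."""
    event = parse_event_type(raw)
    return SPECIAL_HANDLERS.get((event.context, event.name), "raw_sink_only")
```

For instance, `route("melmastoon.tenant.deleted.v1")` returns `"purge_or_anonymize_tenant"`, while an event with no special handler falls through to `"raw_sink_only"`. Routing on context and name rather than on the full string keeps the mapping stable across version bumps, although a real consumer would probably also check the version.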

Upstream / downstream

  • Upstream: every emitting service in the platform (reservation, billing, housekeeping, inventory, staff, iam, lock, channel, etc.) and tenant-service for residency.
  • Downstream: reporting-service (reads curated tables); bff-backoffice-service (dashboard widgets via the Query API); ai-orchestrator-service (consumes computed metrics, writes forecasts back); Looker Studio (tenant-admin power users via authorized views).

Non-functional requirements

  • Freshness: curated fact tables refresh ≤ 5 min after the upstream event for hot domains (reservation, payment, housekeeping) and ≤ 15 min for cold domains (lock, audit), per OBSERVABILITY.
  • Query API p95: ≤ 800 ms for cached widget reads, ≤ 3 s for ad-hoc curated queries (≤ 1 GB scanned).
  • Data quality: every curated table has row-count, freshness, null-rate, and distinct-count checks; alert fan-out on breach.
  • Tenant isolation: BigQuery authorized views enforce row-level access via tenant_id partition + view filter; no shared service account ever issues SQL that leaves the WHERE tenant_id = … filter to the caller.
  • Cost: per-tenant slot reservation + per-query byte cap; daily budget alarm.
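The freshness targets above can be expressed as a simple check. This is a minimal Python sketch, assuming that freshness is approximated by the time since a table's last successful refresh and that any domain not listed as hot falls back to the cold budget; both are simplifying assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Budgets taken from the freshness requirement above.
HOT_DOMAINS = {"reservation", "payment", "housekeeping"}
HOT_BUDGET = timedelta(minutes=5)
COLD_BUDGET = timedelta(minutes=15)


def freshness_budget(domain: str) -> timedelta:
    # Assumption: domains not listed as hot use the cold budget.
    return HOT_BUDGET if domain in HOT_DOMAINS else COLD_BUDGET


def freshness_breached(domain: str, last_refreshed_at: datetime, now: datetime) -> bool:
    """True when the time since the last refresh exceeds the domain's budget.

    Uses refresh age as a proxy for event-to-table lag, which is only
    accurate while upstream events keep arriving.
    """
    return now - last_refreshed_at > freshness_budget(domain)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = now - timedelta(minutes=6)
    # A 6-minute-old refresh breaches the hot budget but not the cold one.
    print(freshness_breached("reservation", stale, now))
    print(freshness_breached("lock", stale, now))
```

A check like this would feed the freshness entry of the data quality results; measuring true event-to-table lag would additionally need the newest upstream event timestamp per table.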

Detailed bundle: services/analytics-service/.