analytics-service

Bounded Context: Analytics (Supporting) · Owner: PMS-Data · Phase: 1 · Storage: BigQuery (raw + curated layers) + Cloud SQL Postgres (metric/projection metadata, dashboards) + Memorystore Redis (hot reads) · Bundle: services/analytics-service/

analytics-service owns the read-side analytics pipeline for Ghasi Melmastoon. It consumes every platform domain event via wildcard subscription, lands raw events in BigQuery (events_raw.*), runs scheduled aggregation jobs that produce per-domain curated fact and dimension tables (fact_reservation, fact_payment, fact_housekeeping_task, fact_lock_action, dim_property, dim_room_type, dim_tenant, dim_calendar), and exposes a Query API that powers dashboard widgets in bff-backoffice-service (Electron desktop) and Looker Studio workbooks for tenant-admin power users.

It is explicitly not a reporting service: no PDF/Excel/CSV rendering happens here. That is the job of reporting-service, which reads our curated tables. It is also explicitly not an AI inference service: model calls go through ai-orchestrator-service. This service is, however, the source of the aggregated signals that AI uses (occupancy curves, channel mix, demand forecasts) and the writeback target for forecast outputs.

Purpose

Single source of truth for read-side analytics across all bounded contexts.
Platform-wide metric definitions registry (occupancy, ADR, RevPAR, ALOS, cancellation rate, no-show rate, conversion meta→book, channel mix, AI-suggestion-acceptance rate).
Strict per-tenant data isolation in BigQuery via tenant_id partitioning plus authorized views that enforce row-level filtering.
Tenant-admin dashboard authoring with widget composition.
Backfill from event archive when projections evolve or are reseeded.
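
For orientation, the core hospitality KPIs named in the registry follow standard industry definitions. The sketch below illustrates them in Python; the DailyStats type and its field names are hypothetical and are not taken from this service's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyStats:
    """Hypothetical per-property rollup; field names are illustrative only."""
    rooms_available: int
    rooms_sold: int
    room_revenue: float
    stays: int    # number of completed stays in the window
    nights: int   # total room nights across those stays


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def occupancy(s: DailyStats) -> float:
    """Share of available rooms that were sold."""
    return _ratio(s.rooms_sold, s.rooms_available)


def adr(s: DailyStats) -> float:
    """Average daily rate: room revenue per room sold."""
    return _ratio(s.room_revenue, s.rooms_sold)


def revpar(s: DailyStats) -> float:
    """Revenue per available room."""
    return _ratio(s.room_revenue, s.rooms_available)


def alos(s: DailyStats) -> float:
    """Average length of stay: room nights per stay."""
    return _ratio(s.nights, s.stays)
```

A useful sanity check for any implementation is that RevPAR equals ADR multiplied by occupancy whenever rooms were sold.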

Key responsibilities

Pub/Sub → BigQuery raw event sink: managed Pub/Sub-to-BigQuery subscriptions for melmastoon.* topics (operational + regulated retention classes).
Curated layer ETL: scheduled jobs (Cloud Workflows + lightweight Cloud Run Jobs; Cloud Composer reserved for complex DAGs) producing curated tables and incremental MERGEs.
Metric definitions: versioned registry; each metric has SQL template, dimension keys, freshness SLO.
Query API: curated query endpoint, parameter binding, per-tenant authorized views.
Dashboards & widgets: tenant-admin CRUD; widget catalog (KPI tile, time-series, breakdown, funnel, heatmap).
Data quality checks: row-count drift, schema drift, freshness, null-rate, distinct-count guardrails.
AI signal pipeline: publishes aggregated signals (metric.computed.v1) consumed by ai-orchestrator-service; writes back forecast results into fact_demand_forecast.
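
The versioned metric registry described above could be modelled roughly as follows. This is a minimal in-memory sketch, not the service's implementation: the class names, the $view placeholder, and the column names in the sample SQL are assumptions made for illustration. The real model is documented in DOMAIN_MODEL.md.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class MetricDefinition:
    """One immutable version of a metric (shape is an assumption)."""
    key: str
    version: int
    sql_template: str                 # uses a $view placeholder for the authorized view
    dimension_keys: tuple[str, ...]
    freshness_slo_seconds: int


class MetricRegistry:
    """Keeps every version of every metric; versions are never overwritten."""

    def __init__(self) -> None:
        self._defs: dict[str, dict[int, MetricDefinition]] = {}

    def register(self, definition: MetricDefinition) -> None:
        versions = self._defs.setdefault(definition.key, {})
        if definition.version in versions:
            raise ValueError(
                f"{definition.key} v{definition.version} is already registered"
            )
        versions[definition.version] = definition

    def get(self, key: str, version: int) -> MetricDefinition:
        return self._defs[key][version]

    def latest(self, key: str) -> MetricDefinition:
        versions = self._defs[key]
        return versions[max(versions)]


def render_sql(definition: MetricDefinition, authorized_view: str) -> str:
    """Fill in the view name; @-prefixed query parameters are left for the query engine."""
    return Template(definition.sql_template).substitute(view=authorized_view)


# Illustrative definition; column names are hypothetical.
OCCUPANCY_V1 = MetricDefinition(
    key="occupancy",
    version=1,
    sql_template=(
        "SELECT SAFE_DIVIDE(SUM(rooms_sold), SUM(rooms_available)) AS occupancy "
        "FROM $view WHERE stay_date BETWEEN @window_start AND @window_end"
    ),
    dimension_keys=("property_id",),
    freshness_slo_seconds=300,
)
```

Treating each version as immutable means a dashboard pinned to an older version keeps producing the same numbers after the metric's definition evolves.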

Owned aggregates (high-level)

AnalyticsEvent (raw landing in BigQuery, immutable), Projection (curated table definition), MetricDefinition, Dashboard, Widget, Query (saved query), ETLJob, DataQualityCheck. Detailed model: services/analytics-service/DOMAIN_MODEL.md.

Public APIs (selection)

POST  /api/v1/analytics/queries:run              # ad-hoc query against curated layer
POST  /api/v1/analytics/metrics/{key}:compute    # compute one metric for a window
GET   /api/v1/analytics/metrics                  # list metric definitions
POST  /api/v1/analytics/dashboards               # create dashboard (tenant admin)
PUT   /api/v1/analytics/dashboards/{id}          # update
GET   /api/v1/analytics/dashboards/{id}          # read
POST  /api/v1/analytics/dashboards/{id}/widgets  # add widget
GET   /api/v1/analytics/widgets/{id}/data        # render widget data
POST  /api/v1/analytics/projections/{key}:refresh # manual refresh trigger (admin)
GET   /api/v1/analytics/projections              # list projections + freshness
GET   /api/v1/analytics/data-quality             # latest DQ results
GET   /api/v1/analytics/etl/jobs/{id}            # ETL job status
GET   /internal/sync/pull                        # KPI snapshot pull (sync-service)
POST  /internal/scheduler/etl                    # Cloud Scheduler / Workflows trigger

Full contracts in services/analytics-service/API_CONTRACTS.md.
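
As an illustration of calling the metric compute endpoint listed above, the following Python sketch builds (but does not send) a request using only the standard library. The host name, the bearer-token header, and the body fields window_start and window_end are assumptions; the authoritative request shape is whatever API_CONTRACTS.md specifies.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_compute_request(
    base_url: str,
    metric_key: str,
    window_start: str,
    window_end: str,
    token: str,
) -> Request:
    """Build a POST to /api/v1/analytics/metrics/{key}:compute.

    The JSON body fields and the Authorization scheme are hypothetical.
    """
    url = (
        f"{base_url.rstrip('/')}/api/v1/analytics/metrics/"
        f"{quote(metric_key, safe='')}:compute"
    )
    body = json.dumps(
        {"window_start": window_start, "window_end": window_end}
    ).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Example: prepare a request for January occupancy (host name is a placeholder).
request = build_compute_request(
    "https://analytics.example.internal",
    "occupancy",
    "2024-01-01",
    "2024-01-31",
    token="<access-token>",
)
```

Passing the prepared object to urllib.request.urlopen would send it. Note that the body carries no tenant identifier: in line with the isolation rules in this document, the tenant would be expected to come from the caller's credentials rather than from request parameters.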

Top events published

melmastoon.analytics.projection.refreshed.v1
melmastoon.analytics.projection.failed.v1
melmastoon.analytics.metric.computed.v1
melmastoon.analytics.dashboard.created.v1
melmastoon.analytics.dashboard.updated.v1
melmastoon.analytics.dashboard.shared.v1
melmastoon.analytics.query.executed.v1
melmastoon.analytics.etl.started.v1
melmastoon.analytics.etl.completed.v1
melmastoon.analytics.etl.failed.v1
melmastoon.analytics.data_quality.alert.v1

Top events consumed

All platform events via wildcard subscription melmastoon.* (managed Pub/Sub-to-BigQuery sink for raw landing, plus a thin worker for inbox dedupe metrics).
melmastoon.tenant.deleted.v1 — cascade purge from BigQuery (operational classes) + anonymization (regulated classes).
melmastoon.tenant.region_changed.v1 — adjust per-tenant authorized view binding.
melmastoon.ai.forecast.produced.v1 — write forecast results back into fact_demand_forecast.
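
The thin worker's routing and dedupe behaviour for the explicitly handled topics above might look like the sketch below. It is a simplified illustration: the handler functions, the event_id and tenant_id field names, and the in-memory set (standing in for a durable inbox table) are all assumptions.

```python
from typing import Callable

Handler = Callable[[dict], str]


def purge_tenant(event: dict) -> str:
    # Placeholder for the purge (operational) + anonymization (regulated) cascade.
    return f"purge:{event['tenant_id']}"


def rebind_authorized_view(event: dict) -> str:
    # Placeholder for adjusting the per-tenant authorized view binding.
    return f"rebind:{event['tenant_id']}"


def write_forecast(event: dict) -> str:
    # Placeholder for the writeback into fact_demand_forecast.
    return f"forecast:{event['tenant_id']}"


HANDLERS: dict[str, Handler] = {
    "melmastoon.tenant.deleted.v1": purge_tenant,
    "melmastoon.tenant.region_changed.v1": rebind_authorized_view,
    "melmastoon.ai.forecast.produced.v1": write_forecast,
}


class InboxWorker:
    """Routes events to handlers and skips event ids it has already seen."""

    def __init__(self) -> None:
        self._seen: set[str] = set()   # a real inbox would be persisted
        self.duplicates = 0            # feeds the inbox dedupe metric

    def handle(self, topic: str, event: dict) -> str:
        event_id = event["event_id"]
        if event_id in self._seen:
            self.duplicates += 1
            return "duplicate"
        self._seen.add(event_id)
        handler = HANDLERS.get(topic)
        if handler is None:
            # Raw landing is done by the managed sink, so nothing to do here.
            return "ignored"
        return handler(event)
```

Because Pub/Sub delivery is at-least-once by default, making the handlers safe to call twice (or deduplicating as above) keeps a redelivered tenant.deleted event from triggering a second purge.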

Upstream / downstream

Upstream: every emitting service in the platform (reservation, billing, housekeeping, inventory, staff, iam, lock, channel, etc.) and tenant-service for residency.
Downstream: reporting-service (reads curated tables); bff-backoffice-service (dashboard widgets via the Query API); ai-orchestrator-service (consumes computed metrics, writes forecasts back); Looker Studio (tenant-admin power users via authorized views).

Non-functional requirements

Freshness: curated fact tables refresh ≤ 5 min after upstream event for hot domains (reservation, payment, housekeeping); ≤ 15 min for cold (lock, audit) — per OBSERVABILITY.
Query API p95: ≤ 800 ms for cached widget reads, ≤ 3 s for ad-hoc curated queries (≤ 1 GB scanned).
Data quality: every curated table has row-count, freshness, null-rate, and distinct-count checks; alert fan-out on breach.
Tenant isolation: BigQuery authorized views enforce row-level access via tenant_id partition + view filter; tenant filtering is never left to the caller, so no shared service account issues SQL that relies on a caller-supplied WHERE tenant_id = … predicate.
Cost: per-tenant slot reservation + per-query byte cap; daily budget alarm.
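
The freshness targets and one of the data quality checks above can be expressed as simple predicates. The sketch below is illustrative only: the function names are hypothetical, treating unlisted domains as cold is an assumption, and no null-rate threshold is implied because the source does not state one.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence

HOT_DOMAINS = {"reservation", "payment", "housekeeping"}


def freshness_slo(domain: str) -> timedelta:
    """5 minutes for hot domains, 15 minutes otherwise.

    Assumption: domains not listed as hot are treated like the cold ones.
    """
    return timedelta(minutes=5 if domain in HOT_DOMAINS else 15)


def is_fresh(domain: str, latest_event_at: datetime, refreshed_at: datetime) -> bool:
    """True if the curated table was refreshed within the SLO of the newest event."""
    return refreshed_at - latest_event_at <= freshness_slo(domain)


def null_rate(values: Sequence[Optional[object]]) -> float:
    """Fraction of values that are None; 0.0 for an empty column."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


# Example: a reservation event landed at 12:00 UTC, table refreshed at 12:04.
event_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
within_slo = is_fresh("reservation", event_time, event_time + timedelta(minutes=4))
```

A breach of either check would be the point at which a melmastoon.analytics.data_quality.alert.v1 event is published.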

Detailed bundle: services/analytics-service/.
