analytics-service
Bounded Context: Analytics (Supporting) · Owner: PMS-Data · Phase: 1 · Storage: BigQuery (raw + curated layers) + Cloud SQL Postgres (metric/projection metadata, dashboards) + Memorystore Redis (hot reads) · Bundle: services/analytics-service/
analytics-service owns the read-side analytics pipeline for Ghasi Melmastoon. It consumes every platform domain event via wildcard subscription, lands raw events in BigQuery (events_raw.*), runs scheduled aggregation jobs that produce per-domain curated fact and dimension tables (fact_reservation, fact_payment, fact_housekeeping_task, fact_lock_action, dim_property, dim_room_type, dim_tenant, dim_calendar), and exposes a Query API that powers dashboard widgets in bff-backoffice-service (Electron desktop) and Looker Studio workbooks for tenant-admin power users.
It is explicitly not a reporting service: no PDF/Excel/CSV rendering happens here — that is reporting-service's job, which reads our curated tables. It is explicitly not an AI inference service: model calls go through ai-orchestrator-service, but we are the source of the aggregated signals AI uses (occupancy curves, channel mix, demand forecasts) and the writeback target for forecast outputs.
Purpose
- Single source of truth for read-side analytics across all bounded contexts.
- Platform-wide metric definitions registry (occupancy, ADR, RevPAR, ALOS, cancellation rate, no-show rate, conversion meta→book, channel mix, AI-suggestion-acceptance rate).
- Per-tenant strict data isolation in BigQuery via partition + authorized view RLS.
- Tenant-admin dashboard authoring with widget composition.
- Backfill from event archive when projections evolve or are reseeded.
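To make the registry's headline metrics concrete, the sketch below computes occupancy, ADR, RevPAR, and ALOS from pre-aggregated inputs using the standard hospitality definitions. The KpiInputs shape and its field names are assumptions for illustration only; the authoritative definitions are the SQL templates held in the metric definitions registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiInputs:
    """Aggregated inputs for one property over one window (hypothetical shape)."""
    room_nights_available: int
    room_nights_sold: int
    room_revenue: float
    stays: int  # number of stays counted in the window


def compute_kpis(x: KpiInputs) -> dict[str, float]:
    """Standard hospitality KPIs; a zero denominator yields 0.0 instead of raising."""
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    return {
        "occupancy": ratio(x.room_nights_sold, x.room_nights_available),
        "adr": ratio(x.room_revenue, x.room_nights_sold),          # average daily rate
        "revpar": ratio(x.room_revenue, x.room_nights_available),  # revenue per available room
        "alos": ratio(x.room_nights_sold, x.stays),                # average length of stay
    }


kpis = compute_kpis(KpiInputs(room_nights_available=100, room_nights_sold=80,
                              room_revenue=8000.0, stays=40))
```

Note that RevPAR equals ADR multiplied by occupancy, which is a useful consistency check between independently computed metrics.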
Key responsibilities
- Pub/Sub → BigQuery raw event sink: managed Pub/Sub-to-BigQuery subscriptions for melmastoon.* topics (operational + regulated retention classes).
- Curated layer ETL: scheduled jobs (Cloud Workflows + lightweight Cloud Run Jobs; Cloud Composer reserved for complex DAGs) producing curated tables and incremental MERGEs.
- Metric definitions: versioned registry; each metric has SQL template, dimension keys, freshness SLO.
- Query API: curated query endpoint, parameter binding, per-tenant authorized views.
- Dashboards & widgets: tenant-admin CRUD; widget catalog (KPI tile, time-series, breakdown, funnel, heatmap).
- Data quality checks: row-count drift, schema drift, freshness, null-rate, distinct-count guardrails.
- AI signal pipeline: publishes aggregated signals (metric.computed.v1) consumed by ai-orchestrator-service; writes back forecast results into fact_demand_forecast.
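The data quality guardrails listed above (row-count drift, null-rate, distinct-count) can be pictured as simple comparisons between two table snapshots. The following is a minimal sketch, not the service's actual check engine: TableStats, dq_breaches, and all threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Snapshot of one curated table (hypothetical shape)."""
    row_count: int
    null_count: int      # nulls in the checked column
    distinct_count: int  # distinct values in the checked column


def dq_breaches(prev: TableStats, curr: TableStats,
                max_row_drift: float = 0.2,   # assumed threshold
                max_null_rate: float = 0.01,  # assumed threshold
                min_distinct: int = 1) -> list[str]:
    """Return the names of the guardrails that the current snapshot violates."""
    breaches: list[str] = []
    if prev.row_count:
        drift = abs(curr.row_count - prev.row_count) / prev.row_count
        if drift > max_row_drift:
            breaches.append("row_count_drift")
    null_rate = curr.null_count / curr.row_count if curr.row_count else 0.0
    if null_rate > max_null_rate:
        breaches.append("null_rate")
    if curr.distinct_count < min_distinct:
        breaches.append("distinct_count")
    return breaches
```

A non-empty result would be the natural trigger for a melmastoon.analytics.data_quality.alert.v1 event, though the exact alerting path is not specified here.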
Owned aggregates (high-level)
AnalyticsEvent (raw landing in BigQuery, immutable), Projection (curated table definition), MetricDefinition, Dashboard, Widget, Query (saved query), ETLJob, DataQualityCheck. Detailed model: services/analytics-service/DOMAIN_MODEL.md.
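As a rough picture of the MetricDefinition aggregate, the sketch below models a versioned registry in which each definition carries a SQL template, dimension keys, and a freshness SLO, as described under key responsibilities. The class names, field names, and the example SQL (including its column names) are assumptions; the real model is in DOMAIN_MODEL.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    key: str
    version: int
    sql_template: str                 # named parameters such as @start_date
    dimension_keys: tuple[str, ...]
    freshness_slo_minutes: int


class MetricRegistry:
    """In-memory registry; versions are append-only and strictly increasing per key."""

    def __init__(self) -> None:
        self._defs: dict[str, list[MetricDefinition]] = {}

    def register(self, d: MetricDefinition) -> None:
        versions = self._defs.setdefault(d.key, [])
        if versions and d.version <= versions[-1].version:
            raise ValueError(f"{d.key}: version must increase")
        versions.append(d)

    def latest(self, key: str) -> MetricDefinition:
        return self._defs[key][-1]

    def get(self, key: str, version: int) -> MetricDefinition:
        for d in self._defs[key]:
            if d.version == version:
                return d
        raise KeyError((key, version))


registry = MetricRegistry()
registry.register(MetricDefinition(
    key="adr",
    version=1,
    # Hypothetical template: column names are illustrative, not the real schema.
    sql_template="SELECT SAFE_DIVIDE(SUM(room_revenue), SUM(room_nights)) "
                 "FROM fact_reservation "
                 "WHERE stay_date BETWEEN @start_date AND @end_date",
    dimension_keys=("property_id", "room_type_id"),
    freshness_slo_minutes=5,
))
```

Keeping old versions retrievable lets a saved query or dashboard pin the definition it was built against while newer widgets pick up the latest one.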
Public APIs (selection)
POST /api/v1/analytics/queries:run # ad-hoc query against curated layer
POST /api/v1/analytics/metrics/{key}:compute # compute one metric for a window
GET /api/v1/analytics/metrics # list metric definitions
POST /api/v1/analytics/dashboards # create dashboard (tenant admin)
PUT /api/v1/analytics/dashboards/{id} # update
GET /api/v1/analytics/dashboards/{id} # read
POST /api/v1/analytics/dashboards/{id}/widgets # add widget
GET /api/v1/analytics/widgets/{id}/data # render widget data
POST /api/v1/analytics/projections/{key}:refresh # manual refresh trigger (admin)
GET /api/v1/analytics/projections # list projections + freshness
GET /api/v1/analytics/data-quality # latest DQ results
GET /api/v1/analytics/etl/jobs/{id} # ETL job status
GET /internal/sync/pull # KPI snapshot pull (sync-service)
POST /internal/scheduler/etl # Cloud Scheduler / Workflows trigger
Full contracts in services/analytics-service/API_CONTRACTS.md.
Top events published
melmastoon.analytics.projection.refreshed.v1
melmastoon.analytics.projection.failed.v1
melmastoon.analytics.metric.computed.v1
melmastoon.analytics.dashboard.created.v1
melmastoon.analytics.dashboard.updated.v1
melmastoon.analytics.dashboard.shared.v1
melmastoon.analytics.query.executed.v1
melmastoon.analytics.etl.started.v1
melmastoon.analytics.etl.completed.v1
melmastoon.analytics.etl.failed.v1
melmastoon.analytics.data_quality.alert.v1
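The event names above appear to follow a melmastoon.<context>.<subject>.<action>.v<N> pattern. The sketch below splits a name into those parts; the segment interpretation is inferred from the listed names rather than from a published naming spec, and EventName and parse_event_name are hypothetical helpers.

```python
from typing import NamedTuple


class EventName(NamedTuple):
    context: str  # e.g. "analytics"
    subject: str  # e.g. "projection"; empty when the name has no subject segment
    action: str   # e.g. "refreshed"
    version: int  # e.g. 1


def parse_event_name(name: str) -> EventName:
    """Split a dotted event name into context, subject, action, and version."""
    parts = name.split(".")
    if len(parts) < 4 or parts[0] != "melmastoon":
        raise ValueError(f"unexpected event name: {name!r}")
    version_token = parts[-1]
    if not (version_token.startswith("v") and version_token[1:].isdigit()):
        raise ValueError(f"missing version suffix: {name!r}")
    return EventName(
        context=parts[1],
        subject=".".join(parts[2:-2]),
        action=parts[-2],
        version=int(version_token[1:]),
    )
```

For example, parsing melmastoon.analytics.etl.failed.v1 yields context analytics, subject etl, action failed, and version 1.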
Top events consumed
- All platform events via wildcard subscription melmastoon.* (managed Pub/Sub-to-BigQuery sink for raw landing, plus a thin worker for inbox dedupe metrics).
- melmastoon.tenant.deleted.v1: cascade purge from BigQuery (operational classes) + anonymization (regulated classes).
- melmastoon.tenant.region_changed.v1: adjust per-tenant authorized view binding.
- melmastoon.ai.forecast.produced.v1: write forecast results back into fact_demand_forecast.
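Because Pub/Sub delivers at least once, the thin worker can expect occasional redeliveries. The sketch below shows one way such a worker might count unique versus duplicate deliveries; the InboxDedupe class and the idea of keying on an event_id field are assumptions, since the event envelope is not described here.

```python
class InboxDedupe:
    """Counts unique and duplicate deliveries by event id (in-memory sketch)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.unique = 0
        self.duplicates = 0

    def observe(self, event_id: str) -> bool:
        """Record one delivery; return True only the first time event_id is seen."""
        if event_id in self._seen:
            self.duplicates += 1
            return False
        self._seen.add(event_id)
        self.unique += 1
        return True

    def duplicate_rate(self) -> float:
        """Fraction of all observed deliveries that were redeliveries."""
        total = self.unique + self.duplicates
        return self.duplicates / total if total else 0.0
```

An unbounded in-memory set is only suitable for illustration; a real worker would need a bounded or expiring store for the seen ids.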
Upstream / downstream
- Upstream: every emitting service in the platform (reservation, billing, housekeeping, inventory, staff, iam, lock, channel, etc.) and tenant-service for residency.
- Downstream: reporting-service (reads curated tables); bff-backoffice-service (dashboard widgets via the Query API); ai-orchestrator-service (consumes computed metrics, writes forecasts back); Looker Studio (tenant-admin power users via authorized views).
Non-functional requirements
- Freshness: curated fact tables refresh ≤ 5 min after upstream event for hot domains (reservation, payment, housekeeping); ≤ 15 min for cold (lock, audit) — per OBSERVABILITY.
- Query API p95: ≤ 800 ms for cached widget reads, ≤ 3 s for ad-hoc curated queries (≤ 1 GB scanned).
- Data quality: every curated table has row-count, freshness, null-rate, and distinct-count checks; alert fan-out on breach.
- Tenant isolation: BigQuery authorized views enforce row-level access via tenant_id partition + view filter; no shared service account ever issues SQL with WHERE tenant_id = … left to the caller.
- Cost: per-tenant slot reservation + per-query byte cap; daily budget alarm.
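The freshness targets above translate directly into a lag check: compare the time a curated table was refreshed with the time of the upstream event it should reflect. This is a minimal sketch under stated assumptions; the function names are hypothetical, and treating every domain not listed as hot as a cold domain is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone

# Hot domains per the freshness requirement; everything else is treated as cold (assumption).
HOT_DOMAINS = {"reservation", "payment", "housekeeping"}


def freshness_slo(domain: str) -> timedelta:
    """Return the maximum allowed lag between upstream event and curated refresh."""
    return timedelta(minutes=5) if domain in HOT_DOMAINS else timedelta(minutes=15)


def is_within_slo(domain: str, event_at: datetime, refreshed_at: datetime) -> bool:
    """True when the curated table caught up with the event inside the SLO window."""
    return refreshed_at - event_at <= freshness_slo(domain)


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
hot_ok = is_within_slo("reservation", t0, t0 + timedelta(minutes=4))
hot_late = is_within_slo("reservation", t0, t0 + timedelta(minutes=10))
cold_ok = is_within_slo("lock", t0, t0 + timedelta(minutes=10))
```

The same ten-minute lag passes for a cold domain such as lock but breaches the target for a hot domain such as reservation.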
Detailed bundle: services/analytics-service/.