SECURITY_MODEL — analytics-service

Sibling: DATA_MODEL · API_CONTRACTS · APPLICATION_LOGIC · platform anchors: docs/07 Security/Compliance/Tenancy, docs/architecture/ADR-0002


1. Authentication

| Caller | Mechanism |
| --- | --- |
| BFF (bff-backoffice-service, bff-guest-portal-service) | Internal mTLS + signed JWT carrying userId, tenantId, propertyAccess[], roles[] |
| sync-service → /internal/sync/* | Internal mTLS + service JWT (aud=analytics-service) |
| ai-orchestrator-service → BigQuery (curated views, writeback) | Workload Identity-bound GSA with scoped roles |
| Cloud Scheduler / Workflows → /internal/scheduler/etl | OIDC token, audience pinned, SA allow-list |
| Pub/Sub push subscriptions | OIDC verified, audience pinned |
| Looker Studio Community Connector | Per-tenant short-lived embed token (signed JWT) verified against tenant_views.access_bindings |

End-user JWTs are issued by iam-service (15 min access, 8 h refresh) and validated against the platform JWKS. Failures return MELMASTOON.IAM.AUTH_INVALID.
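The claim checks implied above can be sketched as a pure function. This is a minimal illustration, not the service's actual code: it assumes signature verification against the platform JWKS has already happened, and the names `UserClaims`, `AuthResult`, and `validateUserClaims` are hypothetical.

```typescript
// Hypothetical shape of the decoded end-user JWT claims listed in the table above.
interface UserClaims {
  userId: string;
  tenantId: string;
  propertyAccess: string[];
  roles: string[];
  exp: number; // expiry, seconds since epoch
}

type AuthResult =
  | { ok: true; claims: UserClaims }
  | { ok: false; code: "MELMASTOON.IAM.AUTH_INVALID" };

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Validates the payload AFTER the signature has been verified elsewhere
// (JWKS verification is deliberately out of scope for this sketch).
function validateUserClaims(
  payload: Record<string, unknown>,
  nowSec: number,
): AuthResult {
  const invalid: AuthResult = { ok: false, code: "MELMASTOON.IAM.AUTH_INVALID" };
  const { userId, tenantId, propertyAccess, roles, exp } = payload;
  if (typeof userId !== "string" || userId === "") return invalid;
  if (typeof tenantId !== "string" || tenantId === "") return invalid;
  if (!isStringArray(propertyAccess) || !isStringArray(roles)) return invalid;
  if (typeof exp !== "number" || exp <= nowSec) return invalid; // expired or missing
  return { ok: true, claims: { userId, tenantId, propertyAccess, roles, exp } };
}
```

Passing the clock in as `nowSec` keeps the check deterministic in tests.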


2. Authorization

RBAC + ABAC (07 §4). Permissions specific to analytics:

| Permission | Allows | Default roles |
| --- | --- | --- |
| analytics.viewer | Read dashboards, widgets, metric data within property scope | staff, manager, tenant.admin, tenant.owner |
| analytics.author | Create/update dashboards, widgets, saved queries, run ad-hoc curated queries | manager, tenant.admin, tenant.owner |
| analytics.admin | Publish metric definitions, projections, manage DQ checks, manage budgets | tenant.admin, tenant.owner |
| analytics.looker_share | Issue Looker Studio embed tokens | tenant.admin, tenant.owner |
| analytics.budget_admin | View per-tenant byte/cost projections | tenant.owner |
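As an illustration, the default role grants in the table could be encoded as a lookup. The permission and role names come from the table; the `DEFAULT_ROLE_GRANTS` map and `hasPermission` helper are hypothetical, and a real deployment may allow per-tenant overrides that this sketch ignores.

```typescript
type AnalyticsPermission =
  | "analytics.viewer"
  | "analytics.author"
  | "analytics.admin"
  | "analytics.looker_share"
  | "analytics.budget_admin";

// Default roles per permission, copied from the table above.
const DEFAULT_ROLE_GRANTS: Record<AnalyticsPermission, readonly string[]> = {
  "analytics.viewer": ["staff", "manager", "tenant.admin", "tenant.owner"],
  "analytics.author": ["manager", "tenant.admin", "tenant.owner"],
  "analytics.admin": ["tenant.admin", "tenant.owner"],
  "analytics.looker_share": ["tenant.admin", "tenant.owner"],
  "analytics.budget_admin": ["tenant.owner"],
};

// True when at least one of the caller's roles is granted the permission.
function hasPermission(
  roles: readonly string[],
  permission: AnalyticsPermission,
): boolean {
  const granted = DEFAULT_ROLE_GRANTS[permission];
  return roles.some((role) => granted.includes(role));
}
```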

ABAC overlays:

  • Property scope. Every widget filter is intersected with jwt.propertyAccess[]. Empty intersection → MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION.
  • Dashboard scope vs property. A scope='property' dashboard is invisible to users without that property in scope.
  • Saved query ownership. Mutations on saved queries are restricted to the owner unless the caller has analytics.admin.

Decisions are evaluated at the application boundary and re-evaluated at the repository layer (defense-in-depth). Denials emit analytics.access_denied audit events.
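The property-scope overlay amounts to a set intersection. The sketch below assumes a widget filter is a list of property ids; `scopeProperties` and `PropertyScopeViolation` are hypothetical names, and the sketch treats an empty requested filter as an empty intersection, which is an assumption rather than documented behavior.

```typescript
class PropertyScopeViolation extends Error {
  readonly code = "MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION";
  constructor() {
    super("requested properties are outside the caller's property scope");
  }
}

// Intersects the requested property filter with jwt.propertyAccess[].
// Throws when nothing survives the intersection.
function scopeProperties(
  requested: readonly string[],
  propertyAccess: readonly string[],
): string[] {
  const allowed = new Set(propertyAccess);
  const effective = requested.filter((id) => allowed.has(id));
  if (effective.length === 0) {
    throw new PropertyScopeViolation();
  }
  return effective;
}
```

Running the same function at the application boundary and again in the repository layer would be one way to realize the defense-in-depth described above.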


3. Tenant isolation

Three independently sufficient layers:

  1. JWT claim. tenantId verified and bound to req.ctx.tenantId.
  2. Postgres RLS. Every metadata table has a <table>_tenant_isolation policy; the pool sets app.tenant_id per transaction.
  3. BigQuery authorized views + binding-scoped UDF. Tenants and Looker Studio tokens never see raw curated tables. They access tenant_views.<table> views that filter by SESSION_USER_TENANT_ID(), a UDF that resolves tenant_views.access_bindings. The Query API binds @tenant_id from the JWT (never from the body) when calling BigQuery and uses a per-tenant principal (looker-studio-<tenantId>@…iam.gserviceaccount.com) for embed sessions.

No managed service account ever runs SQL against curated tables on behalf of a tenant without an authorized-view binding. analytics-service's own GSA has read access to curated tables for ETL but its calls always include tenant_id predicates and labels.
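To make the "bind @tenant_id from the JWT, never from the body" rule concrete, here is a minimal sketch. It builds a parameterized request object only and does not call BigQuery; `RequestContext`, `BoundQuery`, and `bindTenant` are hypothetical names, and the substring check for the predicate is deliberately crude.

```typescript
interface RequestContext {
  tenantId: string; // taken from the verified JWT, e.g. req.ctx.tenantId
}

interface BoundQuery {
  sql: string;
  params: Record<string, string | number>;
  labels: Record<string, string>;
}

// Builds query parameters so that tenant_id always comes from the JWT context.
// Any tenant_id supplied in the request body is overwritten.
function bindTenant(
  ctx: RequestContext,
  sql: string,
  bodyParams: Record<string, string | number>,
): BoundQuery {
  // Crude guard (an assumption of this sketch): refuse SQL without the predicate.
  if (!sql.includes("@tenant_id")) {
    throw new Error("query must filter on @tenant_id");
  }
  return {
    sql,
    params: { ...bodyParams, tenant_id: ctx.tenantId }, // JWT value wins
    labels: { tenant_id: ctx.tenantId },
  };
}
```

Because the JWT value is spread last, a body that tries to smuggle in another tenant's id has no effect on the bound parameter.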

The cross-tenant integration test (test/integration/tenant-isolation.spec.ts) is mandatory: with two tenants and tenant A's JWT, every read returns zero rows of B and every write (with B's id) returns 404.


4. Data classification & encryption

| Field | Classification | Treatment |
| --- | --- | --- |
| Metric/projection definitions | Internal | TLS-only |
| Dashboards, widgets, saved queries | Tenant data | TLS-only; OCC enforced |
| Looker Studio embed tokens | Secret | Signed JWT, ≤ 60 min; never logged; revocable via tenant_views.access_bindings removal |
| Curated tables (fact_*, dim_*) | Tenant data | At-rest CMEK key projects/<p>/locations/<r>/keyRings/melmastoon-analytics/cryptoKeys/curated; TLS-only |
| Raw events (events_raw.*) | Tenant data + envelope metadata | CMEK; access restricted to ETL GSA only |
| DQ historical results | Internal | CMEK |
| Logs | Internal | PII redactor strips email/phone patterns; never include raw event payloads |
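The embed-token row combines two checks: a lifetime cap of 60 minutes and revocation through binding removal. A hedged sketch of both follows. The claim names (`principal`, `iat`, `exp`) and the in-memory `activeBindings` set standing in for tenant_views.access_bindings are assumptions for illustration.

```typescript
const MAX_EMBED_TTL_SEC = 60 * 60; // "≤ 60 min" from the table above

// Hypothetical decoded embed-token claims.
interface EmbedTokenClaims {
  tenantId: string;
  principal: string;
  iat: number; // issued-at, seconds since epoch
  exp: number; // expiry, seconds since epoch
}

// Stand-in key format for a (principal, tenantId) binding row.
const bindingKey = (principal: string, tenantId: string): string =>
  `${principal}|${tenantId}`;

function isEmbedTokenAcceptable(
  claims: EmbedTokenClaims,
  activeBindings: ReadonlySet<string>,
  nowSec: number,
): boolean {
  const ttl = claims.exp - claims.iat;
  if (ttl <= 0 || ttl > MAX_EMBED_TTL_SEC) return false; // lifetime too long or malformed
  if (claims.exp <= nowSec) return false; // already expired
  // Removing the binding revokes the token even if it has not expired yet.
  return activeBindings.has(bindingKey(claims.principal, claims.tenantId));
}
```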

CMEK rotation is automated annually; old key versions remain decrypt-only.


5. Secret handling

  • Source of truth: Google Secret Manager.
  • Bootstrap: resolve secret resource paths at startup; never write secrets to disk.
  • Rotation: 5-min refresh; hot rotation supported.
  • Looker Studio embed signing key: stored in KMS asymmetric key melmastoon-analytics-embed-signer; we never possess the private key material directly.
  • Forbidden: logging secret payloads; exposing secrets through env vars that appear in CI logs.

CI runs gitleaks and blocks merges that introduce secret-shaped strings.
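The 5-minute refresh could look roughly like the in-memory cache below. This is a sketch under stated assumptions: `SecretCache` and `needsRefresh` are hypothetical names, and `fetchSecret` is an injected placeholder for whatever Secret Manager client call the service actually uses. Nothing is written to disk.

```typescript
const REFRESH_MS = 5 * 60 * 1000; // "5-min refresh" from the list above

// True when the cached value is missing or at least REFRESH_MS old.
function needsRefresh(fetchedAt: number | undefined, now: number): boolean {
  return fetchedAt === undefined || now - fetchedAt >= REFRESH_MS;
}

class SecretCache {
  private value: string | undefined;
  private fetchedAt: number | undefined;
  private readonly fetchSecret: () => Promise<string>;
  private readonly now: () => number;

  // fetchSecret is a placeholder for the real secret lookup (an assumption).
  constructor(fetchSecret: () => Promise<string>, now: () => number = Date.now) {
    this.fetchSecret = fetchSecret;
    this.now = now;
  }

  // Returns the cached secret, re-fetching once it is older than REFRESH_MS.
  async get(): Promise<string> {
    if (this.value === undefined || needsRefresh(this.fetchedAt, this.now())) {
      this.value = await this.fetchSecret();
      this.fetchedAt = this.now();
    }
    return this.value;
  }
}
```

Because callers always go through `get()`, a rotated secret is picked up within one refresh interval without a restart, which is one way to read "hot rotation supported".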


6. Audit trail

Every mutation and every denied access emits an event to audit-service (07 §9):

| Verb | Subject |
| --- | --- |
| metric.published | met_… |
| metric.archived | met_… |
| projection.published | prj_… |
| dashboard.created/updated/deleted | dsh_… |
| widget.added/updated/removed | wid_… |
| query.run | qry_… (sampled) |
| dashboard.shared | tokens issued |
| looker.token_issued | tenant + ttl |
| etl.run.completed/failed | etr_… |
| dq.alert | dqc_… |
| binding.created/revoked | (principal, tenantId) |
| access.denied | resource + permission + reason |

Audit events include actor, causedBy.correlationId, tenantId, ip, userAgent, aiProvenance?. They participate in the daily Merkle anchor.
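For orientation, the listed fields can be written down as a type. The field names follow the sentence above; the field types, the `subject` encoding, and the `accessDeniedEvent` helper are assumptions of this sketch, not the audit-service contract.

```typescript
interface AuditEvent {
  verb: string;
  subject: string;
  actor: string;
  causedBy: { correlationId: string };
  tenantId: string;
  ip: string;
  userAgent: string;
  aiProvenance?: string; // optional, as marked by the "?" above; type is assumed
}

// Hypothetical per-request context used to fill the common fields.
interface AuditContext {
  actor: string;
  correlationId: string;
  tenantId: string;
  ip: string;
  userAgent: string;
}

// Builds an access.denied event whose subject carries resource + permission + reason.
function accessDeniedEvent(
  ctx: AuditContext,
  resource: string,
  permission: string,
  reason: string,
): AuditEvent {
  return {
    verb: "access.denied",
    subject: `${resource} ${permission} ${reason}`,
    actor: ctx.actor,
    causedBy: { correlationId: ctx.correlationId },
    tenantId: ctx.tenantId,
    ip: ctx.ip,
    userAgent: ctx.userAgent,
  };
}
```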


7. AI safety

  • All AI calls flow through ai-orchestrator-service (AI_INTEGRATION §1-§3).
  • Prompts never include guest PII; only aggregated facts.
  • Outputs that influence widgets carry an "AI-generated, review before sharing" badge.
  • Tenant admin can disable any analytics AI capability; the off-switch is enforced server-side.
  • AI-drafted widgets require human confirmation before publication.
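Two of the rules above, the server-side off-switch and human confirmation before publication, reduce to small guards. The types and function names below are hypothetical, and how tenant settings are stored is not specified by this document.

```typescript
// Hypothetical per-tenant AI settings loaded server-side.
interface TenantAiSettings {
  disabledCapabilities: ReadonlySet<string>;
}

// Hypothetical widget draft; confirmedBy is set once a human approves it.
interface WidgetDraft {
  id: string;
  aiDrafted: boolean;
  confirmedBy?: string;
}

// Server-side off-switch: refuse the call when the tenant disabled the capability.
function assertAiCapabilityEnabled(
  settings: TenantAiSettings,
  capability: string,
): void {
  if (settings.disabledCapabilities.has(capability)) {
    throw new Error(`AI capability disabled for tenant: ${capability}`);
  }
}

// AI-drafted widgets are publishable only after human confirmation.
function canPublish(draft: WidgetDraft): boolean {
  return !draft.aiDrafted || draft.confirmedBy !== undefined;
}
```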

8. Network & egress

  • All inbound traffic arrives through the Cloud Run "internal-and-cloud-load-balancing" ingress setting; the only public path is via the API gateway.
  • Egress is limited to Looker Studio (Google), the only external provider, and ai-orchestrator-service (internal).
  • Workload Identity bindings enforce GSA-only access to BigQuery.

9. Threat model (focused)

| Threat | Mitigation |
| --- | --- |
| Cross-tenant query via crafted SQL | JWT claim binding + authorized views + UDF tenant resolution + linter forbids tenant_id overrides |
| Looker Studio token replay after revocation | Tokens short-lived; binding revocation immediate; embed sessions re-validated per page load |
| Rogue saved query exfiltrating data via parameter injection | Saved queries parsed at save-time; only parameter binding, no string concatenation; allowlist of datasets/tables |
| BigQuery slot exhaustion (DoS) | Reservation pricing; per-tenant byte budget; slot autoscale ceiling |
| Schema drift breaks dashboards | _schema_version pin + DQ schema-drift check; coexistence v1 → v2 |
| ETL job replay double-write | MERGE keys idempotent; per-projection lock; outbox dedupe |
| Forecast writeback poisoning | Validate envelope tenant; per-row tenant check; MELMASTOON.ANALYTICS.FORECAST_INVALID_TENANT on mismatch |
| Pub/Sub spoofing | OIDC verification + audience pin |
| Embed token signing key leak | KMS asymmetric key; key never leaves KMS; rotation runbook |
| Access-binding drift in tenant_views.access_bindings | Daily reconciliation against iam-service truth; alert on diff |
| Cost runaway via abusive dashboards | Daily byte budget per tenant; auto-pause snapshot generators when 80% spent |
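The cost-runaway mitigation in the last row can be sketched as a threshold check. The 80% pause point comes from the table; the `BudgetAction` names and the choice to reject queries once the budget is fully spent are assumptions of this sketch.

```typescript
const PAUSE_THRESHOLD = 0.8; // "auto-pause snapshot generators when 80% spent"

type BudgetAction = "allow" | "pause_snapshots" | "reject";

// Decides what to do given today's scanned bytes and the tenant's daily budget.
function budgetAction(
  bytesUsedToday: number,
  dailyBudgetBytes: number,
): BudgetAction {
  // Assumed behavior: a non-positive or exhausted budget rejects further queries.
  if (dailyBudgetBytes <= 0 || bytesUsedToday >= dailyBudgetBytes) {
    return "reject";
  }
  return bytesUsedToday / dailyBudgetBytes >= PAUSE_THRESHOLD
    ? "pause_snapshots"
    : "allow";
}
```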

10. Compliance

  • GDPR / DPDP / KSA PDPL: PII inventory tracked in docs/07 §11; raw events with PII have residency-aware retention.
  • Data residency: Cloud SQL and BigQuery datasets are regional; deployment topology fans out per residency. Cross-region replication is forbidden.
  • Right to erasure: the service consumes melmastoon.tenant.deleted.v1; PurgeTenantUseCase drops authorized view bindings, deletes operational rows, and anonymizes regulated rows.
  • Logging vs PII: strict redaction at the logger level + tests asserting no raw payload fields appear in stdout.
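A logger-level redactor for email and phone patterns might look like the sketch below. The regular expressions are illustrative assumptions, not the service's actual patterns, and will miss some formats. The `redact` function and the placeholder tokens are hypothetical.

```typescript
// Illustrative patterns only; real-world email/phone matching is messier.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

// Replaces email- and phone-shaped substrings before a message is logged.
function redact(message: string): string {
  return message
    .replace(EMAIL_PATTERN, "[email]")
    .replace(PHONE_PATTERN, "[phone]");
}
```

A test in the spirit of the bullet above would log a message containing known PII and assert that none of it appears in captured stdout.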

Cross-references: DATA_MODEL §3 RLS, API_CONTRACTS §0, docs/07.