SECURITY_MODEL — analytics-service

Sibling: DATA_MODEL · API_CONTRACTS · APPLICATION_LOGIC · platform anchors: docs/07 Security/Compliance/Tenancy, docs/architecture/ADR-0002

1. Authentication

Caller	Mechanism
BFF (`bff-backoffice-service`, `bff-guest-portal-service`)	Internal mTLS + signed JWT carrying `userId`, `tenantId`, `propertyAccess[]`, `roles[]`
`sync-service` → `/internal/sync/*`	Internal mTLS + service JWT (`aud=analytics-service`)
`ai-orchestrator-service` → BigQuery (curated views, writeback)	Workload Identity-bound GSA with scoped roles
Cloud Scheduler / Workflows → `/internal/scheduler/etl`	OIDC token, audience pinned, SA allow-list
Pub/Sub push subscriptions	OIDC verified, audience pinned
Looker Studio Community Connector	Per-tenant short-lived embed token (signed JWT) verified against `tenant_views.access_bindings`

End-user JWTs are issued by iam-service (15 min access, 8 h refresh) and validated against the platform JWKS. Failures return MELMASTOON.IAM.AUTH_INVALID.
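
A minimal sketch of the claim checks that could run after signature verification, assuming the JWKS check has already passed. `EndUserClaims`, `AuthInvalidError`, and `parseEndUserClaims` are hypothetical names; only the claim names and the error code come from this section.

```typescript
// Claims the BFF-signed JWT is described as carrying, plus the standard `exp`.
interface EndUserClaims {
  userId: string;
  tenantId: string;
  propertyAccess: string[];
  roles: string[];
  exp: number; // seconds since epoch
}

// Hypothetical error type carrying the documented error code.
class AuthInvalidError extends Error {
  readonly code = "MELMASTOON.IAM.AUTH_INVALID";
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Validates claim shape and expiry only. Signature verification against the
// platform JWKS is assumed to have happened before this function is called.
function parseEndUserClaims(payload: unknown, nowSeconds: number): EndUserClaims {
  if (typeof payload !== "object" || payload === null) {
    throw new AuthInvalidError("payload is not an object");
  }
  const { userId, tenantId, propertyAccess, roles, exp } = payload as Record<string, unknown>;
  if (typeof userId !== "string" || userId === "") throw new AuthInvalidError("userId missing");
  if (typeof tenantId !== "string" || tenantId === "") throw new AuthInvalidError("tenantId missing");
  if (!isStringArray(propertyAccess)) throw new AuthInvalidError("propertyAccess malformed");
  if (!isStringArray(roles)) throw new AuthInvalidError("roles malformed");
  if (typeof exp !== "number" || exp <= nowSeconds) throw new AuthInvalidError("token expired");
  return { userId, tenantId, propertyAccess, roles, exp };
}
```

Rejecting on shape before any handler runs keeps `tenantId` and `propertyAccess[]` trustworthy for the authorization and isolation layers that follow.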

2. Authorization

RBAC + ABAC (07 §4). Permissions specific to analytics:

Permission	Allows	Default roles
`analytics.viewer`	Read dashboards, widgets, metric data within property scope	`staff`, `manager`, `tenant.admin`, `tenant.owner`
`analytics.author`	Create/update dashboards, widgets, saved queries, run ad-hoc curated queries	`manager`, `tenant.admin`, `tenant.owner`
`analytics.admin`	Publish metric definitions, projections, manage DQ checks, manage budgets	`tenant.admin`, `tenant.owner`
`analytics.looker_share`	Issue Looker Studio embed tokens	`tenant.admin`, `tenant.owner`
`analytics.budget_admin`	View per-tenant byte/cost projections	`tenant.owner`

ABAC overlays:

Property scope. Every widget filter is intersected with jwt.propertyAccess[]. Empty intersection → MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION.
Dashboard scope vs property. A scope='property' dashboard is invisible to users without that property in scope.
Saved query ownership. Mutations on saved queries restricted to owner unless caller has analytics.admin.

Decisions are evaluated at the application boundary and re-evaluated at the repository layer (defense in depth). Denials emit analytics.access_denied audit events.
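
The property-scope and saved-query overlays above can be sketched as two small pure functions. This is an illustration under stated assumptions: `intersectPropertyScope`, `canMutateSavedQuery`, and the `Caller` shape are hypothetical names, and the widget filter is assumed to be a list of property ids.

```typescript
// Hypothetical error type carrying the documented error code.
class PropertyScopeViolation extends Error {
  readonly code = "MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION";
}

// Intersects the property ids requested by a widget filter with
// jwt.propertyAccess[]. An empty intersection is a scope violation.
function intersectPropertyScope(
  requested: readonly string[],
  propertyAccess: readonly string[],
): string[] {
  const allowed = new Set(propertyAccess);
  const effective = requested.filter((id) => allowed.has(id));
  if (effective.length === 0) {
    throw new PropertyScopeViolation("no requested property is in the caller's scope");
  }
  return effective;
}

// Assumed caller shape: the resolved permission names for this request.
interface Caller {
  userId: string;
  permissions: readonly string[];
}

// Saved-query mutations: the owner, or anyone holding analytics.admin.
function canMutateSavedQuery(caller: Caller, ownerUserId: string): boolean {
  return caller.userId === ownerUserId || caller.permissions.includes("analytics.admin");
}
```

Because both functions are pure, the same check can be called at the application boundary and again in the repository layer without sharing state.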

3. Tenant isolation

Three independently sufficient layers:

JWT claim. tenantId verified and bound to req.ctx.tenantId.
Postgres RLS. Every metadata table has a <table>_tenant_isolation policy; the connection pool sets app.tenant_id per transaction.
BigQuery authorized views + binding-scoped UDF. Tenants and Looker Studio tokens never see raw curated tables. They access tenant_views.<table> views that filter by SESSION_USER_TENANT_ID(), a UDF that resolves tenant_views.access_bindings. The Query API binds @tenant_id from the JWT (never from the body) when calling BigQuery and uses a per-tenant principal (looker-studio-<tenantId>@…iam.gserviceaccount.com) for embed sessions.
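
A hedged sketch of how the second and third layers might bind the tenant on the application side. `tenantScopeStatement` and `bindQueryParams` are hypothetical helpers. The first returns a parameterized statement that uses PostgreSQL's `set_config(name, value, is_local)` so the setting lasts only for the current transaction. The second shows one way to guarantee that `@tenant_id` comes from the JWT context and never from the request body.

```typescript
// Mirrors req.ctx after JWT verification; only tenantId is needed here.
interface RequestContext {
  tenantId: string;
}

interface SqlStatement {
  text: string;
  values: string[];
}

// Statement to run first inside each transaction so RLS policies can read
// app.tenant_id. The third argument `true` scopes it to the transaction.
function tenantScopeStatement(ctx: RequestContext): SqlStatement {
  return { text: "SELECT set_config('app.tenant_id', $1, true)", values: [ctx.tenantId] };
}

type QueryParamValue = string | number | boolean;

// Named parameters for a curated query. Whatever the body supplies under
// tenant_id is overwritten: the value from the JWT context always wins.
function bindQueryParams(
  ctx: RequestContext,
  bodyParams: Readonly<Record<string, QueryParamValue>>,
): Record<string, QueryParamValue> {
  return { ...bodyParams, tenant_id: ctx.tenantId };
}
```

Overwriting rather than rejecting a body-supplied `tenant_id` is a design choice made for this sketch; a stricter variant could refuse the request outright.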

No managed service account ever runs SQL against curated tables on behalf of a tenant without an authorized-view binding. analytics-service's own GSA has read access to curated tables for ETL but its calls always include tenant_id predicates and labels.

The cross-tenant integration test (test/integration/tenant-isolation.spec.ts) is mandatory: it sets up two tenants, authenticates with tenant A's JWT, and asserts that every read returns zero rows belonging to tenant B and that every write targeting one of B's ids returns 404.
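
The shape of that assertion can be illustrated with an in-memory stand-in. This is not the contents of tenant-isolation.spec.ts; `InMemoryDashboards` and `checkIsolation` are hypothetical, and a real test would exercise the HTTP API against Postgres and BigQuery.

```typescript
interface DashboardRow {
  id: string;
  tenantId: string;
  title: string;
}

// Hypothetical tenant-scoped store: every operation takes the caller's tenant.
class InMemoryDashboards {
  private readonly rows: DashboardRow[];

  constructor(seed: readonly DashboardRow[]) {
    this.rows = seed.map((r) => ({ ...r }));
  }

  list(callerTenantId: string): DashboardRow[] {
    return this.rows.filter((r) => r.tenantId === callerTenantId);
  }

  // Rows of other tenants are indistinguishable from missing rows: 404.
  rename(callerTenantId: string, id: string, title: string): 200 | 404 {
    const row = this.rows.find((r) => r.id === id && r.tenantId === callerTenantId);
    if (row === undefined) return 404;
    row.title = title;
    return 200;
  }
}

// Returns human-readable violations; an empty list means the store is isolated.
function checkIsolation(
  store: InMemoryDashboards,
  callerTenantId: string,
  otherTenantRowIds: readonly string[],
): string[] {
  const violations: string[] = [];
  for (const row of store.list(callerTenantId)) {
    if (row.tenantId !== callerTenantId) violations.push(`read leaked ${row.id}`);
  }
  for (const id of otherTenantRowIds) {
    if (store.rename(callerTenantId, id, "hijacked") !== 404) {
      violations.push(`write to ${id} did not return 404`);
    }
  }
  return violations;
}
```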

4. Data classification & encryption

Field	Classification	Treatment
Metric/projection definitions	Internal	TLS-only
Dashboards, widgets, saved queries	Tenant data	TLS-only; OCC enforced
Looker Studio embed tokens	Secret	Signed JWT, ≤ 60 min; never logged; revocable via `tenant_views.access_bindings` removal
Curated tables (`fact_`, `dim_`)	Tenant data	At-rest CMEK key `projects/<p>/locations/<r>/keyRings/melmastoon-analytics/cryptoKeys/curated`; TLS-only
Raw events (`events_raw.*`)	Tenant data + envelope metadata	CMEK; access restricted to ETL GSA only
DQ historical results	Internal	CMEK
Logs	Internal	PII redactor strips email/phone patterns; never include raw event payloads

CMEK rotation is automated annually; old key versions remain decrypt-only.
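
For the Logs row above, a minimal sketch of a pattern-based redactor. The two regular expressions are illustrative assumptions, not the service's actual patterns, and `redactPii` is a hypothetical name. The phone pattern is deliberately broad and will also mask other long digit runs, which is usually an acceptable trade-off for logs.

```typescript
// Illustrative patterns only: a simple email shape and a loose phone shape
// (optional +, then at least 8 characters of digits and common separators).
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{6,}\d/g;

// Replaces email- and phone-shaped substrings before a line is written.
function redactPii(message: string): string {
  return message.replace(EMAIL_PATTERN, "[email]").replace(PHONE_PATTERN, "[phone]");
}
```

Running the redactor inside the logger, rather than at call sites, means a forgotten call cannot leak a raw value.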

5. Secret handling

Source of truth: Google Secret Manager.
Bootstrap: resolve secret resource paths at startup; never write secrets to disk.
Rotation: 5-min refresh; hot rotation supported.
Looker Studio embed signing key: stored in KMS asymmetric key melmastoon-analytics-embed-signer; we never possess the private key material directly.
Forbidden: logging secret payloads; exposing secrets through environment variables that appear in CI logs.

CI runs gitleaks and denies merges that touch secret-shaped strings.
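
The 5-minute refresh can be sketched as an in-memory holder that re-reads a secret once its cached copy is older than the refresh interval. `RefreshingSecret` and `SecretLoader` are hypothetical; the loader is injected, so nothing here depends on a particular Secret Manager client, and the value is never written to disk.

```typescript
// Assumed loader signature: resolves the current payload for a resource path.
type SecretLoader = (resourcePath: string) => Promise<string>;

class RefreshingSecret {
  private value: string | undefined = undefined;
  private loadedAtMs = 0;
  private readonly resourcePath: string;
  private readonly load: SecretLoader;
  private readonly now: () => number;
  private readonly ttlMs: number;

  constructor(
    resourcePath: string,
    load: SecretLoader,
    now: () => number = Date.now,
    ttlMs: number = 5 * 60 * 1000, // 5-minute refresh interval
  ) {
    this.resourcePath = resourcePath;
    this.load = load;
    this.now = now;
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, reloading when it is missing or stale.
  // A rotated secret is therefore picked up within one refresh interval.
  async get(): Promise<string> {
    let current = this.value;
    if (current === undefined || this.now() - this.loadedAtMs >= this.ttlMs) {
      current = await this.load(this.resourcePath);
      this.value = current;
      this.loadedAtMs = this.now();
    }
    return current;
  }
}
```

Injecting the clock keeps the refresh behavior testable without waiting for real time to pass.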

6. Audit trail

Every mutation and every denied access emits to audit-service (07 §9):

Verb	Subject
`metric.published`	`met_…`
`metric.archived`	`met_…`
`projection.published`	`prj_…`
`dashboard.created/updated/deleted`	`dsh_…`
`widget.added/updated/removed`	`wid_…`
`query.run`	`qry_…` (sampled)
`dashboard.shared`	tokens issued
`looker.token_issued`	tenant + ttl
`etl.run.completed/failed`	`etr_…`
`dq.alert`	`dqc_…`
`binding.created/revoked`	`(principal, tenantId)`
`access.denied`	resource + permission + reason

Audit events include actor, causedBy.correlationId, tenantId, ip, userAgent, aiProvenance?. They participate in the daily Merkle anchor.
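
A sketch of the event envelope implied by the field list above, with a builder for the `access.denied` verb. The TypeScript types are assumptions (the source names the fields but not their types), and `accessDeniedEvent` and `RequestMeta` are hypothetical.

```typescript
interface AuditEvent {
  verb: string;
  subject: string | Record<string, string>;
  actor: string;
  causedBy: { correlationId: string };
  tenantId: string;
  ip: string;
  userAgent: string;
  aiProvenance?: unknown; // optional; its shape is not specified in this document
}

// Assumed per-request metadata available to the emitting code.
interface RequestMeta {
  userId: string;
  tenantId: string;
  correlationId: string;
  ip: string;
  userAgent: string;
}

// Builds the access.denied event: subject is resource + permission + reason.
function accessDeniedEvent(
  meta: RequestMeta,
  resource: string,
  permission: string,
  reason: string,
): AuditEvent {
  return {
    verb: "access.denied",
    subject: { resource, permission, reason },
    actor: meta.userId,
    causedBy: { correlationId: meta.correlationId },
    tenantId: meta.tenantId,
    ip: meta.ip,
    userAgent: meta.userAgent,
  };
}
```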

7. AI safety

All AI calls flow through ai-orchestrator-service (AI_INTEGRATION §1-§3).
Prompts never include guest PII; only aggregated facts.
Outputs that influence widgets carry an "AI-generated, review before sharing" badge.
Tenant admin can disable any analytics AI capability; the off-switch is enforced server-side.
AI-drafted widgets require human confirmation before publication.
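
Two of the rules above, the server-side off-switch and human confirmation, lend themselves to a small guard. The sketch below is illustrative only: `TenantAiSettings`, `WidgetDraft`, and both function names are hypothetical, and capability names are assumed to be plain strings.

```typescript
// Assumed per-tenant settings: capabilities a tenant admin has switched off.
interface TenantAiSettings {
  disabledCapabilities: readonly string[];
}

// Assumed draft shape: whether AI produced it and who confirmed it, if anyone.
interface WidgetDraft {
  id: string;
  aiDrafted: boolean;
  confirmedByUserId?: string;
}

// Server-side off-switch: evaluated on every request, never trusted from the client.
function isAiCapabilityEnabled(settings: TenantAiSettings, capability: string): boolean {
  return !settings.disabledCapabilities.includes(capability);
}

// AI-drafted widgets need a recorded human confirmation before publication.
function canPublishWidget(draft: WidgetDraft): boolean {
  return !draft.aiDrafted || draft.confirmedByUserId !== undefined;
}
```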

8. Network & egress

All inbound traffic via Cloud Run "internal-and-cloud-load-balancing"; public path only via the API gateway.
Egress is limited to Looker Studio (Google, external) and ai-orchestrator-service (internal).
Workload Identity bindings enforce GSA-only access to BigQuery.

9. Threat model (focused)

Threat	Mitigation
Cross-tenant query via crafted SQL	JWT claim binding + authorized views + UDF tenant resolution + linter forbids `tenant_id` overrides
Looker Studio token replay after revocation	Tokens short-lived; binding revocation immediate; embed sessions re-validated per page load
Rogue saved query exfiltrating data via parameter injection	Saved queries parsed at save-time; only parameter binding, no string concatenation; allowlist of datasets/tables
BigQuery slot exhaustion (DoS)	Reservation pricing; per-tenant byte budget; slot autoscale ceiling
Schema drift breaks dashboards	`_schema_version` pin + DQ schema-drift check; coexistence v1 → v2
ETL job replay double-write	MERGE keys idempotent; per-projection lock; outbox dedupe
Forecast writeback poisoning	Validate envelope tenant; per-row tenant check; `MELMASTOON.ANALYTICS.FORECAST_INVALID_TENANT` on mismatch
Pub/Sub spoofing	OIDC verification + audience pin
Embed token signing key leak	KMS asymmetric key; key never leaves KMS; rotation runbook
Access-binding drift in `tenant_views.access_bindings`	Daily reconciliation against `iam-service` truth; alert on diff
Cost runaway via abusive dashboards	Daily byte budget per tenant; auto-pause snapshot generators when 80 % spent
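
The cost-runaway mitigation in the last row can be sketched as a pure decision function. `evaluateByteBudget` and `BudgetDecision` are hypothetical names; the 80 % threshold comes from the table, and integer arithmetic is used to avoid floating-point edge cases at the boundary.

```typescript
interface BudgetDecision {
  pauseSnapshotGenerators: boolean;
  remainingBytes: number;
}

// Compares today's scanned bytes against the tenant's daily byte budget.
// Snapshot generators are paused once usage reaches the pause percentage.
function evaluateByteBudget(
  bytesUsedToday: number,
  dailyBudgetBytes: number,
  pausePercent: number = 80,
): BudgetDecision {
  if (dailyBudgetBytes <= 0) {
    throw new RangeError("dailyBudgetBytes must be positive");
  }
  return {
    // Equivalent to used / budget >= pausePercent / 100, without division.
    pauseSnapshotGenerators: bytesUsedToday * 100 >= dailyBudgetBytes * pausePercent,
    remainingBytes: Math.max(0, dailyBudgetBytes - bytesUsedToday),
  };
}
```

For a 100-byte budget, 79 bytes used leaves generators running and 80 bytes used pauses them.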

10. Compliance

GDPR / DPDP / KSA PDPL: PII inventory tracked in docs/07 §11; raw events with PII have residency-aware retention.
Data residency: Cloud SQL and BigQuery datasets are regional; deployment topology fans out per residency. Cross-region replication is forbidden.
Right to erasure: the service consumes melmastoon.tenant.deleted.v1; PurgeTenantUseCase then drops authorized-view bindings, deletes operational rows, and anonymizes regulated rows.
Logging vs PII: strict redaction at the logger level + tests asserting no raw payload fields appear in stdout.

Cross-references: DATA_MODEL §3 RLS, API_CONTRACTS §0, docs/07.
