SECURITY_MODEL — analytics-service
Sibling: DATA_MODEL · API_CONTRACTS · APPLICATION_LOGIC · platform anchors: docs/07 Security/Compliance/Tenancy, docs/architecture/ADR-0002
1. Authentication
| Caller | Mechanism |
|---|---|
| BFF (bff-backoffice-service, bff-guest-portal-service) | Internal mTLS + signed JWT carrying userId, tenantId, propertyAccess[], roles[] |
| sync-service → /internal/sync/* | Internal mTLS + service JWT (aud=analytics-service) |
| ai-orchestrator-service → BigQuery (curated views, writeback) | Workload Identity-bound GSA with scoped roles |
| Cloud Scheduler / Workflows → /internal/scheduler/etl | OIDC token, audience pinned, SA allow-list |
| Pub/Sub push subscriptions | OIDC verified, audience pinned |
| Looker Studio Community Connector | Per-tenant short-lived embed token (signed JWT) verified against tenant_views.access_bindings |
End-user JWTs are issued by iam-service (15 min access, 8 h refresh) and validated against the platform JWKS. Failures return MELMASTOON.IAM.AUTH_INVALID.
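
The claim checks that follow signature verification could be sketched as below. This is a minimal illustration, assuming the signature has already been verified against the platform JWKS; the `BffClaims` shape, `checkClaims`, and the `exp` handling are hypothetical names and choices, with only the claim list and the error code taken from the text above.

```typescript
// Claims the BFF-signed JWT is expected to carry (per the table above).
interface BffClaims {
  userId: string;
  tenantId: string;
  propertyAccess: string[];
  roles: string[];
  exp: number; // expiry, seconds since epoch (standard JWT claim)
}

const AUTH_INVALID = "MELMASTOON.IAM.AUTH_INVALID";

interface ClaimCheck {
  claims: BffClaims | null;
  error: string | null;
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Hypothetical helper: runs AFTER signature verification (not shown).
function checkClaims(payload: Record<string, unknown>, nowSec: number): ClaimCheck {
  const { userId, tenantId, propertyAccess, roles, exp } = payload;
  if (
    typeof userId !== "string" ||
    typeof tenantId !== "string" ||
    !isStringArray(propertyAccess) ||
    !isStringArray(roles) ||
    typeof exp !== "number" ||
    exp <= nowSec
  ) {
    return { claims: null, error: AUTH_INVALID };
  }
  return { claims: { userId, tenantId, propertyAccess, roles, exp }, error: null };
}
```

Returning a single error code for every failure mode keeps the response from revealing which claim was wrong.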
2. Authorization
RBAC + ABAC (07 §4). Permissions specific to analytics:
| Permission | Allows | Default roles |
|---|---|---|
| analytics.viewer | Read dashboards, widgets, metric data within property scope | staff, manager, tenant.admin, tenant.owner |
| analytics.author | Create/update dashboards, widgets, saved queries, run ad-hoc curated queries | manager, tenant.admin, tenant.owner |
| analytics.admin | Publish metric definitions, projections, manage DQ checks, manage budgets | tenant.admin, tenant.owner |
| analytics.looker_share | Issue Looker Studio embed tokens | tenant.admin, tenant.owner |
| analytics.budget_admin | View per-tenant byte/cost projections | tenant.owner |
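
The default-role column above can be read as a lookup from permission to granting roles. The sketch below transcribes the table into such a map; `DEFAULT_ROLES` and `hasPermission` are hypothetical names, and the real service may resolve permissions differently (for example via iam-service).

```typescript
type Permission =
  | "analytics.viewer"
  | "analytics.author"
  | "analytics.admin"
  | "analytics.looker_share"
  | "analytics.budget_admin";

// Transcribed from the permission table; not an authoritative source.
const DEFAULT_ROLES: Record<Permission, string[]> = {
  "analytics.viewer": ["staff", "manager", "tenant.admin", "tenant.owner"],
  "analytics.author": ["manager", "tenant.admin", "tenant.owner"],
  "analytics.admin": ["tenant.admin", "tenant.owner"],
  "analytics.looker_share": ["tenant.admin", "tenant.owner"],
  "analytics.budget_admin": ["tenant.owner"],
};

// True when at least one of the caller's roles grants the permission by default.
function hasPermission(roles: string[], permission: Permission): boolean {
  const granting = DEFAULT_ROLES[permission];
  return roles.some((role) => granting.indexOf(role) !== -1);
}
```
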
ABAC overlays:
- Property scope. Every widget filter is intersected with `jwt.propertyAccess[]`. Empty intersection → `MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION`.
- Dashboard scope vs property. A `scope='property'` dashboard is invisible to users without that property in scope.
- Saved query ownership. Mutations on saved queries are restricted to the owner unless the caller has `analytics.admin`.
Decisions evaluated at the application boundary and re-evaluated at the repository layer (defense-in-depth). Denials emit analytics.access_denied audit events.
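
The property-scope overlay can be illustrated with a small intersection helper. This is a sketch under assumptions: `scopeProperties` and `ScopeDecision` are hypothetical names, and the handling of a request with no property filter is not specified in this document, so the sketch simply treats it as an empty intersection.

```typescript
const PROPERTY_SCOPE_VIOLATION = "MELMASTOON.ANALYTICS.PROPERTY_SCOPE_VIOLATION";

interface ScopeDecision {
  allowed: boolean;
  propertyIds: string[]; // the effective filter after intersection
  code: string | null;   // error code when denied
}

// Intersects the widget's requested properties with jwt.propertyAccess[].
function scopeProperties(requested: string[], propertyAccess: string[]): ScopeDecision {
  const effective = requested.filter((id) => propertyAccess.indexOf(id) !== -1);
  if (effective.length === 0) {
    return { allowed: false, propertyIds: [], code: PROPERTY_SCOPE_VIOLATION };
  }
  return { allowed: true, propertyIds: effective, code: null };
}
```

A partially overlapping request is narrowed rather than rejected, so the caller only ever receives data for properties in both sets.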
3. Tenant isolation
Three independently sufficient layers:
- JWT claim. `tenantId` verified and bound to `req.ctx.tenantId`.
- Postgres RLS. Every metadata table has a `<table>_tenant_isolation` policy; the pool sets `app.tenant_id` per tx.
- BigQuery authorized views + binding-scoped UDF. Tenants and Looker Studio tokens never see raw curated tables. They access `tenant_views.<table>` views that filter by `SESSION_USER_TENANT_ID()`, a UDF that resolves `tenant_views.access_bindings`. The Query API binds `@tenant_id` from the JWT (never from the body) when calling BigQuery and uses a per-tenant principal (`looker-studio-<tenantId>@…iam.gserviceaccount.com`) for embed sessions.
No managed service account ever runs SQL against curated tables on behalf of a tenant without an authorized-view binding. analytics-service's own GSA has read access to curated tables for ETL but its calls always include tenant_id predicates and labels.
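
The rule that `@tenant_id` comes from the JWT and never from the request body could be enforced where query parameters are assembled. The sketch below builds a plain parameter object and makes no BigQuery client call; `bindTenant`, `RequestContext`, and `QueryBody` are hypothetical names. It overwrites any body-supplied `tenant_id`; rejecting such a request outright would be an equally plausible choice.

```typescript
interface RequestContext {
  tenantId: string; // taken from the verified JWT
}

interface QueryBody {
  sql: string;
  params?: Record<string, unknown>;
}

interface BoundQuery {
  query: string;
  params: Record<string, unknown>; // named parameters, e.g. @tenant_id
}

function bindTenant(ctx: RequestContext, body: QueryBody): BoundQuery {
  const params: Record<string, unknown> = {};
  const bodyParams = body.params || {};
  for (const key of Object.keys(bodyParams)) {
    // Drop any caller-supplied tenant_id; the JWT is the only source.
    if (key !== "tenant_id") {
      params[key] = bodyParams[key];
    }
  }
  params["tenant_id"] = ctx.tenantId;
  return { query: body.sql, params: params };
}
```
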
The cross-tenant integration test (test/integration/tenant-isolation.spec.ts) is mandatory: with two tenants and tenant A's JWT, every read returns zero rows of B and every write (with B's id) returns 404.
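
The shape of that check can be sketched against an in-memory stand-in. The real spec presumably runs against Postgres and BigQuery; `InMemoryDashboardRepo`, `crossTenantFailures`, and the ids used here are hypothetical and only illustrate the two assertions (no rows of B on read, 404 on write).

```typescript
interface Dashboard {
  id: string;
  tenantId: string;
  name: string;
}

// Stand-in repository: every operation is filtered by the caller's tenant.
class InMemoryDashboardRepo {
  private rows: Dashboard[] = [];

  seed(row: Dashboard): void {
    this.rows.push(row);
  }

  list(tenantId: string): Dashboard[] {
    return this.rows.filter((r) => r.tenantId === tenantId);
  }

  // Returns an HTTP-like status: 404 when the row is not visible to the tenant.
  rename(tenantId: string, id: string, name: string): number {
    const row = this.rows.filter((r) => r.id === id && r.tenantId === tenantId)[0];
    if (!row) {
      return 404;
    }
    row.name = name;
    return 200;
  }
}

// Acts as tenant A and collects any isolation failures against tenant B's row.
function crossTenantFailures(repo: InMemoryDashboardRepo, tenantA: string, tenantB: string, idOfB: string): string[] {
  const failures: string[] = [];
  if (repo.list(tenantA).some((r) => r.tenantId === tenantB)) {
    failures.push("read leaked rows of B");
  }
  if (repo.rename(tenantA, idOfB, "hijacked") !== 404) {
    failures.push("write with B's id did not return 404");
  }
  return failures;
}
```
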
4. Data classification & encryption
| Field | Classification | Treatment |
|---|---|---|
| Metric/projection definitions | Internal | TLS-only |
| Dashboards, widgets, saved queries | Tenant data | TLS-only; OCC enforced |
| Looker Studio embed tokens | Secret | Signed JWT, ≤ 60 min; never logged; revocable via tenant_views.access_bindings removal |
| Curated tables (fact_*, dim_*) | Tenant data | At-rest CMEK key projects/<p>/locations/<r>/keyRings/melmastoon-analytics/cryptoKeys/curated; TLS-only |
| Raw events (events_raw.*) | Tenant data + envelope metadata | CMEK; access restricted to ETL GSA only |
| DQ historical results | Internal | CMEK |
| Logs | Internal | PII redactor strips email/phone patterns; never include raw event payloads |
CMEK rotation is automated annually; old key versions remain decrypt-only.
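
The log-redaction row in the table above could be implemented as a pattern-based filter at the logger boundary. The sketch below is illustrative only: the regular expressions, the placeholder strings, and `redact` are assumptions, and the phone pattern is deliberately broad, so it may also mask other long digit runs.

```typescript
// Simplified patterns; a production redactor would likely be stricter.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_PATTERN = /\+?\d[\d\s().-]{7,}\d/g;

// Replaces email- and phone-shaped substrings before a line is written.
function redact(line: string): string {
  return line
    .replace(EMAIL_PATTERN, "[REDACTED_EMAIL]")
    .replace(PHONE_PATTERN, "[REDACTED_PHONE]");
}
```
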
5. Secret handling
- Source of truth: Google Secret Manager.
- Bootstrap: resolve secret resource paths at startup; never write secrets to disk.
- Rotation: 5-min refresh; hot rotation supported.
- Looker Studio embed signing key: stored in KMS asymmetric key `melmastoon-analytics-embed-signer`; we never possess the private key material directly.
- Forbidden: logging secret payloads, embedding in env vars in CI logs.
CI runs gitleaks and denies merges that touch secret-shaped strings.
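
The 5-minute refresh could be modelled as an in-memory cache that tracks when each secret was last fetched. The sketch below keeps the fetch itself out of scope (the caller would read from Secret Manager and call `put`); `SecretCache` and its methods are hypothetical names. Values live only in process memory, in line with the "never write secrets to disk" rule.

```typescript
const REFRESH_INTERVAL_MS = 5 * 60 * 1000;

interface CachedSecret {
  value: string;
  fetchedAtMs: number;
}

class SecretCache {
  private entries: { [resourcePath: string]: CachedSecret } = {};

  // Stores a freshly fetched value; replacing an entry is the "hot rotation" path.
  put(resourcePath: string, value: string, nowMs: number): void {
    this.entries[resourcePath] = { value: value, fetchedAtMs: nowMs };
  }

  get(resourcePath: string): string | undefined {
    const entry = this.lookup(resourcePath);
    return entry ? entry.value : undefined;
  }

  // True when the secret was never fetched or is at least 5 minutes old.
  isStale(resourcePath: string, nowMs: number): boolean {
    const entry = this.lookup(resourcePath);
    return !entry || nowMs - entry.fetchedAtMs >= REFRESH_INTERVAL_MS;
  }

  private lookup(resourcePath: string): CachedSecret | undefined {
    return Object.prototype.hasOwnProperty.call(this.entries, resourcePath)
      ? this.entries[resourcePath]
      : undefined;
  }
}
```

Passing the clock in as `nowMs` keeps the staleness rule deterministic under test.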
6. Audit trail
Every mutation and every denied access emits to audit-service (07 §9):
| Verb | Subject |
|---|---|
| metric.published | met_… |
| metric.archived | met_… |
| projection.published | prj_… |
| dashboard.created/updated/deleted | dsh_… |
| widget.added/updated/removed | wid_… |
| query.run | qry_… (sampled) |
| dashboard.shared | tokens issued |
| looker.token_issued | tenant + ttl |
| etl.run.completed/failed | etr_… |
| dq.alert | dqc_… |
| binding.created/revoked | (principal, tenantId) |
| access.denied | resource + permission + reason |
Audit events include actor, causedBy.correlationId, tenantId, ip, userAgent, aiProvenance?. They participate in the daily Merkle anchor.
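
A pre-emit check for the required fields listed above might look like the following sketch. The field types are assumptions (this document lists the names only), and `AuditEvent` and `missingAuditFields` are hypothetical; `aiProvenance` is treated as optional, following the `?` in the field list.

```typescript
interface AuditEvent {
  verb: string;
  subject: string;
  actor?: string;
  causedBy?: { correlationId?: string };
  tenantId?: string;
  ip?: string;
  userAgent?: string;
  aiProvenance?: string; // optional per the field list
}

// Returns the names of required fields that are absent or empty.
function missingAuditFields(e: AuditEvent): string[] {
  const missing: string[] = [];
  if (!e.actor) missing.push("actor");
  if (!e.causedBy || !e.causedBy.correlationId) missing.push("causedBy.correlationId");
  if (!e.tenantId) missing.push("tenantId");
  if (!e.ip) missing.push("ip");
  if (!e.userAgent) missing.push("userAgent");
  return missing;
}
```
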
7. AI safety
- All AI calls flow through `ai-orchestrator-service` (AI_INTEGRATION §1-§3).
- Prompts never include guest PII; only aggregated facts.
- Outputs that influence widgets carry an "AI-generated, review before sharing" badge.
- Tenant admin can disable any analytics AI capability; the off-switch is enforced server-side.
- AI-drafted widgets require human confirmation before publication.
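
The off-switch, badge, and confirmation rules above could be expressed as server-side guards. This is a sketch under assumptions: the settings and draft shapes, the capability name used in the usage note, and all function names are hypothetical.

```typescript
interface TenantAiSettings {
  disabledCapabilities: string[]; // capabilities the tenant admin switched off
}

interface WidgetDraft {
  aiGenerated: boolean;
  confirmedByUserId: string | null; // set once a human confirms the draft
}

const AI_BADGE = "AI-generated, review before sharing";

// Checked on the server before any call is forwarded to the orchestrator.
function isCapabilityEnabled(settings: TenantAiSettings, capability: string): boolean {
  return settings.disabledCapabilities.indexOf(capability) === -1;
}

// AI-drafted widgets need a human confirmation; manual drafts do not.
function canPublish(draft: WidgetDraft): boolean {
  return !draft.aiGenerated || draft.confirmedByUserId !== null;
}

function badgeFor(draft: WidgetDraft): string | null {
  return draft.aiGenerated ? AI_BADGE : null;
}
```

For example, with `disabledCapabilities: ["widget_drafting"]` (an invented capability name), `isCapabilityEnabled(settings, "widget_drafting")` is false regardless of what the client sends.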
8. Network & egress
- All inbound traffic via Cloud Run "internal-and-cloud-load-balancing"; public path only via the API gateway.
- Egress to external providers limited to Looker Studio (Google) and `ai-orchestrator-service` (internal).
- Workload Identity bindings enforce GSA-only access to BigQuery.
9. Threat model (focused)
| Threat | Mitigation |
|---|---|
| Cross-tenant query via crafted SQL | JWT claim binding + authorized views + UDF tenant resolution + linter forbids tenant_id overrides |
| Looker Studio token replay after revocation | Tokens short-lived; binding revocation immediate; embed sessions re-validated per page load |
| Rogue saved query exfiltrating data via parameter injection | Saved queries parsed at save-time; only parameter binding, no string concatenation; allowlist of datasets/tables |
| BigQuery slot exhaustion (DoS) | Reservation pricing; per-tenant byte budget; slot autoscale ceiling |
| Schema drift breaks dashboards | _schema_version pin + DQ schema-drift check; coexistence v1 → v2 |
| ETL job replay double-write | MERGE keys idempotent; per-projection lock; outbox dedupe |
| Forecast writeback poisoning | Validate envelope tenant; per-row tenant check; MELMASTOON.ANALYTICS.FORECAST_INVALID_TENANT on mismatch |
| Pub/Sub spoofing | OIDC verification + audience pin |
| Embed token signing key leak | KMS asymmetric key; key never leaves KMS; rotation runbook |
| Access-binding drift in tenant_views.access_bindings | Daily reconciliation against iam-service truth; alert on diff |
| Cost runaway via abusive dashboards | Daily byte budget per tenant; auto-pause snapshot generators when 80 % spent |
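
The 80 % auto-pause rule in the last row reduces to a threshold comparison. The sketch below uses integer arithmetic to avoid floating-point edge cases at the boundary; `shouldPauseSnapshots` and its parameter names are hypothetical, and it assumes "80 % spent" means at or above 80 % of the daily byte budget.

```typescript
const PAUSE_THRESHOLD_PERCENT = 80;

// True once the tenant has consumed at least 80 % of today's byte budget.
// A zero budget pauses immediately, since any spend meets the threshold.
function shouldPauseSnapshots(bytesSpentToday: number, dailyByteBudget: number): boolean {
  return bytesSpentToday * 100 >= dailyByteBudget * PAUSE_THRESHOLD_PERCENT;
}
```
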
10. Compliance
- GDPR / DPDP / KSA PDPL: PII inventory tracked in docs/07 §11; raw events with PII have residency-aware retention.
- Data residency: Cloud SQL and BigQuery datasets are regional; deployment topology fans out per residency. Cross-region replication is forbidden.
- Right to erasure: receives `melmastoon.tenant.deleted.v1`; `PurgeTenantUseCase` drops authorized view bindings, deletes operational rows, anonymizes regulated rows.
- Logging vs PII: strict redaction at the logger level + tests asserting no raw payload fields appear in stdout.
Cross-references: DATA_MODEL §3 RLS, API_CONTRACTS §0, docs/07.