SERVICE_RISK_REGISTER — reporting-service

Sibling: FAILURE_MODES · SERVICE_READINESS · MIGRATION_PLAN

Risks are scored on Likelihood (L) and Impact (I), each 1-5; Score = L × I. Owners and review cadence are explicit. The register is reviewed each sprint and after every P1.
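The scoring rule is simple enough to check mechanically. The following is a minimal sketch, assuming each register row is available as a record with `likelihood`, `impact`, and `score` fields. Those field names, `RiskRow`, and `score_errors` are illustrative, since the register does not define a schema. The sketch flags rows whose L or I falls outside 1-5 or whose recorded Score is not L × I.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskRow:
    """One register row; field names are illustrative, not a defined schema."""
    risk_id: str
    likelihood: int  # L, expected 1-5
    impact: int      # I, expected 1-5
    score: int       # value recorded in the Score column


def score_errors(rows):
    """Return human-readable problems; an empty list means every row is consistent."""
    problems = []
    for row in rows:
        for name, value in (("L", row.likelihood), ("I", row.impact)):
            if not 1 <= value <= 5:
                problems.append(f"{row.risk_id}: {name}={value} is outside 1-5")
        expected = row.likelihood * row.impact
        if row.score != expected:
            problems.append(f"{row.risk_id}: Score={row.score}, expected {expected}")
    return problems


rows = [
    RiskRow("R-REP-001", 2, 5, 10),  # values taken from the register
    RiskRow("R-REP-003", 4, 3, 12),  # values taken from the register
    RiskRow("R-REP-XXX", 3, 3, 8),   # deliberately inconsistent, for illustration only
]
for problem in score_errors(rows):
    print(problem)
```

A check like this could run in CI whenever the register changes, so that a rescored likelihood is never committed with a stale Score.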


1. Active risks

| ID | Risk | L | I | Score | Owner | Mitigation in flight | Trigger / Detection | Last review |
|---|---|---|---|---|---|---|---|---|
| R-REP-001 | Regulatory submission missed past statutory cutoff | 2 | 5 | 10 | Reporting TL | Cron monitor + P1 alert; manual escalation runbook; per-jurisdiction grace window pre-cutoff alert | next_attempt_at < cutoff query | 2026-04-15 |
| R-REP-002 | Cross-tenant data exposure in rendered PDF (filter bug) | 1 | 5 | 5 | Security guild | RLS + integration tests + property-based filter tests; renderer asserts every rendered row matches tenantId | Integration test + access-denied audit spike | 2026-04-15 |
| R-REP-003 | BigQuery slot exhaustion at month-end | 4 | 3 | 12 | Platform SRE | Reservation pricing for analytics; per-template rowCap; smoothing of schedules across the hour; opportunistic queue draining | reporting.upstream_analytics_timeout rate spike | 2026-04-12 |
| R-REP-004 | Renderer Chromium CVE forces emergency upgrade | 3 | 3 | 9 | Reporting TL | Pinned digest image rebuilt nightly; CVE feed subscription; canary on revision before promote | Security advisory + image scanner | 2026-04-08 |
| R-REP-005 | Template versioning conflict during scheduled run | 2 | 4 | 8 | Reporting TL | Snapshot template version at run-start; runs carry templateVersionId; never mutate published versions | Saga test + production audit | 2026-04-15 |
| R-REP-006 | Subscription email exfiltration to attacker domain | 2 | 4 | 8 | Security guild | Per-tenant allow-list of recipient domains; verification step on first email; admin approval for new domains | Recipient domain audit | 2026-04-08 |
| R-REP-007 | OOM during render of pathological dataset | 3 | 3 | 9 | Reporting TL | Streaming + pagination; per-template row caps; Puppeteer page-count cap; pod resource limits | OOM kill metric | 2026-04-15 |
| R-REP-008 | AI hallucination in callout misleads operator | 2 | 4 | 8 | AI guild + Reporting TL | "AI-generated, review before sharing" badge; HITL for drafted templates; callouts must cite fact ids physically present | User report + sample audit | 2026-04-08 |
| R-REP-009 | Cloud Scheduler regional outage misses fires | 2 | 4 | 8 | Platform SRE | Standby Scheduler in alternate region same residency; backfill on recovery; alert on schedule drift | reporting_schedule_drift_seconds p95 | 2026-04-12 |
| R-REP-010 | Object lock misconfiguration allows premature delete | 1 | 5 | 5 | Platform SRE | IaC enforces lock; service account lacks delete; weekly drift scan | IaC drift alert | 2026-04-12 |
| R-REP-011 | Adapter credentials leaked via logs | 1 | 5 | 5 | Security guild | Secret Manager only; logger redactor; gitleaks CI; periodic adapter test in dev | Log scan | 2026-04-08 |
| R-REP-012 | Tenant deletion cascade leaves orphaned regulatory artifacts | 2 | 3 | 6 | Compliance | Anonymize-not-delete for regulatory bucket; explicit hold flag; legal review | Synthetic tenant deletion | 2026-04-15 |
| R-REP-013 | Locale rendering bug breaks RTL layout (Pashto/Arabic) | 3 | 3 | 9 | Frontend + Reporting | Per-locale golden tests; bidi font fallback; visual diff on PR | Golden snapshot diff | 2026-04-08 |
| R-REP-014 | Outbox publisher lag during Pub/Sub incident | 3 | 3 | 9 | Platform SRE | Bounded backlog; pod-side metric; auto-scale up; replay tools | reporting_outbox_lag_seconds | 2026-04-12 |
| R-REP-015 | Long-running Cloud SQL transactions cause lock contention | 3 | 3 | 9 | Reporting TL | Use cases keep tx short; offload heavy reads to read replicas; query timeout | pg_locks watch | 2026-04-15 |
| R-REP-016 | Sync engine pulls stale subscription causing missed delivery | 2 | 3 | 6 | Desktop platform | WebSocket nudge on report.completed.v1; pull on app foreground; cursor signed | Synthetic delivery | 2026-04-08 |
| R-REP-017 | Operator schedules collide and exceed worker pool | 3 | 2 | 6 | Reporting TL | Token-bucket per tenant; cluster-wide soft cap; MELMASTOON.REPORTING.SCHEDULE_RATE_LIMITED graceful back-off | Saturation alert | 2026-04-12 |
| R-REP-018 | Migration drift between regions during canary | 2 | 4 | 8 | Platform SRE | Forward-only additive migrations; deploy gate runs migration first | Cloud Deploy step status | 2026-04-08 |
| R-REP-019 | Excessive AI cost in a tenant burns shared budget | 3 | 2 | 6 | AI guild | Per-tenant budget caps; orchestrator returns BUDGET_EXHAUSTED; alert tenant.owner | Cost counter | 2026-04-08 |
| R-REP-020 | Data residency violation via wrong regional bucket | 1 | 5 | 5 | Platform SRE | IaC pins bucket per region; runtime asserts bucket residency vs tenant | Drift check | 2026-04-12 |
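As an illustration of one mitigation listed above, the per-tenant token bucket for R-REP-017 could look roughly like the sketch below. It is not the service's implementation. The class names, the `rate` and `capacity` parameters, and the choice to return the error code as a string are assumptions. Only the code `MELMASTOON.REPORTING.SCHEDULE_RATE_LIMITED` comes from the register. The cluster-wide soft cap is not modelled.

```python
import time


class TokenBucket:
    """Classic token bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self._now = now
        self._last = now()

    def try_acquire(self, cost=1.0):
        """Refill for the elapsed time, then spend `cost` tokens if enough are available."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantScheduleLimiter:
    """Keeps one bucket per tenant so a busy tenant cannot starve the others."""

    RATE_LIMITED = "MELMASTOON.REPORTING.SCHEDULE_RATE_LIMITED"  # code named in the register

    def __init__(self, rate, capacity, now=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._now = now
        self._buckets = {}

    def admit(self, tenant_id):
        """Return "ADMITTED" or the rate-limit code; the caller is expected to back off."""
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._now)
            self._buckets[tenant_id] = bucket
        return "ADMITTED" if bucket.try_acquire() else self.RATE_LIMITED


# Usage with hypothetical limits: bursts of 2 scheduled runs, refilled at 1 run per second.
limiter = TenantScheduleLimiter(rate=1.0, capacity=2)
print([limiter.admit("tenant-a") for _ in range(3)])
```

Injecting the clock through `now` keeps the limiter deterministic under test, which matters if the saturation alert is validated with synthetic load.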

2. Closed risks

| ID | Risk | Closed when | Closure note |
|---|---|---|---|
| R-REP-C001 | Single-region deployment did not meet residency obligations | 2026-03 | Multi-region per-residency stack shipped; per-region buckets and CMEK |

3. Risk treatment matrix

| Score | Treatment |
|---|---|
| 15-25 | Mitigate now: dedicated workstream, weekly executive review |
| 9-14 | Mitigate next: roadmap item, sprint owner, monthly review |
| 4-8 | Monitor: documented mitigations, quarterly review |
| 1-3 | Accept with light review |
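The matrix above reduces to a range lookup on the score. A minimal sketch follows. The band boundaries and labels are taken from the matrix, while `TREATMENT_BANDS` and `treatment_for` are hypothetical names.

```python
# (lowest score, highest score, treatment), copied from the treatment matrix.
TREATMENT_BANDS = [
    (15, 25, "Mitigate now"),
    (9, 14, "Mitigate next"),
    (4, 8, "Monitor"),
    (1, 3, "Accept with light review"),
]


def treatment_for(score):
    """Map a Score (L x I, so 1-25) to its treatment band."""
    for low, high, label in TREATMENT_BANDS:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} is outside the 1-25 range")


# Scores taken from the active-risk table: R-REP-003 = 12, R-REP-002 = 5.
print(treatment_for(12))
print(treatment_for(5))
```

Under these bands, none of the active risks currently falls into "Mitigate now"; the highest score in the register is 12 (R-REP-003).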

4. Heat map

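Risk IDs are abbreviated (R-003 = R-REP-003). Cell placement follows the L and I values in section 1.

|     | I=1 | I=2 | I=3 | I=4 | I=5 |
|---|---|---|---|---|---|
| L=5 | · | · | · | · | · |
| L=4 | · | · | R-003 | · | · |
| L=3 | · | R-017, R-019 | R-004, R-007, R-013, R-014, R-015 | · | · |
| L=2 | · | · | R-012, R-016 | R-005, R-006, R-008, R-009, R-018 | R-001 |
| L=1 | · | · | · | · | R-002, R-010, R-011, R-020 |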
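The heat map can be derived from the register rather than maintained by hand, which avoids the two drifting apart. The sketch below groups risks by their (L, I) cell. The L and I values are copied from the active-risk table in section 1. `REGISTER`, `build_heat_map`, and `render` are hypothetical names.

```python
from collections import defaultdict

# Abbreviated risk ID -> (L, I), copied from the active-risk table.
REGISTER = {
    "R-001": (2, 5), "R-002": (1, 5), "R-003": (4, 3), "R-004": (3, 3),
    "R-005": (2, 4), "R-006": (2, 4), "R-007": (3, 3), "R-008": (2, 4),
    "R-009": (2, 4), "R-010": (1, 5), "R-011": (1, 5), "R-012": (2, 3),
    "R-013": (3, 3), "R-014": (3, 3), "R-015": (3, 3), "R-016": (2, 3),
    "R-017": (3, 2), "R-018": (2, 4), "R-019": (3, 2), "R-020": (1, 5),
}


def build_heat_map(register):
    """Group risk IDs by (likelihood, impact) cell, with IDs sorted inside each cell."""
    cells = defaultdict(list)
    for risk_id, cell in sorted(register.items()):
        cells[cell].append(risk_id)
    return dict(cells)


def render(cells):
    """List the occupied cells from highest likelihood to lowest, one per line."""
    lines = []
    for likelihood in range(5, 0, -1):
        for impact in range(1, 6):
            ids = cells.get((likelihood, impact))
            if ids:
                lines.append(f"L={likelihood} I={impact}: {', '.join(ids)}")
    return "\n".join(lines)


print(render(build_heat_map(REGISTER)))
```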

5. Review cadence & ownership

  • Sprint review (every 2 weeks): Reporting TL chairs; revisits scores; closes resolved risks.
  • Post-incident: any P1 triggers a same-day risk review; new risks added; existing risk likelihoods recalibrated.
  • Quarterly compliance review: Compliance & legal sign off on R-REP-001, R-REP-006, R-REP-010, R-REP-012, R-REP-020.

Cross-references: FAILURE_MODES, SECURITY_MODEL, SERVICE_READINESS.