SERVICE_RISK_REGISTER — staff-service
Sibling: SERVICE_READINESS · FAILURE_MODES · SECURITY_MODEL · AI_INTEGRATION
Living register of risks specific to staff-service. Each entry has a category, likelihood × impact heat, mitigation in place, residual risk, owner, and a review cadence. Reviewed quarterly by the service tech lead and platform architect.
Heat scale: L (low) / M (medium) / H (high) / C (critical) for each axis. Score = max(Likelihood, Impact), i.e. the higher of the two levels.
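
The scoring rule can be sketched in a few lines. This is an illustrative helper, not code from the service; the names `HEAT_ORDER` and `score` are assumptions.

```python
# Ordered heat scale from the register: L < M < H < C.
HEAT_ORDER = ["L", "M", "H", "C"]


def score(likelihood: str, impact: str) -> str:
    """Return the higher of the two heat levels on the L/M/H/C scale."""
    return max(likelihood, impact, key=HEAT_ORDER.index)


if __name__ == "__main__":
    print(score("L", "C"))  # R-01: likelihood L, impact C
    print(score("M", "L"))  # R-10: likelihood M, impact L
```

Comparing by position in `HEAT_ORDER` rather than by the letters themselves avoids the alphabetical ordering of "C", "H", "L", "M", which would rank the levels incorrectly.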
1. Domain & Data Risks
R-01 Cross-tenant data leak
| Field | Value |
|---|---|
| Category | Security · Multi-tenancy |
| Likelihood | L |
| Impact | C |
| Score | C |
| Description | A bug or RLS bypass leaks staff records, PINs, or schedules across tenants |
| Mitigation | Mandatory RLS on every table; CI sweep rls-isolation.spec.ts; service-account never bypasses RLS without audit; SQL reviews block raw queries |
| Residual risk | L |
| Owner | Service tech lead |
| Review cadence | Quarterly |
R-02 PIN brute-force
| Field | Value |
|---|---|
| Category | Security · Authentication |
| Likelihood | M |
| Impact | H |
| Score | H |
| Description | Attacker at the front desk attempts to guess PINs to clock in/out as another staff member |
| Mitigation | Per-staff lockout of 15 min after 5 failures; per-device limit of 1 attempt per 2 s; per-property limit of 30 attempts/min; HMAC with KMS pepper |
| Residual risk | L |
| Owner | Security architect |
| Review cadence | Quarterly |
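
A minimal sketch of the per-staff lockout and the peppered HMAC comparison, assuming an in-memory counter and a pepper passed in as bytes. The class `StaffPinGuard` and the function `pin_digest` are hypothetical names. The real service would fetch the pepper from KMS and persist counters, and the per-device and per-property limits are not shown.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_FAILURES = 5            # from the register: lockout after 5 failures
LOCKOUT_SECONDS = 15 * 60   # from the register: 15-minute lockout


def pin_digest(pin: str, pepper: bytes) -> bytes:
    """HMAC-SHA256 of the PIN, keyed with the pepper (assumed to come from KMS)."""
    return hmac.new(pepper, pin.encode("utf-8"), hashlib.sha256).digest()


class StaffPinGuard:
    """Hypothetical in-memory failure counter and lockout for one staff member."""

    def __init__(self, stored_digest: bytes, pepper: bytes) -> None:
        self._stored = stored_digest
        self._pepper = pepper
        self._failures = 0
        self._locked_until = 0.0

    def verify(self, pin: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if now < self._locked_until:
            return False  # locked out: reject without checking the PIN
        candidate = pin_digest(pin, self._pepper)
        if hmac.compare_digest(candidate, self._stored):
            self._failures = 0
            return True
        self._failures += 1
        if self._failures >= MAX_FAILURES:
            # Start the lockout window and reset the counter for the next window.
            self._locked_until = now + LOCKOUT_SECONDS
            self._failures = 0
        return False
```

`hmac.compare_digest` is used for the comparison so that timing does not reveal how many leading bytes matched.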
R-03 PII exposure (emergency_contact, languages, certifications)
| Field | Value |
|---|---|
| Category | Privacy · Compliance |
| Likelihood | M |
| Impact | H |
| Score | H |
| Description | Operators read PII without business need |
| Mitigation | Field-level envelope encryption for emergency_contact; capability gate staff.read_pii; mandatory audit row on every read; quarterly access review |
| Residual risk | M |
| Owner | Security architect |
| Review cadence | Quarterly |
R-04 Audit log tampering
| Field | Value |
|---|---|
| Category | Compliance |
| Likelihood | L |
| Impact | H |
| Score | H |
| Description | Operator or rogue process attempts to mutate or delete audit_events |
| Mitigation | Append-only DB triggers; admin role required for legitimate updates (and audited); BigQuery cold copy beyond service reach |
| Residual risk | L |
| Owner | Security architect |
| Review cadence | Quarterly |
2. Operational Risks
R-05 IAM revoke cascade fails on termination
| Field | Value |
|---|---|
| Category | Security · Operational |
| Likelihood | L |
| Impact | H |
| Score | H |
| Description | Terminated staff retains active iam session and can still authenticate to other services |
| Mitigation | Per APPLICATION_LOGIC §3.5: saga-style retry; alert on iam.session.revoke.failure_count; runbook for manual intervention |
| Residual risk | L |
| Owner | SRE |
| Review cadence | Quarterly |
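
The retry portion of this mitigation might look like the sketch below. It assumes a caller-supplied `revoke` callable and exponential backoff; neither is confirmed by APPLICATION_LOGIC §3.5. The returned failure count stands in for whatever feeds `iam.session.revoke.failure_count`.

```python
import time
from typing import Callable, Tuple


def revoke_with_retry(
    revoke: Callable[[], None],
    max_attempts: int = 5,          # assumed limit, not taken from the source
    base_delay_s: float = 0.5,      # assumed backoff base
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Call revoke() until it succeeds or attempts run out.

    Returns (succeeded, failure_count). A False result is the point at
    which the manual-intervention runbook would apply.
    """
    failures = 0
    for attempt in range(max_attempts):
        try:
            revoke()
            return True, failures
        except Exception:
            failures += 1
            if attempt < max_attempts - 1:
                sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    return False, failures
```

Passing `sleep` as a parameter keeps the function testable without real delays.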
R-06 Outbox stalls causing assignment-service starvation
| Field | Value |
|---|---|
| Category | Reliability |
| Likelihood | M |
| Impact | M |
| Score | M |
| Description | Capacity events (clock.in/out) delayed; downstream auto-assignment stalls |
| Mitigation | Outbox depth alert P1; auto-scale workers; circuit breaker around publisher; idempotent re-publish |
| Residual risk | L |
| Owner | SRE |
| Review cadence | Quarterly |
R-07 Late-replay punches distorting attendance reports
| Field | Value |
|---|---|
| Category | Data integrity |
| Likelihood | M |
| Impact | M |
| Score | M |
| Description | Electron device offline > 24 h syncs old punches into the schedule grid; reports drift |
| Mitigation | flagged_late_replay flag retained on the entry; reports exclude flagged entries by default; UI surfaces a banner |
| Residual risk | M |
| Owner | Service tech lead |
| Review cadence | Quarterly |
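
One way to picture the flag and the default report filter is sketched below. It assumes the flag is derived from the gap between punch time and sync time. The `Punch` fields and `report_rows` are illustrative names, not the service's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

LATE_REPLAY_THRESHOLD = timedelta(hours=24)  # from the register: offline > 24 h


@dataclass
class Punch:
    staff_id: str
    punched_at: datetime   # when the punch happened on the device
    synced_at: datetime    # when the device delivered it to the service

    @property
    def flagged_late_replay(self) -> bool:
        """True when the punch reached the service more than 24 h late."""
        return self.synced_at - self.punched_at > LATE_REPLAY_THRESHOLD


def report_rows(punches: List[Punch], include_late_replay: bool = False) -> List[Punch]:
    """Leave out late-replay punches unless the caller opts in."""
    return [p for p in punches if include_late_replay or not p.flagged_late_replay]
```

Keeping the flag on the entry, rather than dropping the punch, lets a report include late replays when a reviewer asks for them.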
R-08 Schedule generation thrash on retries
| Field | Value |
|---|---|
| Category | Reliability |
| Likelihood | L |
| Impact | L |
| Score | L |
| Description | Retried GenerateShifts produces duplicate shifts |
| Mitigation | Idempotency-Key + EXCLUDE constraint |
| Residual risk | L |
| Owner | Service tech lead |
| Review cadence | Annually |
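
The Idempotency-Key half of this mitigation can be illustrated with an in-memory sketch. `ShiftGenerator` and its method are hypothetical. The database-level EXCLUDE constraint is a separate safety net and is not modelled here.

```python
from typing import Dict, List


class ShiftGenerator:
    """Hypothetical handler that replays the stored result for a repeated key."""

    def __init__(self) -> None:
        self._results: Dict[str, List[dict]] = {}
        self._next_id = 1

    def generate_shifts(self, idempotency_key: str, staff_ids: List[str]) -> List[dict]:
        # A retried request with the same key gets the original result back
        # instead of creating a second set of shifts.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        shifts = []
        for staff_id in staff_ids:
            shifts.append({"shift_id": self._next_id, "staff_id": staff_id})
            self._next_id += 1
        self._results[idempotency_key] = shifts
        return shifts
```

A retry with the same key returns the same shift IDs, so the caller cannot tell a replay from the first response.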
3. Domain & Product Risks
R-09 Informal staffing not modeled cleanly
| Field | Value |
|---|---|
| Category | Domain fit |
| Likelihood | M |
| Impact | M |
| Score | M |
| Description | Owners hire relatives/temp staff during peak who lack ID papers; data model can't capture this without bloat |
| Mitigation | Staff.employmentType ∈ {temporary, casual}; staff_id_external optional; PIN-only allowed (no email) |
| Residual risk | L |
| Owner | Product owner |
| Review cadence | Quarterly |
R-10 Multilingual staff names cause sort/search bugs
| Field | Value |
|---|---|
| Category | UX |
| Likelihood | M |
| Impact | L |
| Score | M |
| Description | RTL (ps/fa/ar) names mixed with LTR; sort, search, and PDF export look wrong |
| Mitigation | LocalizedLabel with explicit script tag; collation-aware indexes (COLLATE und-x-icu) |
| Residual risk | L |
| Owner | Service tech lead |
| Review cadence | Annually |
R-11 Manager override abused
| Field | Value |
|---|---|
| Category | Compliance · Domain |
| Likelihood | M |
| Impact | M |
| Score | M |
| Description | Managers retroactively punch staff in/out to inflate or deflate hours |
| Mitigation | Mandatory reason; audit row per override; weekly per-actor anomaly alert; report visible to GM |
| Residual risk | M |
| Owner | Compliance lead |
| Review cadence | Quarterly |
R-12 Scope creep into payroll
| Field | Value |
|---|---|
| Category | Architectural |
| Likelihood | M |
| Impact | H |
| Score | H |
| Description | Stakeholders push for tax brackets, payslips, statutory deductions inside staff-service |
| Mitigation | Bounded context ratified; SERVICE_OVERVIEW §4 Out-of-Scope; ADR required to expand |
| Residual risk | M |
| Owner | Service tech lead |
| Review cadence | Quarterly |
4. AI Risks
R-13 Edge anomaly model false positives causing operator backlash
| Field | Value |
|---|---|
| Category | AI · UX |
| Likelihood | M |
| Impact | M |
| Score | M |
| Description | Front-desk Electron flags too many legit punches; operators bypass the warning routinely |
| Mitigation | False-positive rate (FPR) baseline < 8 %; alert when FPR > 12 %; channel-pin to last-known-good model; operator can dismiss without manager |
| Residual risk | L |
| Owner | AI lead |
| Review cadence | Quarterly |
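
The two thresholds could be checked as sketched below. The function names and the intermediate "watch" status are assumptions for illustration; the register only defines the 8 % baseline and the 12 % alert level.

```python
FPR_BASELINE = 0.08  # from the register: baseline below 8 %
FPR_ALERT = 0.12     # from the register: alert above 12 %


def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of legitimate punches that the model flagged."""
    legit = false_positives + true_negatives
    if legit == 0:
        return 0.0
    return false_positives / legit


def fpr_status(fpr: float) -> str:
    """Map an FPR to 'ok', 'watch' (hypothetical middle band), or 'alert'."""
    if fpr > FPR_ALERT:
        return "alert"
    if fpr < FPR_BASELINE:
        return "ok"
    return "watch"
```

An "alert" result would be the trigger for pinning the channel back to the last-known-good model.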
R-14 Bias in shift-suggestion model
| Field | Value |
|---|---|
| Category | AI · Compliance |
| Likelihood | L |
| Impact | H |
| Score | H |
| Description | Suggestions consistently disadvantage one demographic (gender, language, age proxy) |
| Mitigation | Bias audit quarterly; advisory-only; manager confirms every suggestion; weekly fairness report |
| Residual risk | L |
| Owner | AI lead |
| Review cadence | Quarterly |
5. Compliance & Regulatory
R-15 Data residency violation
| Field | Value |
|---|---|
| Category | Compliance |
| Likelihood | L |
| Impact | C |
| Score | C |
| Description | Tenant data leaves the contractually-bound region |
| Mitigation | Region-pinned Cloud SQL + KMS keyring; outbound to AI orchestrator stays in-region; DSAR exports respect region |
| Residual risk | L |
| Owner | Compliance lead |
| Review cadence | Quarterly |
R-16 DSAR export misses SLA
| Field | Value |
|---|---|
| Category | Compliance |
| Likelihood | L |
| Impact | M |
| Score | M |
| Description | Staff DSAR not delivered within 30 days |
| Mitigation | DSAR ticket with auto-deadline; alert 24 h before deadline; benchmarked job < 1 h |
| Residual risk | L |
| Owner | Privacy officer |
| Review cadence | Quarterly |
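
The auto-deadline and the alert time follow directly from the 30-day SLA and the 24 h lead. The sketch below assumes the clock starts at the moment the request is received; `dsar_schedule` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

DSAR_SLA = timedelta(days=30)     # from the register: deliver within 30 days
ALERT_LEAD = timedelta(hours=24)  # from the register: alert 24 h before deadline


def dsar_schedule(received_at: datetime) -> Tuple[datetime, datetime]:
    """Return (deadline, alert_at) for a DSAR received at the given time."""
    deadline = received_at + DSAR_SLA
    return deadline, deadline - ALERT_LEAD


if __name__ == "__main__":
    received = datetime(2024, 3, 1, tzinfo=timezone.utc)
    deadline, alert_at = dsar_schedule(received)
    print(deadline.isoformat(), alert_at.isoformat())
```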
6. Heat Map (current quarter)
| Likelihood ↓ / Impact → | L | M | H | C |
|---|---|---|---|---|
| L | R-08 | | R-05, R-14 | R-15 |
| M | R-10 | R-06, R-07, R-09, R-11, R-13 | R-02, R-03, R-12 | |
| H | | | | |
| C | | | | |
(R-01, R-04, R-16 plotted with their post-mitigation residual values; underlying inherent heat documented in their tables.)
7. Newly-Introduced Risks Process
When a new risk is identified (incident, code review, audit), the discoverer files a PR adding a row here. The service tech lead triages within 5 business days and either:
- Accepts (mitigation defined, owner assigned, review cadence set)
- Escalates to platform-level risk register
- Closes with rationale (not applicable, accepted as-is, mitigated by upstream change)
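
The 5-business-day triage window can be computed as sketched below. The sketch assumes a Monday to Friday working week and ignores public holidays. `triage_due` and its `weekend` parameter are hypothetical, and the weekend set may need changing for the deployment region.

```python
from datetime import date, timedelta
from typing import FrozenSet

# Assumption: Saturday (5) and Sunday (6) are non-working days.
DEFAULT_WEEKEND: FrozenSet[int] = frozenset({5, 6})


def triage_due(
    filed_on: date,
    business_days: int = 5,
    weekend: FrozenSet[int] = DEFAULT_WEEKEND,
) -> date:
    """Return the date by which a newly filed risk should be triaged."""
    due = filed_on
    remaining = business_days
    while remaining > 0:
        due += timedelta(days=1)
        if due.weekday() not in weekend:
            remaining -= 1  # count only working days
    return due
```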
The register is part of the quarterly platform risk review and the basis of any production-incident postmortem's "could we have known?" section.