SERVICE_RISK_REGISTER — lock-integration-service
Bundle: SERVICE_OVERVIEW · SECURITY_MODEL · FAILURE_MODES · SERVICE_READINESS · MIGRATION_PLAN
Cross-cutting: docs/02 §13 Resilience, docs/architecture/ADR-0004 §Risks.
Living document. Reviewed quarterly by the service owner + security lead. Each risk has a likelihood rating and an impact rating (each on a 1–5 scale), a mitigation owner, and a status.
1. Risk scoring
| Score | Likelihood | Impact |
|---|---|---|
| 1 | Very rare | Negligible (no guest impact, easy recovery) |
| 2 | Rare | Single-property minor (one shift) |
| 3 | Possible | Multi-property minor or single-property severe |
| 4 | Likely | Tenant-wide outage or sustained brand damage |
| 5 | Almost certain | Multi-tenant catastrophic, regulatory exposure |
Risk score = likelihood × impact (range 1–25). Any risk scoring ≥ 12 requires an explicit owner and a mitigation milestone in the next quarter.
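The scoring rule can be sketched in a few lines. This is a minimal illustration, not part of the service: the Risk class, the needs_milestone function, and the OWNERSHIP_THRESHOLD constant are hypothetical names; only the formula and the ≥ 12 threshold come from this register.

```python
from dataclasses import dataclass

# Threshold from this register: a score at or above it needs an explicit
# owner and a mitigation milestone in the next quarter.
OWNERSHIP_THRESHOLD = 12


@dataclass(frozen=True)
class Risk:
    """Hypothetical in-memory shape of one register row."""
    risk_id: str
    likelihood: int  # 1-5, per the scoring table
    impact: int      # 1-5, per the scoring table

    def __post_init__(self) -> None:
        # Reject ratings outside the 1-5 scale used by the register.
        for name in ("likelihood", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be in 1..5, got {value}")

    @property
    def score(self) -> int:
        # Risk score = likelihood x impact.
        return self.likelihood * self.impact


def needs_milestone(risk: Risk) -> bool:
    """True when the score reaches the ownership/milestone threshold."""
    return risk.score >= OWNERSHIP_THRESHOLD
```

For example, a risk rated likelihood 3 and impact 4 scores 12 and needs a milestone, while one rated likelihood 2 and impact 5 scores 10 and does not.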
2. Register
| ID | Risk | Likelihood | Impact | Score | Status | Owner | Mitigation |
|---|---|---|---|---|---|---|---|
| R-LOCK-01 | Vendor cloud sustained outage (TTLock/Salto/Vostio API) prevents new key issuance for hours | 3 | 4 | 12 | mitigating | Lock domain lead | Per-vendor circuit breaker + Electron offline issuance fallback (GA-Offline tier); tenant comms playbook; multi-vendor encouraged at chain operators |
| R-LOCK-02 | Vendor unilaterally breaks API contract (deprecation, undocumented behavior change) | 3 | 3 | 9 | monitoring | Lock domain lead | Adapter contract suite runs nightly against sandboxes; Jira-tracked; vendor relationship contacts; vendor_adapters.config_jsonb allows hot-patch flag toggles |
| R-LOCK-03 | Vendor credential leak (insider, repo leak, dependency compromise) | 2 | 5 | 10 | mitigating | Security lead | Per-tenant CMEK, restricted IAM, secret-scan CI, periodic rotation, no plaintext in DB; access audit-logged & alerted on out-of-pattern access |
| R-LOCK-04 | Webhook spoofing leads to forged credential state mutations | 2 | 5 | 10 | mitigating | Security lead | Signature verification per vendor + replay dedup + rate-limit + Cloud Armor; alert on webhook_signature_failed_total spikes |
| R-LOCK-05 | Stolen/lost Electron desktop with active offline-issuance cert → forged keys | 2 | 4 | 8 | mitigating | Security lead | Device-bound Ed25519 key in OS keychain; CRL on unbind; cap on maxIssuances + validUntil ≤ 14d; HITL review of all credentials issued in last cert window post-incident |
| R-LOCK-06 | Master key off-shift abuse (insider) | 3 | 4 | 12 | mitigating | Lock domain lead + Security | Time-bound shift binding; anomaly scoring (HITL); audit retention 7y; per-tenant policy can auto-suspend at score ≥ 0.95 |
| R-LOCK-07 | Audit log tampering (DB-level UPDATE/DELETE) | 1 | 5 | 5 | mitigating | Security lead | App role lacks UPDATE/DELETE on lock_audit/key_credential_attempts; daily Merkle anchoring; mismatch alert is P1 |
| R-LOCK-08 | Two staff issue keys for the same room concurrently → conflicting active credentials | 3 | 3 | 9 | mitigated | Lock domain lead | Postgres advisory lock per (propertyId, roomId); reservation race tested (F7 in FAILURE_MODES) |
| R-LOCK-09 | Time skew on lock device causes "valid-window" credential to be rejected at the door | 4 | 3 | 12 | mitigating | Lock domain lead | Vendor-side time-sync commands; device.health_alert.v1 with clock_drift; window padding (5 min) on valid_from; field maintenance loop |
| R-LOCK-10 | Mobile-key push delivery fails silently (notification-service or carrier issue) | 3 | 3 | 9 | mitigating | Notification service owner | Delivery receipts; auto-fallback to PIN delivery via SMS; front desk dashboard surfaces undelivered keys |
| R-LOCK-11 | Salto on-prem connector tunnel down at a property | 3 | 3 | 9 | mitigating | Platform SRE | Cloud VPN HA tunnel; synthetic check; per-property circuit; manual override path documented |
| R-LOCK-12 | Encoder USB hardware failure mid-session | 3 | 3 | 9 | mitigating | Lock domain lead | Hot-swap supported; in-flight sessions gracefully closed; manual override printable PIN slip |
| R-LOCK-13 | Vendor SDK introduces native compile incompatibility with Node 20 LTS | 2 | 3 | 6 | monitoring | Lock domain lead | All vendor adapters wrap HTTP, not native SDK, where possible; pinned Node version + Renovate watching SDK releases |
| R-LOCK-14 | Pub/Sub subscription drift causes silent event loss | 2 | 4 | 8 | mitigated | Platform SRE | IaC (Terraform) is source of truth; drift alarm; inbox-lag SLO catches lost subscriptions quickly |
| R-LOCK-15 | Provisional offline credential never reconciles (desktop offline forever) | 2 | 3 | 6 | mitigating | Lock domain lead | LockOfflineCertNoReconcileLong alert; per-tenant SLA for desktop sync; CRL on cert expiry forces local revocation |
| R-LOCK-16 | Tenant operator misconfigures KeyKindPolicy (e.g., disables PIN fallback when mobile fails) | 4 | 2 | 8 | accepted with mitigation | Product | Backoffice UI surfaces fallback consequences; sane defaults; "test issuance" wizard before save |
| R-LOCK-17 | Cross-tenant data leak via RLS bypass bug | 1 | 5 | 5 | mitigating | Security lead | Strict RLS, tested in CI; integration tests assert cross-tenant attempts fail; quarterly RLS audit; TENANT_RLS_ENFORCEMENT=strict in dev |
| R-LOCK-18 | Vendor pricing change (per-call cost) makes a vendor uneconomic | 3 | 2 | 6 | accepted | Product | analytics-service cost dashboard per vendor; multi-vendor optionality |
| R-LOCK-19 | Regulatory change in target market mandates additional audit fields | 3 | 3 | 9 | monitoring | Product + Compliance | lock_audit.payload jsonb is forward-compatible; schema-registry versioning supports adding fields without breaking consumers |
| R-LOCK-20 | Catastrophic loss of offline-issuance signing key in KMS | 1 | 5 | 5 | mitigating | Security lead | KMS auto-rotation; key versions retained; emergency mint-new-cert path tested in DR drill |
| R-LOCK-21 | Subject-deletion request can't be fully honored due to audit retention | 3 | 2 | 6 | accepted | Compliance | Pseudonymization satisfies GDPR Art.17 with retention exception; documented in DPA |
| R-LOCK-22 | Vendor health monitoring false positives → unnecessary circuit-breaks | 3 | 2 | 6 | monitoring | Lock domain lead | Hysteresis + multi-probe consensus; SLO tracks false-positive rate |
| R-LOCK-23 | Migration of a legacy property from one vendor to another (triggered by an R-LOCK-02-style contract break or a commercial decision) | 2 | 3 | 6 | mitigated | Lock domain lead | MIGRATION_PLAN §3 vendor switch; dual-write window |
| R-LOCK-24 | Knowledge concentration on a small team (truck-factor) | 3 | 3 | 9 | mitigating | Engineering manager | Pair rotation; this 18-doc bundle is the canonical knowledge base; quarterly architecture walkthroughs |
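The window padding listed under R-LOCK-09 can be illustrated with a short sketch. It assumes the padding is applied by moving valid_from 5 minutes earlier, so that a lock whose clock runs slightly behind still accepts the credential. The function name and the choice to leave valid_until unchanged are assumptions made for illustration, not the service's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Padding value taken from the R-LOCK-09 mitigation column.
VALID_FROM_PADDING = timedelta(minutes=5)


def pad_valid_window(
    valid_from: datetime, valid_until: datetime
) -> tuple[datetime, datetime]:
    """Return the window to encode on the credential (hypothetical helper).

    valid_from is moved earlier by the padding; valid_until is returned
    unchanged (an assumption of this sketch).
    """
    if valid_from.tzinfo is None or valid_until.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    if valid_until <= valid_from:
        raise ValueError("valid_until must be after valid_from")
    return valid_from - VALID_FROM_PADDING, valid_until


# Usage: a stay starting at 15:00 UTC is encoded as valid from 14:55 UTC.
check_in = datetime(2025, 1, 10, 15, 0, tzinfo=timezone.utc)
check_out = datetime(2025, 1, 12, 11, 0, tzinfo=timezone.utc)
padded_from, padded_until = pad_valid_window(check_in, check_out)
```

Padding valid_from only covers a device clock that runs behind. A clock that runs ahead affects the end of the window instead, which is left to the time-sync mitigation in the same row.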
3. Top-3 quarterly focus
For the current quarter, the three risks scoring ≥ 12, all under active mitigation:
- R-LOCK-01 — Vendor sustained outage. Milestone: ship offline issuance to GA-Offline tier this quarter.
- R-LOCK-06 — Master key off-shift abuse. Milestone: HITL anomaly Decision flow live and reviewed by 100% of pilot tenants.
- R-LOCK-09 — Device time skew. Milestone: per-vendor time-sync command implemented for TTLock + Salto; auto-window-padding live.
4. Process
- Quarterly review: service owner + security lead walk the register, update statuses, add new risks, retire mitigated ones.
- New risk intake: any incident postmortem must check whether a new risk row is warranted; if so, add it with status monitoring and an owner.
- Linkage: every risk references either a FAILURE_MODES row or a SECURITY_MODEL section so mitigations stay grounded in implementation.