property-service — SERVICE_RISK_REGISTER

Companion: SERVICE_READINESS · SECURITY_MODEL · FAILURE_MODES · SYNC_CONTRACT · AI_INTEGRATION

This is the binding risk register for property-service. Each risk has an owner, an exposure score (likelihood × impact), the mitigations currently in place, and the residual exposure after mitigation. Any risk graded High residual must reference a Linear ticket and, where mitigation requires architectural change, an ADR.

Likelihood, per quarter: L (low, ≤ 5%), M (medium, 5–25%), H (high, > 25%). Impact: tenant / cross-tenant / consumer / regulatory.
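
The likelihood bands can be read as a simple threshold rule. The sketch below is illustrative only: the function name `likelihood_band` is hypothetical, and it assumes the estimate is given as a per-quarter probability between 0 and 1.

```python
def likelihood_band(p_per_quarter: float) -> str:
    """Map an estimated per-quarter probability (0.0 to 1.0) to a register band."""
    if not 0.0 <= p_per_quarter <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p_per_quarter <= 0.05:   # L: low, at most 5% per quarter
        return "L"
    if p_per_quarter <= 0.25:   # M: above 5% and up to 25%
        return "M"
    return "H"                  # H: above 25%
```

Boundary values follow the definitions as written: exactly 5% is still L, and exactly 25% is still M.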


1. Register

| # | Risk | Likelihood | Impact | Mitigation | Residual | Owner |
|---|---|---|---|---|---|---|
| R-01 | Cross-tenant data exposure via missing or regressed RLS policy | L | Cross-tenant + regulatory | 4-layer isolation (domain, app, DB RLS FORCEd, outbox guard); nightly audit job; integration tests per table | Low | Service lead + Security |
| R-02 | Property published with stale or incorrect amenities (regional sensitivity, e.g., missing halal_kitchen) | M | Tenant reputation | HITL accept on AI suggestions; canonical amenity registry; UX shows regional pack hints; operator QA checklist before publish | Medium | Product |
| R-03 | Multi-language field rendering bug (mixed RTL/LTR) | M | Tenant + consumer | First-class i18n in DTOs; bidi fixtures in tests; native-script fields stored separately; visual regression via Playwright on BFF | Medium | Frontend platform |
| R-04 | Geocoding inaccuracy → wrong pin on consumer meta map | M | Consumer | Confidence threshold from geo-service; AI fallback gated on operator opt-in + visual confirmation; map preview before publish | Medium | Service lead |
| R-05 | Photo upload of inappropriate content | L | Consumer + regulatory | Scanner via file-storage-service; photos start uploaded and gate on clean verdict; audit + operator-attributed event | Low | Service lead |
| R-06 | AI generates culturally inappropriate description (regional moderation fail) | L | Tenant + consumer + regulatory | Strict moderation in orchestrator with regional rule pack (Pashto/Dari/Persian/Tajik); HITL accept always required; staged-only persistence | Low | AI platform |
| R-07 | Outbox publisher stall causes prolonged downstream drift | L | Tenant + consumer | Lag SLO + alert; backpressure-aware publisher; load-tested 10k backlog drain | Low | SRE |
| R-08 | Pub/Sub regional outage exceeds outbox retention | L | Tenant + consumer | 7-day retention on DLQ; outbox table sized for ≥ 24 h backlog; documented manual replay path; cross-region fallback runbook | Low | SRE |
| R-09 | Sync conflict storm after a buggy desktop release | M | Tenant (operator UX) | lww+diff is bounded by state-machine validation; per-device throttling; conflict-rate alert; rollback playbook for desktop binaries | Medium | Desktop + Service lead |
| R-10 | Sync cursor regression after a service deploy | L | Tenant (offline data) | Canary + auto-rollback; cursor monotonicity test in CI; documented full-reset fallback for clients | Low | SRE |
| R-11 | Long-tail AI quota burn from a single tenant (cost spike) | M | Internal cost | Per-capability daily caps; per-tenant monthly quota; quota dashboard; alert on quota exhaustion spikes | Low | Finance + Service lead |
| R-12 | RoomType / Room status semantic drift between this service, housekeeping-service, and inventory-service | M | Operator confusion + booking errors | Bounded-context contracts published; cross-service E2E tests; periodic invariant review across teams | Medium | Architecture |
| R-13 | Lock device binding mismatch (room ↔ device id) | L | Operator + safety | Single-binding constraint; conflict event rejected; lock-integration owns device truth | Low | Lock-integration team |
| R-14 | GDPR erasure incompleteness for guest-likeness photos | L | Regulatory | containsGuestLikeness=true flagging at upload; subject-erasure consumer archives flagged photos; DPIA on file | Medium | Compliance |
| R-15 | Property archive while hidden reservations remain (consistency) | L | Tenant + guest | Archive precondition checks reservation-service port; archive blocked otherwise; alert if reservation-service port times out | Low | Service lead |
| R-16 | Hot tenant skew (one tenant dominates writes/storage) | M | Internal cost + neighbor noise | Per-tenant cost panel; shared-tenant tier vs dedicated tier (hybrid model); per-tenant rate limits | Medium | Platform |
| R-17 | Schema migration failure in production | L | Tenant (write availability) | Expand → backfill → contract; Cloud Run Job runs migrations pre-traffic; auto-rollback on failure; paired down.sql reviewed at PR | Low | SRE |
| R-18 | Memorystore outage degrades read SLO | L | Tenant + consumer | Cache-miss fall-through; Cloud SQL sized for 3× read load; alert wired | Low | SRE |
| R-19 | OPA bundle stale → policy decisions diverge | L | Operator (denied valid actions) | 30-min freshness check + alert; service holds last-good bundle 1 h; coordinated bundle releases | Low | IAM team |
| R-20 | Vendor lock-in to PostGIS for geo capability | L | Strategic | Geo abstracted behind a port; bbox/nearby tested via the abstraction; alternate provider feasible | Low | Architecture |
| R-21 | AI orchestrator response payload drift | L | Operator (specific capability fails) | Schema validation per capability; alert on schema violations; capability disable flag per tenant | Low | AI platform |
| R-22 | Outbox table growth unbounded if publisher disabled | L | Internal (storage cost) | Publisher health alert + automatic ticket; retention policy after publish (rows older than 30 days archived) | Low | SRE |
| R-23 | Unbounded photo count per property → DOS via storage | L | Internal cost | Per-property photo cap (200) enforced at API; per-tenant total cap; quota events surfaced to billing | Low | Service lead |
| R-24 | Misuse of bulk room create (e.g., 10k rooms) | L | Internal | Hard cap of 200 rooms per request; per-tenant rate limit on bulk; transactional all-or-nothing | Low | Service lead |
| R-25 | Improper handling of tenant.deleted.v1 (loss of data before legal hold) | L | Regulatory | Cascade soft-delete; hard purge deferred until retention window expires; audit row preserved indefinitely | Low | Compliance |
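
As an illustration of the R-24 mitigation (hard cap of 200 rooms per request, all-or-nothing), a request-level guard might look like the sketch below. The names `validate_bulk_room_create` and `BulkCreateRejected` are hypothetical, and the per-item rule (a non-blank room number) is an assumption made only to show the all-or-nothing behaviour; the service's real validation is not described in this register.

```python
from typing import List

MAX_ROOMS_PER_REQUEST = 200  # hard cap stated for R-24


class BulkCreateRejected(Exception):
    """Raised when a bulk request is rejected as a whole."""


def validate_bulk_room_create(room_numbers: List[str]) -> List[str]:
    """Return the list unchanged if every check passes; otherwise reject it all."""
    if len(room_numbers) > MAX_ROOMS_PER_REQUEST:
        raise BulkCreateRejected(
            f"{len(room_numbers)} rooms exceeds the cap of {MAX_ROOMS_PER_REQUEST}"
        )
    # Assumed per-item rule: a room number must be a non-blank string.
    for number in room_numbers:
        if not isinstance(number, str) or not number.strip():
            # All-or-nothing: one bad item rejects the entire request.
            raise BulkCreateRejected(f"invalid room number: {number!r}")
    return room_numbers
```

A request with exactly 200 valid rooms passes; 201 rooms, or any single invalid entry, rejects the whole batch before anything is written.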

2. Risk Themes

2.1 Multi-tenancy

R-01, R-16, R-25. Mitigated by a layered isolation model and a budget for tenant-tier upgrades when noise dominates.

2.2 Domain accuracy in regional markets

R-02, R-03, R-06. Hotel domain quality is the customer-perceived moat; we accept Medium residual on UX-quality risks because the cost of fully removing them (e.g., per-region human review on every publish) outweighs the benefit.

2.3 Consistency with adjacent services

R-12, R-13, R-15. Mitigated by event contracts and integration tests; periodic cross-team contract reviews are mandatory.
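
The R-15 archive precondition can be pictured as a small decision function. This is a sketch under assumptions: `decide_archive` and `ArchiveDecision` are hypothetical names, the reservation-service port is modelled as a plain callable that returns a count of remaining reservations, and treating a timeout as "blocked" (fail closed) is an assumption; the register only says that a timeout raises an alert.

```python
from enum import Enum
from typing import Callable


class ArchiveDecision(Enum):
    ALLOWED = "allowed"
    BLOCKED_RESERVATIONS_REMAIN = "blocked_reservations_remain"
    BLOCKED_PORT_TIMEOUT = "blocked_port_timeout"


def decide_archive(count_remaining_reservations: Callable[[], int]) -> ArchiveDecision:
    """Allow archive only when the port confirms no reservations remain."""
    try:
        remaining = count_remaining_reservations()
    except TimeoutError:
        # Assumed fail-closed behaviour; the alert would be emitted here.
        return ArchiveDecision.BLOCKED_PORT_TIMEOUT
    if remaining > 0:
        return ArchiveDecision.BLOCKED_RESERVATIONS_REMAIN
    return ArchiveDecision.ALLOWED
```

Keeping the port behind a callable makes the precondition easy to exercise in tests with a stub that returns a count or raises `TimeoutError`.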

2.4 AI assist as additive surface

R-06, R-11, R-21. The AI surface is intentionally non-essential; failures degrade UX, never block publish.
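
One way to picture "failures degrade UX, never block publish" together with the per-capability schema validation from R-21 is sketched below. Everything named here is hypothetical: the capability key `description.generate`, its fields, and the function `parse_ai_suggestion` are stand-ins, not the orchestrator's actual contract.

```python
from typing import Any, Dict, Optional

# Hypothetical per-capability schema: required field name -> expected type.
CAPABILITY_SCHEMAS: Dict[str, Dict[str, type]] = {
    "description.generate": {"text": str, "language": str},
}


def parse_ai_suggestion(capability: str, payload: Any) -> Optional[Dict[str, Any]]:
    """Return the validated fields, or None if the payload does not match.

    None means "no suggestion available"; the caller carries on without it.
    """
    schema = CAPABILITY_SCHEMAS.get(capability)
    if schema is None or not isinstance(payload, dict):
        return None
    for field, expected_type in schema.items():
        if not isinstance(payload.get(field), expected_type):
            return None  # schema violation: degrade, do not raise
    # Keep only the known fields so payload drift cannot leak through.
    return {field: payload[field] for field in schema}
```

Because a bad payload yields `None` instead of an exception, the publish flow does not depend on the AI response; a schema-violation alert could be raised at the point where `None` is returned.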

2.5 Sync robustness

R-09, R-10. Offline desktop is a first-class surface; sync risks are owned jointly by the service and desktop teams with a documented rollback playbook for desktop binaries.
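
The cursor monotonicity test listed under R-10 could be built around a check like the one below. It is a sketch with assumptions: cursors are modelled as integers, repeated values are treated as acceptable (non-decreasing), and the name `first_cursor_regression` is hypothetical.

```python
from typing import List, Optional


def first_cursor_regression(cursors: List[int]) -> Optional[int]:
    """Return the index of the first cursor lower than its predecessor, else None."""
    for i in range(1, len(cursors)):
        if cursors[i] < cursors[i - 1]:
            return i
    return None
```

A CI test would feed it the sequence of cursors a client observes across a simulated deploy and assert that the result is `None`.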


3. Review Cadence

  • Quarterly review by service lead + SRE + security; updates land as PRs to this file.
  • Post-incident review automatically updates the matching row whenever an incident maps to a register entry; if no row matches the incident, a new one is opened.
  • Annual full audit with an external compliance reviewer.

4. Change Log

  • 2026-04-22 — Initial register published alongside the v1 service bundle.

Open risks of grade High residual must list mitigation owner, target date, and the linked ADR. The register is the single source of truth; do not maintain a parallel risk list elsewhere.
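
The rule for High residual risks lends itself to an automated check on each register PR. The sketch below assumes the register has been parsed into row objects; the `RiskRow` fields and the function `missing_fields_for_high` are hypothetical and do not describe an existing tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RiskRow:
    risk_id: str
    residual: str                       # "Low", "Medium", or "High"
    mitigation_owner: Optional[str] = None
    target_date: Optional[str] = None   # assumed ISO date string
    adr: Optional[str] = None           # assumed ADR identifier or link


def missing_fields_for_high(row: RiskRow) -> List[str]:
    """List the required fields a High residual row has left empty."""
    if row.residual != "High":
        return []  # the rule applies only to High residual risks
    required = {
        "mitigation_owner": row.mitigation_owner,
        "target_date": row.target_date,
        "adr": row.adr,
    }
    return [name for name, value in required.items() if not value]
```

A PR check could fail whenever any row returns a non-empty list, which keeps the rule enforced in the register itself instead of in a parallel list.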