property-service — SERVICE_RISK_REGISTER

Companion: SERVICE_READINESS · SECURITY_MODEL · FAILURE_MODES · SYNC_CONTRACT · AI_INTEGRATION

This is the binding risk register for property-service. Each risk has an owner, an exposure score (likelihood × impact), the mitigations currently in place, and the residual exposure after mitigation. Any risk with a High residual grade must reference a Linear ticket, and an ADR if the mitigation requires architectural change.

Likelihood (per quarter): L (low, ≤ 5%), M (medium, 5–25%), H (high, > 25%). Impact: tenant / cross-tenant / consumer / regulatory.
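The likelihood bands can be expressed as a small helper. This is a minimal sketch, assuming all three bands are per-quarter probabilities and that the boundary values follow the inequalities as written (exactly 5% is L, exactly 25% is M). The function name `likelihood_grade` is hypothetical and not part of the service.

```python
def likelihood_grade(quarterly_probability: float) -> str:
    """Map a per-quarter probability (0.0 to 1.0) to the register's L/M/H bands."""
    if not 0.0 <= quarterly_probability <= 1.0:
        raise ValueError("probability must be between 0.0 and 1.0")
    if quarterly_probability <= 0.05:
        return "L"  # low: at most 5% per quarter
    if quarterly_probability <= 0.25:
        return "M"  # medium: above 5% and up to 25%
    return "H"      # high: above 25%
```

For example, an estimated 10% chance per quarter would be recorded as `M`.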

1. Register

#	Risk	Likelihood	Impact	Mitigation	Residual	Owner
R-01	Cross-tenant data exposure via missing or regressed RLS policy	L	Cross-tenant + regulatory	4-layer isolation (domain, app, DB RLS `FORCE`d, outbox guard); nightly audit job; integration tests per table	Low	Service lead + Security
R-02	Property published with stale or incorrect amenities (regional sensitivity, e.g., missing `halal_kitchen`)	M	Tenant reputation	HITL accept on AI suggestions; canonical amenity registry; UX shows regional pack hints; operator QA checklist before publish	Medium	Product
R-03	Multi-language field rendering bug (mixed RTL/LTR)	M	Tenant + consumer	First-class i18n in DTOs; bidi fixtures in tests; native-script fields stored separately; visual regression via Playwright on BFF	Medium	Frontend platform
R-04	Geocoding inaccuracy → wrong pin on consumer meta map	M	Consumer	Confidence threshold from `geo-service`; AI fallback gated on operator opt-in + visual confirmation; map preview before publish	Medium	Service lead
R-05	Photo upload of inappropriate content	L	Consumer + regulatory	Scanner via `file-storage-service`; photos start `uploaded` and gate on `clean` verdict; audit + operator-attributed event	Low	Service lead
R-06	AI generates culturally inappropriate description (regional moderation fail)	L	Tenant + consumer + regulatory	Strict moderation in orchestrator with regional rule pack (Pashto/Dari/Persian/Tajik); HITL accept always required; staged-only persistence	Low	AI platform
R-07	Outbox publisher stall causes prolonged downstream drift	L	Tenant + consumer	Lag SLO + alert; backpressure-aware publisher; load-tested 10k backlog drain	Low	SRE
R-08	Pub/Sub regional outage exceeds outbox retention	L	Tenant + consumer	7-day retention on DLQ; outbox table sized for ≥ 24 h backlog; documented manual replay path; cross-region fallback runbook	Low	SRE
R-09	Sync conflict storm after a buggy desktop release	M	Tenant (operator UX)	`lww+diff` is bounded by state-machine validation; per-device throttling; conflict-rate alert; rollback playbook for desktop binaries	Medium	Desktop + Service lead
R-10	Sync cursor regression after a service deploy	L	Tenant (offline data)	Canary + auto-rollback; cursor monotonicity test in CI; documented full-reset fallback for clients	Low	SRE
R-11	Long-tail AI quota burn from a single tenant (cost spike)	M	Internal cost	Per-capability daily caps; per-tenant monthly quota; quota dashboard; alert on quota exhaustion spikes	Low	Finance + Service lead
R-12	RoomType / Room status semantic drift between this service, `housekeeping-service`, and `inventory-service`	M	Operator confusion + booking errors	Bounded-context contracts published; cross-service E2E tests; periodic invariant review across teams	Medium	Architecture
R-13	Lock device binding mismatch (room ↔ device id)	L	Operator + safety	Single-binding constraint; conflict event rejected; lock-integration owns device truth	Low	Lock-integration team
R-14	GDPR erasure incompleteness for guest-likeness photos	L	Regulatory	`containsGuestLikeness=true` flagging at upload; subject-erasure consumer archives flagged photos; DPIA on file	Medium	Compliance
R-15	Property archive while hidden reservations remain (consistency)	L	Tenant + guest	Archive precondition checks `reservation-service` port; archive blocked otherwise; alert if `reservation-service` port times out	Low	Service lead
R-16	Hot tenant skew (one tenant dominates writes/storage)	M	Internal cost + neighbor noise	Per-tenant cost panel; shared-tenant tier vs dedicated tier (hybrid model); per-tenant rate limits	Medium	Platform
R-17	Schema migration failure in production	L	Tenant (write availability)	Expand → backfill → contract; Cloud Run Job runs migrations pre-traffic; auto-rollback on failure; paired `down.sql` reviewed at PR	Low	SRE
R-18	Memorystore outage degrades read SLO	L	Tenant + consumer	Cache-miss fall-through; Cloud SQL sized for 3× read load; alert wired	Low	SRE
R-19	OPA bundle stale → policy decisions diverge	L	Operator (denied valid actions)	30-min freshness check + alert; service holds last-good bundle 1 h; coordinated bundle releases	Low	IAM team
R-20	Vendor lock-in to PostGIS for geo capability	L	Strategic	Geo abstracted behind a port; bbox/nearby tested via the abstraction; alternate provider feasible	Low	Architecture
R-21	AI orchestrator response payload drift	L	Operator (specific capability fails)	Schema validation per capability; alert on schema violations; capability disable flag per tenant	Low	AI platform
R-22	Outbox table growth unbounded if publisher disabled	L	Internal (storage cost)	Publisher health alert + automatic ticket; retention policy after publish (rows older than 30 days archived)	Low	SRE
R-23	Unbounded photo count per property → DoS via storage	L	Internal cost	Per-property photo cap (200) enforced at API; per-tenant total cap; quota events surfaced to billing	Low	Service lead
R-24	Misuse of bulk room create (e.g., 10k rooms)	L	Internal	Hard cap of 200 rooms per request; per-tenant rate limit on bulk; transactional all-or-nothing	Low	Service lead
R-25	Improper handling of `tenant.deleted.v1` (loss of data before legal hold)	L	Regulatory	Cascade soft-delete; hard purge deferred until retention window expires; audit row preserved indefinitely	Low	Compliance
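As one illustration of a row in the register, the R-24 mitigation (hard cap of 200 rooms per request, all-or-nothing) might look roughly like the sketch below. It models the room store as an in-memory set. The names `bulk_create_rooms` and `BulkCreateRejected`, and the duplicate and clash checks, are assumptions for illustration only. The real service would enforce this inside a database transaction.

```python
from typing import List, Set

MAX_ROOMS_PER_REQUEST = 200  # hard cap stated in R-24


class BulkCreateRejected(Exception):
    """Raised when a bulk request is refused as a whole."""


def bulk_create_rooms(existing: Set[str], requested: List[str]) -> Set[str]:
    """Return the room set after the bulk create, or raise without changing anything."""
    if len(requested) > MAX_ROOMS_PER_REQUEST:
        raise BulkCreateRejected(
            f"{len(requested)} rooms requested, cap is {MAX_ROOMS_PER_REQUEST}"
        )
    # Assumed rule: duplicates inside the request reject the whole batch.
    if len(set(requested)) != len(requested):
        raise BulkCreateRejected("duplicate room numbers in request")
    # Assumed rule: a clash with an existing room rejects the whole batch.
    clashes = existing.intersection(requested)
    if clashes:
        raise BulkCreateRejected(f"rooms already exist: {sorted(clashes)}")
    # Validation is complete before any state is produced, so the caller
    # sees either every room created or none.
    return existing | set(requested)
```

The input set is never mutated, which keeps a rejected request free of partial writes.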

2. Risk Themes

2.1 Multi-tenancy

R-01, R-16, R-25. Mitigated by a layered isolation model and a budget for tenant-tier upgrades when noise dominates.

2.2 Domain accuracy in regional markets

R-02, R-03, R-06. Hotel domain quality is the customer-perceived moat; we accept Medium residual on UX-quality risks because the cost of fully removing them (e.g., per-region human review on every publish) outweighs the benefit.

2.3 Consistency with adjacent services

R-12, R-13, R-15. Mitigated by event contracts and integration tests; periodic cross-team contract reviews are mandatory.
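The archive precondition in R-15 could be sketched as follows. This is a hedged illustration: `can_archive`, `count_open_reservations`, and `alert` are hypothetical names standing in for the `reservation-service` port and the alerting hook, and blocking the archive when the port times out is an assumption (the register only states that an alert fires).

```python
from typing import Callable


def can_archive(
    property_id: str,
    count_open_reservations: Callable[[str], int],
    alert: Callable[[str], None],
) -> bool:
    """Allow archive only when the reservation port reports nothing remaining."""
    try:
        remaining = count_open_reservations(property_id)
    except TimeoutError:
        # Assumption: fail closed. An unanswered port blocks the archive
        # and raises an alert instead of guessing.
        alert(f"reservation port timed out for property {property_id}")
        return False
    return remaining == 0
```

Failing closed trades a delayed archive for the guarantee that no reservation is orphaned.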

2.4 AI assist as additive surface

R-06, R-11, R-21. The AI surface is intentionally non-essential; failures degrade UX, never block publish.
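The "degrade, never block" stance can be illustrated with a wrapper that turns every AI failure into "no suggestion". This is a sketch under assumptions: `ai_suggestion_or_none`, `call_orchestrator`, and the `REQUIRED_KEYS` schema are hypothetical, and the real per-capability schema validation (R-21) is likely richer than a key check.

```python
from typing import Callable, Optional

REQUIRED_KEYS = {"description", "language"}  # hypothetical capability schema


def ai_suggestion_or_none(
    call_orchestrator: Callable[[], dict],
    capability_enabled: bool,
) -> Optional[dict]:
    """Return a staged AI suggestion, or None when the AI path is unavailable."""
    if not capability_enabled:
        return None  # per-tenant capability disable flag (R-21)
    try:
        payload = call_orchestrator()
    except Exception:
        return None  # orchestrator failure degrades to "no suggestion"
    if not isinstance(payload, dict) or not REQUIRED_KEYS.issubset(payload):
        return None  # payload drift is dropped rather than persisted
    return payload   # still staged only; an operator must accept it
```

A caller treats `None` as "show the form without AI hints", so publish proceeds on operator-entered data alone.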

2.5 Sync robustness

R-09, R-10. Offline desktop is a first-class surface; sync risks are owned jointly by the service and desktop teams with a documented rollback playbook for desktop binaries.
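The cursor monotonicity test named under R-10 might reduce to a check like the one below. This is a minimal sketch, assuming sync cursors are integers that may repeat but must never move backwards; the actual cursor format is not specified in this register, and `first_cursor_regression` is a hypothetical helper.

```python
from typing import Optional, Sequence


def first_cursor_regression(cursors: Sequence[int]) -> Optional[int]:
    """Return the index of the first cursor lower than its predecessor, else None."""
    for index in range(1, len(cursors)):
        if cursors[index] < cursors[index - 1]:
            return index
    return None
```

A CI test could record the cursors a client observes before and after a deploy and assert that this function returns `None`.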

3. Review Cadence

Quarterly review by service lead + SRE + security; updates land as PRs to this file.
Post-incident review automatically appends or updates a row whenever an incident maps to a register entry; if an incident has no row, a new one is opened.
Annual full audit with an external compliance reviewer.

4. Change Log

2026-04-22 — Initial register published alongside the v1 service bundle.

Open risks with a High residual grade must list a mitigation owner, a target date, and the linked ADR. The register is the single source of truth; do not maintain a parallel risk list elsewhere.
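The rule for High residual risks lends itself to a mechanical check on register PRs. The sketch below is one possible shape, not an existing tool: the `RiskRow` fields and `missing_fields` are hypothetical, and it assumes the ADR is required only when the mitigation needs architectural change, as stated in the introduction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RiskRow:
    risk_id: str
    residual: str                       # "Low", "Medium", or "High"
    owner: str = ""
    target_date: Optional[str] = None   # e.g. an ISO date string
    linear_ticket: Optional[str] = None
    adr: Optional[str] = None
    needs_architectural_change: bool = False


def missing_fields(row: RiskRow) -> List[str]:
    """List the fields a High residual row still lacks; empty for other grades."""
    if row.residual != "High":
        return []
    missing = []
    if not row.owner:
        missing.append("owner")
    if not row.target_date:
        missing.append("target_date")
    if not row.linear_ticket:
        missing.append("linear_ticket")
    # Assumption: the ADR is mandatory only for architectural mitigations.
    if row.needs_architectural_change and not row.adr:
        missing.append("adr")
    return missing
```

For a hypothetical row `RiskRow("R-99", "High", owner="SRE")`, the check would report the target date and Linear ticket as missing.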
