
housekeeping-service — SERVICE_OVERVIEW

Bounded Context: Housekeeping (Core) · Cloud: GCP · Desktop: Electron offline-first · Phase: 0 · Naming: MELMASTOON.HOUSEKEEPING.* / melmastoon.housekeeping.*

housekeeping-service is the single authority for the room-status lifecycle and the housekeeping task lifecycle of every property a Ghasi Melmastoon tenant operates. Every checkout, every mid-stay clean, every "is room 204 ready?" question lands on this service. It is invoked synchronously from the Electron desktop board and the tenant booking surface, and asynchronously from reservation-service, staff-service, maintenance-service, and ai-orchestrator-service over Pub/Sub.

This document is the front door of the service bundle. The other 16 documents fan out from here into domain, data, contracts, ops, and risk.


1. Mission

"When a guest checks out, the room becomes ready for the next guest as fast as is operationally honest — and the front desk knows the moment that flip happens."

The service is responsible for making three numbers visible and movable on a small/medium hotel's desktop board:

  1. time_to_ready: minutes from checkout to room.status = ready.
  2. pending_tasks_at_arrival_window: number of unfinished turnover tasks for rooms with a check-in expected within the next 90 minutes.
  3. linen_runway_minutes: minutes of cleaning a property can sustain before its current linen stock blocks the next assignment.

If those three numbers move in the wrong direction, the service emits the right signals (shift.staffing_gap_detected.v1, linen.low_stock_alert.v1, task.escalated.v1) so the supervisor can intervene.
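
To make the first two numbers concrete, here is a minimal TypeScript sketch of how they could be derived. The TurnoverTask and ExpectedArrival shapes and both function names are hypothetical, chosen for illustration only; the real field names are defined in DATA_MODEL.md.

```typescript
// Hypothetical shapes for illustration; not the service's actual schema.
interface TurnoverTask {
  roomId: string;
  status: "pending" | "in_progress" | "done";
}

interface ExpectedArrival {
  roomId: string;
  expectedAt: Date;
}

const ARRIVAL_WINDOW_MINUTES = 90;
const MS_PER_MINUTE = 60000;

/** time_to_ready: whole minutes from checkout to the flip to "ready". */
function timeToReadyMinutes(checkedOutAt: Date, readyAt: Date): number {
  return Math.round((readyAt.getTime() - checkedOutAt.getTime()) / MS_PER_MINUTE);
}

/** pending_tasks_at_arrival_window: unfinished tasks for rooms arriving soon. */
function pendingTasksAtArrivalWindow(
  tasks: TurnoverTask[],
  arrivals: ExpectedArrival[],
  now: Date,
): number {
  const start = now.getTime();
  const end = start + ARRIVAL_WINDOW_MINUTES * MS_PER_MINUTE;
  // Rooms with a check-in expected inside [now, now + 90 min].
  const roomsArriving = new Set(
    arrivals
      .filter((a) => a.expectedAt.getTime() >= start && a.expectedAt.getTime() <= end)
      .map((a) => a.roomId),
  );
  return tasks.filter((t) => t.status !== "done" && roomsArriving.has(t.roomId)).length;
}
```

A room whose check-in is two hours away does not count toward the second number, even if its task is still pending.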

2. Bounded context

  • Context name: Housekeeping
  • Type: Core
  • Owns: room-status flag (clean, dirty, cleaning, inspected, ready, out_of_order, out_of_service); task lifecycle for cleaning/inspection; checklists; lost & found; linen light stock; housekeeping-scoped projection of staff shifts.
  • Does not own: physical room definition (property-service); reservation state (reservation-service); maintenance work orders (maintenance-service); staff identity or master shift schedule (staff-service); long-term out_of_service declaration (property-service — we only echo it).
  • Conformist toward: property-service (room IDs, room types), staff-service (staff IDs, shifts, languages, skills).
  • Customer–supplier: we are the supplier to bff-backoffice-service (board), bff-tenant-booking-service (mid-stay request ack), notification-service (alerts), analytics-service (stats), search-aggregation-service (room-readiness facet).

3. Aggregates owned

| Aggregate | Identity | Notes |
|---|---|---|
| HousekeepingTask | hkt_<ulid> | Root aggregate. Lifecycle, assignment, checklist binding, linen issuance, outcome. |
| RoomStatus | composite key (tenant_id, property_id, room_id) | Singleton per room. Last-flip audit. |
| CleaningChecklist | chl_<ulid> | Versioned template; immutable once published; bound at task assignment. |
| Inspection | ins_<ulid> | 0..1 per task; pass/fail with photo evidence. |
| LinenInventory | lin_<ulid> | One per linen line per property. |
| LostAndFound | laf_<ulid> | 0..N per room/reservation; lifecycle: recorded → matched |
| StaffShiftAssignment | composite ULID | Housekeeping projection of staff-service shift. |
| RoomBlock | composite ULID | We own cleaning/inspection/oos reasons; maintenance reason owned by maintenance-service. |

Detailed invariants in DOMAIN_MODEL.md.
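
As an illustration of the prefixed-ULID identities above, the following sketch validates an ID against its aggregate's prefix. The prefixes come from the table; the helper name isAggregateId and the choice to accept only canonical uppercase ULIDs are assumptions of this sketch, not documented behaviour.

```typescript
// Prefixes are taken from the aggregate table; the helper itself is hypothetical.
const ID_PREFIXES = {
  HousekeepingTask: "hkt",
  CleaningChecklist: "chl",
  Inspection: "ins",
  LinenInventory: "lin",
  LostAndFound: "laf",
} as const;

type PrefixedAggregate = keyof typeof ID_PREFIXES;

// A ULID is 26 Crockford Base32 characters (no I, L, O, U); the first is 0-7.
// ASSUMPTION: only the canonical uppercase form is accepted here.
const ULID_PATTERN = /^[0-7][0-9A-HJKMNP-TV-Z]{25}$/;

function isAggregateId(kind: PrefixedAggregate, id: string): boolean {
  const prefix = ID_PREFIXES[kind] + "_";
  if (id.indexOf(prefix) !== 0) return false;
  return ULID_PATTERN.test(id.slice(prefix.length));
}
```

Checking the prefix at the API boundary would catch a task ID passed where an inspection ID is expected before any database lookup happens.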

4. Key responsibilities

  1. Auto-create turnover tasks on melmastoon.reservation.checked_out.v1.
  2. Manual + AI-assisted assignment, drag-and-drop on the desktop board.
  3. Task progress lifecycle with append-only audit.
  4. Room-status state machine with side branches for OOO/OOS.
  5. Mid-stay cleaning scheduling and ad-hoc requests.
  6. Inspection workflow (per-tenant opt-in).
  7. Lost & found capture, match, dispose.
  8. Linen light stock with low-watermark alerts.
  9. Shift staffing-gap detection.
  10. Checklist templates (versioned, tenant-scoped).
  11. Offline-first desktop board with conflict-aware sync.
  12. Pashto/Dari/Persian-first UI strings (i18n owned by the client; this service stores locale_hint on tasks for downstream notifications).
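
The room-status state machine from item 4 could be encoded as a transition table along the following lines. The status names come from section 2; the specific transitions shown here are an assumption made purely for illustration, and the authoritative rules are in DOMAIN_MODEL.md.

```typescript
type RoomStatus =
  | "clean"
  | "dirty"
  | "cleaning"
  | "inspected"
  | "ready"
  | "out_of_order"
  | "out_of_service";

// ASSUMPTION: this transition table is illustrative only. The real invariants
// are defined in DOMAIN_MODEL.md and may differ.
const ALLOWED_TRANSITIONS: Record<RoomStatus, readonly RoomStatus[]> = {
  dirty: ["cleaning", "out_of_order", "out_of_service"],
  cleaning: ["clean", "dirty", "out_of_order", "out_of_service"],
  clean: ["inspected", "ready", "dirty", "out_of_order", "out_of_service"],
  inspected: ["ready", "dirty", "out_of_order", "out_of_service"],
  ready: ["dirty", "out_of_order", "out_of_service"],
  // Side branches: a blocked room re-enters the main flow as dirty.
  out_of_order: ["dirty"],
  out_of_service: ["dirty"],
};

function canTransition(from: RoomStatus, to: RoomStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

/** Returns the new status, or throws if the flip is not allowed. */
function transition(from: RoomStatus, to: RoomStatus): RoomStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal room-status transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the table as data rather than branching logic makes it straightforward to audit each flip and to reject invalid flips that arrive from offline desktops during sync.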

5. Where it sits in the topology

                        ┌────────────────────────────────────────────────┐
                        │                    Pub/Sub                     │
                        │  reservation.checked_out.v1                    │
                        │  reservation.early_checkout.v1                 │
                        │  reservation.modification.requested.v1         │
                        │  maintenance.work_order.completed.v1           │
                        │  staff.shift.started.v1 / .ended.v1            │
                        │  ai_orchestrator.suggestion.housekeeping_*.v1  │
                        └─────────────┬──────────────────────────────────┘
                                      │ push (OIDC)
bff-backoffice ──REST──▶ ┌───────────────────────────────┐ ──REST──▶ ai-orchestrator
                         │     housekeeping-service      │ ──REST──▶ property-service (read room)
bff-tenant-booking ─REST▶│  (Cloud Run, 2..N replicas)   │ ──REST──▶ staff-service (read staff)
                         └────────────┬──────────────────┘
                                      │ outbox → Pub/Sub
                          ┌───────────┴───────────────────────────┐
                          ▼                                       ▼
        housekeeping.task.*.v1                housekeeping.room.maintenance_required.v1
        housekeeping.room.status_changed.v1   housekeeping.linen.low_stock_alert.v1
        housekeeping.inspection.*.v1          housekeeping.shift.staffing_gap_detected.v1
        housekeeping.lost_item.*.v1           housekeeping.checklist.template_updated.v1

6. Hotel-specific design notes

  • Small/medium properties (10–80 rooms, 2–6 housekeepers/shift) dominate our target market. The service does not assume per-staff smartphones; the desktop board is the source of truth.
  • Routing optimization matters operationally: walking from floor 4 to floor 1 and back to floor 4 wastes 5–8 minutes per swap. The router groups tasks by floor/wing.
  • Linen is sometimes issued per task (towels, sheets counted out at the linen closet). We track issued + returned counts; mismatches feed analytics but do not block completion.
  • Pashto, Dari, or Persian is the working language for housekeeping in many target properties. We persist locale_hint so notifications and printed checklists go out in the right language.
  • Configurable checklists per tenant (turnover vs deep clean vs maintenance check vs post-renovation). Checklists are versioned; once a task is assigned, the checklist version is frozen.
  • Check-in priority queue. When the front desk has a guest waiting and the room is not yet ready, the desk can press "needs now". The task is bumped to priority=urgent, and a task.escalated.v1 event is published if the task has not progressed within 5 minutes.
  • Offline operation is common. Some staff areas have no Wi-Fi. The desktop allows status flips and task completion against local SQLite; sync resolves conflicts per the policy table in SYNC_CONTRACT.md.
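
The floor/wing grouping described above can be sketched as a simple sort. The RoutableTask shape and the function name are hypothetical, and placing urgent ("needs now") tasks ahead of location grouping is an assumption of this sketch rather than documented router behaviour.

```typescript
// Hypothetical task shape; floor and wing would come from property-service room data.
interface RoutableTask {
  taskId: string;
  roomId: string;
  floor: number;
  wing: string;
  urgent: boolean;
}

/** Orders tasks so one floor/wing is finished before moving to the next. */
function orderByFloorAndWing(tasks: RoutableTask[]): RoutableTask[] {
  // Copy first so the caller's array is not mutated.
  return tasks.slice().sort((a, b) => {
    // ASSUMPTION: urgent tasks jump the queue regardless of location.
    if (a.urgent !== b.urgent) return a.urgent ? -1 : 1;
    if (a.floor !== b.floor) return a.floor - b.floor;
    if (a.wing !== b.wing) return a.wing < b.wing ? -1 : 1;
    if (a.roomId === b.roomId) return 0;
    return a.roomId < b.roomId ? -1 : 1;
  });
}
```

With tasks on floors 4, 1, and 4, this ordering visits floor 1 once and then both floor 4 rooms together, which avoids the back-and-forth walk described in the routing note.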

7. Storage at a glance

  • Cloud SQL Postgres (regional, HA) — shared schema with tenant_id RLS on every table.
  • Monthly partitioning on housekeeping_tasks (by created_at); partition pruning enforced on hot reads.
  • Outbox table in the same Postgres instance, written in the same transaction as the domain writes; a sidecar relay drains it to Pub/Sub.
  • Inbox for idempotent consumption of upstream events (dedupe key (topic, message_id)).
  • No long-term object storage owned here — inspection photos are uploaded directly to the central GCS via signed URLs minted by media-service.
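
The inbox dedupe on (topic, message_id) can be illustrated with an in-memory stand-in. The Inbox class and handleOnce helper are hypothetical names; in the real service the dedupe would presumably be a unique constraint on a Postgres table, with the inbox insert and the handler's writes sharing one transaction.

```typescript
// In-memory stand-in for the inbox table (illustrative only).
class Inbox {
  private readonly seen = new Set<string>();

  /** Returns true only the first time a (topic, messageId) pair is recorded. */
  tryRecord(topic: string, messageId: string): boolean {
    // JSON-encode the pair so "a" + "b|c" and "a|b" + "c" cannot collide.
    const key = JSON.stringify([topic, messageId]);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

/** Runs the handler at most once per (topic, messageId). */
function handleOnce(
  inbox: Inbox,
  topic: string,
  messageId: string,
  handler: () => void,
): "processed" | "duplicate" {
  if (!inbox.tryRecord(topic, messageId)) return "duplicate";
  handler();
  return "processed";
}
```

Because Pub/Sub push delivery is at-least-once, a redelivered message is acknowledged as a duplicate without creating a second turnover task.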

8. APIs

REST under /api/v1/housekeeping/*. Versioning, errors (RFC 7807 with MELMASTOON.HOUSEKEEPING.* codes), pagination, idempotency — see API_CONTRACTS.md.
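
A minimal sketch of an RFC 7807 error body carrying a namespaced code follows. The core members (type, title, status, detail, instance) are from RFC 7807; the extension member name code, the TASK_NOT_FOUND suffix, and the example path are assumptions for illustration, and the actual shape is specified in API_CONTRACTS.md.

```typescript
// Core members follow RFC 7807. `code` is an extension member whose name is
// an assumption made for this sketch.
interface HousekeepingProblem {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  code: string;
}

const CODE_NAMESPACE = "MELMASTOON.HOUSEKEEPING";

function buildProblem(
  status: number,
  codeSuffix: string, // e.g. "TASK_NOT_FOUND" (hypothetical code)
  title: string,
  detail?: string,
  instance?: string,
): HousekeepingProblem {
  const problem: HousekeepingProblem = {
    type: "about:blank", // RFC 7807 default when no specific type URI is defined
    title,
    status,
    code: `${CODE_NAMESPACE}.${codeSuffix}`,
  };
  if (detail !== undefined) problem.detail = detail;
  if (instance !== undefined) problem.instance = instance;
  return problem;
}

// Would be served with Content-Type: application/problem+json.
const notFound = buildProblem(
  404,
  "TASK_NOT_FOUND",
  "Not Found",
  "No task with the given ID exists for this tenant.",
  "/api/v1/housekeeping/tasks/hkt_01ARZ3NDEKTSV4RRFFQ69G5FAV", // hypothetical path
);
```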

Internal event-handler endpoints under /internal/events/*, authenticated by verifying the OIDC token that Pub/Sub attaches to each push request.

Sync endpoints under /sync/v1/*, callable only from authenticated Electron desktop sessions; details in SYNC_CONTRACT.md.

9. Events

20 published topics, 9 consumed topics. Full schemas, JSON examples, retention, and DLQ policy in EVENT_SCHEMAS.md.

10. Non-functional requirements

| NFR | Target | Verified by |
|---|---|---|
| Turnover task auto-create latency | < 2 s p95 (event in → task row + outbox row committed) | turnover-saga.spec.ts, k6 weekly |
| Board read p99 | < 250 ms for 200 active rooms | k6 weekly + Datadog SLO |
| Board write p99 (drag-drop reassign) | < 400 ms incl. outbox commit | k6 weekly |
| Availability | 99.9% monthly | SLO budget in slo.yaml |
| Tenant isolation | RLS verified on every table | tenant-isolation.spec.ts |
| Cold-start (Cloud Run) | < 1.5 s (min instances = 2) | deploy gate |
| Replicas | 2..N (autoscale on CPU + concurrency) | cloudrun.yaml |
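
To put the availability target in operational terms, the sketch below converts an SLO into a downtime budget. The function name is hypothetical and a 30-day month is assumed; the actual budget definition lives in slo.yaml.

```typescript
/**
 * Downtime allowed by an availability SLO over a window, in minutes.
 * ASSUMPTION: the window is expressed in whole days (30 for a "month").
 */
function errorBudgetMinutes(slo: number, windowDays: number): number {
  if (slo <= 0 || slo > 1) {
    throw new RangeError("slo must be in (0, 1]");
  }
  const totalMinutes = windowDays * 24 * 60;
  return (1 - slo) * totalMinutes;
}

// 99.9% over 30 days leaves roughly 43 minutes of allowed downtime.
const monthlyBudget = errorBudgetMinutes(0.999, 30);
```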

11. Operational topology

  • Hot path service: Cloud Run, min=2, max=20 (autoscale on concurrency=80), region asia-south1 primary, asia-southeast1 warm-standby (DR).
  • Shift-staffing-gap worker: Cloud Run Job, cron every 60 s.
  • Lost-and-found auto-dispose worker: Cloud Run Job, cron daily at 03:00 tenant TZ (per-tenant fan-out).
  • Outbox relay: sidecar-style relay co-located in the hot-path container (a lightweight background poller in Node).

Detail in DEPLOYMENT_TOPOLOGY.md.
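
One polling pass of an outbox relay might look like the sketch below. It uses an in-memory array in place of the outbox table and an injected publish function in place of a Pub/Sub client; OutboxRow, drainOnce, and the batch size of 50 are all hypothetical and not taken from the service's code.

```typescript
// In-memory stand-in for outbox rows (illustrative only).
interface OutboxRow {
  id: number;
  topic: string;
  payload: string;
  publishedAt: Date | null;
}

type PublishFn = (topic: string, payload: string) => Promise<void>;

/** Publishes unpublished rows in id order; returns how many were sent. */
async function drainOnce(
  rows: OutboxRow[],
  publish: PublishFn,
  batchSize: number = 50,
): Promise<number> {
  const pending = rows
    .filter((r) => r.publishedAt === null)
    .sort((a, b) => a.id - b.id)
    .slice(0, batchSize);

  let sent = 0;
  for (const row of pending) {
    try {
      await publish(row.topic, row.payload);
      // Mark only after a successful publish so failures are retried.
      row.publishedAt = new Date();
      sent += 1;
    } catch {
      // Stop at the first failure to preserve ordering; the next tick retries.
      break;
    }
  }
  return sent;
}
```

Marking a row after publishing means a crash between the two steps causes a republish, so downstream consumers need to be idempotent.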

12. Where to go next

| You want to know… | Read |
|---|---|
| Domain types and invariants | DOMAIN_MODEL.md |
| Use-cases and orchestration | APPLICATION_LOGIC.md |
| REST contracts | API_CONTRACTS.md |
| Event schemas | EVENT_SCHEMAS.md |
| Postgres DDL | DATA_MODEL.md |
| Desktop sync | SYNC_CONTRACT.md |
| AI routing port | AI_INTEGRATION.md |
| AuthZ + RLS | SECURITY_MODEL.md |
| Logs / metrics / traces | OBSERVABILITY.md |
| Tests | TESTING_STRATEGY.md |
| Cloud Run topology | DEPLOYMENT_TOPOLOGY.md |
| Failure scenarios | FAILURE_MODES.md |
| Local dev | LOCAL_DEV_SETUP.md |
| Readiness checklist | SERVICE_READINESS.md |
| Open risks | SERVICE_RISK_REGISTER.md |
| Migration policy | MIGRATION_PLAN.md |
| Public summary | ../../docs/03-microservices/housekeeping-service.md |