housekeeping-service — SERVICE_OVERVIEW
Bounded Context: Housekeeping (Core) · Cloud: GCP · Desktop: Electron offline-first · Phase: 0 · Naming: `MELMASTOON.HOUSEKEEPING.*` / `melmastoon.housekeeping.*`
housekeeping-service is the single authority for the room-status lifecycle and the housekeeping task lifecycle of every property a Ghasi Melmastoon tenant operates. Every checkout, every mid-stay clean, every "is room 204 ready?" question lands on this service. It is invoked synchronously from the Electron desktop board and the tenant booking surface, and asynchronously from reservation-service, staff-service, maintenance-service, and ai-orchestrator-service over Pub/Sub.
This document is the front door of the service bundle. The other 16 documents fan out from here into domain, data, contracts, ops, and risk.
1. Mission
"When a guest checks out, the room becomes ready for the next guest as fast as is operationally honest — and the front desk knows the moment that flip happens."
The service is responsible for making three numbers visible and movable on a small/medium hotel's desktop board:
- `time_to_ready`: minutes from checkout to `room.status = ready`.
- `pending_tasks_at_arrival_window`: number of unfinished turnover tasks within the next 90 minutes of expected check-ins.
- `linen_runway_minutes`: minutes of cleaning a property can sustain before its current linen stock blocks the next assignment.
If those three numbers move in the wrong direction, the service emits the right signals (shift.staffing_gap_detected.v1, linen.low_stock_alert.v1, task.escalated.v1) so the supervisor can intervene.
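As an illustration only, the three numbers above could be derived roughly as follows. The `TurnoverTask` shape, the function names, and the linen model (whole linen sets per task multiplied by an average task duration) are assumptions made for this sketch; the authoritative definitions belong in `DOMAIN_MODEL.md`.

```typescript
// Hypothetical task shape for illustration; not the service's real model.
interface TurnoverTask {
  roomId: string;
  done: boolean;
  expectedCheckInAt: Date | null; // next expected arrival for the room, if any
}

const MS_PER_MINUTE = 60_000;

/** time_to_ready: minutes from checkout to the flip to room.status = ready. */
function timeToReady(checkedOutAt: Date, readyAt: Date): number {
  return (readyAt.getTime() - checkedOutAt.getTime()) / MS_PER_MINUTE;
}

/** pending_tasks_at_arrival_window: unfinished tasks with a check-in inside the window. */
function pendingTasksAtArrivalWindow(
  tasks: TurnoverTask[],
  now: Date,
  windowMinutes = 90,
): number {
  const horizon = now.getTime() + windowMinutes * MS_PER_MINUTE;
  return tasks.filter((t) => {
    if (t.done || t.expectedCheckInAt === null) return false;
    const at = t.expectedCheckInAt.getTime();
    return at >= now.getTime() && at <= horizon;
  }).length;
}

/** linen_runway_minutes under an assumed model: full sets left x average task time. */
function linenRunwayMinutes(
  setsInStock: number,
  setsPerTask: number,
  avgTaskMinutes: number,
): number {
  if (setsPerTask <= 0) return Infinity; // tasks that need no linen never block
  return Math.floor(setsInStock / setsPerTask) * avgTaskMinutes;
}
```

With 10 sets in stock, 3 sets per task, and 30-minute tasks, this model gives a runway of 90 minutes, since only three full tasks can be supplied.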
2. Bounded context
- Context name: Housekeeping
- Type: Core
- Owns: room-status flag (`clean`, `dirty`, `cleaning`, `inspected`, `ready`, `out_of_order`, `out_of_service`); task lifecycle for cleaning/inspection; checklists; lost & found; linen light stock; housekeeping-scoped projection of staff shifts.
- Does not own: physical room definition (`property-service`); reservation state (`reservation-service`); maintenance work orders (`maintenance-service`); staff identity or master shift schedule (`staff-service`); long-term `out_of_service` declaration (`property-service`; we only echo it).
- Conformist toward: `property-service` (room IDs, room types), `staff-service` (staff IDs, shifts, languages, skills).
- Customer–supplier: we are the supplier to `bff-backoffice-service` (board), `bff-tenant-booking-service` (mid-stay request ack), `notification-service` (alerts), `analytics-service` (stats), `search-aggregation-service` (room-readiness facet).
3. Aggregates owned
| Aggregate | Identity | Notes |
|---|---|---|
| `HousekeepingTask` | `hkt_<ulid>` | Root aggregate. Lifecycle, assignment, checklist binding, linen issuance, outcome. |
| `RoomStatus` | composite key `(tenant_id, property_id, room_id)` | Singleton per room. Last-flip audit. |
| `CleaningChecklist` | `chl_<ulid>` | Versioned template; immutable once published; bound at task assignment. |
| `Inspection` | `ins_<ulid>` | 0..1 per task; pass/fail with photo evidence. |
| `LinenInventory` | `lin_<ulid>` | One per linen line per property. |
| `LostAndFound` | `laf_<ulid>` | 0..N per room/reservation; lifecycle: recorded → matched |
| `StaffShiftAssignment` | composite ULID | Housekeeping projection of `staff-service` shift. |
| `RoomBlock` | composite ULID | We own cleaning/inspection/oos reasons; maintenance reason owned by `maintenance-service`. |
Detailed invariants in DOMAIN_MODEL.md.
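A minimal sketch of how the prefixed identities in the table could be shape-checked at a boundary. The prefixes come from the table; the helper names and the `/`-joined string form of the `RoomStatus` composite key are assumptions for illustration. The ULID pattern is the usual 26-character Crockford base32 alphabet and checks shape only, not timestamp overflow.

```typescript
// ULID: 26 chars of Crockford base32 (no I, L, O, U); case-insensitive per the ULID spec.
const ULID_RE = /^[0-9A-HJKMNP-TV-Z]{26}$/i;

// Prefixes as listed in the aggregates table.
const ID_PREFIXES = {
  HousekeepingTask: "hkt",
  CleaningChecklist: "chl",
  Inspection: "ins",
  LinenInventory: "lin",
  LostAndFound: "laf",
} as const;

type PrefixedAggregate = keyof typeof ID_PREFIXES;

/** True when value looks like "<prefix>_<ulid>" for the given aggregate. */
function isAggregateId(kind: PrefixedAggregate, value: string): boolean {
  const prefix = `${ID_PREFIXES[kind]}_`;
  return value.startsWith(prefix) && ULID_RE.test(value.slice(prefix.length));
}

// RoomStatus has no surrogate ID; it is keyed by (tenant_id, property_id, room_id).
interface RoomStatusKey {
  tenantId: string;
  propertyId: string;
  roomId: string;
}

/** Hypothetical string form of the composite key, e.g. for map lookups. */
function roomStatusKey(k: RoomStatusKey): string {
  return [k.tenantId, k.propertyId, k.roomId].join("/");
}
```

For example, `isAggregateId("HousekeepingTask", "hkt_01ARZ3NDEKTSV4RRFFQ69G5FAV")` is true, while the same ULID with an `ins_` prefix is rejected for that aggregate.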
4. Key responsibilities
- Auto-create turnover tasks on `melmastoon.reservation.checked_out.v1`.
- Manual + AI-assisted assignment, drag-and-drop on the desktop board.
- Task progress lifecycle with append-only audit.
- Room-status state machine with side branches for OOO/OOS.
- Mid-stay cleaning scheduling and ad-hoc requests.
- Inspection workflow (per-tenant opt-in).
- Lost & found capture, match, dispose.
- Linen light stock with low-watermark alerts.
- Shift staffing-gap detection.
- Checklist templates (versioned, tenant-scoped).
- Offline-first desktop board with conflict-aware sync.
- Pashto/Dari/Persian-first UI strings (i18n owned by the client; this service stores `locale_hint` on tasks for downstream notifications).
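The room-status state machine with OOO/OOS side branches, listed among the responsibilities above, can be pictured with a small transition table. The status names are the ones this service owns; the specific transitions below are assumptions chosen for illustration (for instance, that `clean` may go straight to `ready` when inspection is not enabled), and the real rules are those in `DOMAIN_MODEL.md`.

```typescript
type RoomStatus =
  | "clean" | "dirty" | "cleaning" | "inspected" | "ready"
  | "out_of_order" | "out_of_service";

// ASSUMED main-line transitions; not taken from the domain model.
const TRANSITIONS: Record<RoomStatus, readonly RoomStatus[]> = {
  dirty: ["cleaning"],
  cleaning: ["clean", "dirty"],
  clean: ["inspected", "ready", "dirty"], // "ready" directly when inspection is off
  inspected: ["ready", "dirty"],
  ready: ["dirty"],
  out_of_order: ["dirty"], // assumed: a room re-enters the cycle as dirty
  out_of_service: ["dirty"],
};

// Side branches are assumed reachable from any other status.
const SIDE_BRANCHES: readonly RoomStatus[] = ["out_of_order", "out_of_service"];

function canTransition(from: RoomStatus, to: RoomStatus): boolean {
  if (from === to) return false;
  if (SIDE_BRANCHES.includes(to)) return true;
  return TRANSITIONS[from].includes(to);
}

interface StatusFlip {
  from: RoomStatus;
  to: RoomStatus;
  actorId: string;
  at: Date;
}

/** Applies a flip and appends it to an append-only audit list, or throws. */
function flip(
  current: RoomStatus,
  to: RoomStatus,
  actorId: string,
  at: Date,
  audit: StatusFlip[],
): RoomStatus {
  if (!canTransition(current, to)) {
    throw new Error(`illegal room-status flip: ${current} -> ${to}`);
  }
  audit.push({ from: current, to, actorId, at });
  return to;
}
```

Keeping the table as data rather than branching logic makes it easy to review against the domain model and to test exhaustively.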
5. Where it sits in the topology
                               ┌────────────────────────────────────────────────┐
                               │ Pub/Sub                                        │
                               │   reservation.checked_out.v1                   │
                               │   reservation.early_checkout.v1                │
                               │   reservation.modification.requested.v1        │
                               │   maintenance.work_order.completed.v1          │
                               │   staff.shift.started.v1 / .ended.v1           │
                               │   ai_orchestrator.suggestion.housekeeping_*.v1 │
                               └─────────────┬──────────────────────────────────┘
                                             │ push (OIDC)
                                             ▼
    bff-backoffice ──REST──▶  ┌───────────────────────────────┐ ──REST──▶ ai-orchestrator
                              │ housekeeping-service          │ ──REST──▶ property-service (read room)
    bff-tenant-booking ─REST▶ │ (Cloud Run, 2..N replicas)    │ ──REST──▶ staff-service (read staff)
                              └────────────┬──────────────────┘
                                           │ outbox → Pub/Sub
                               ┌───────────┴───────────────────────────┐
                               ▼                                       ▼
              housekeeping.task.*.v1                  housekeeping.room.maintenance_required.v1
              housekeeping.room.status_changed.v1     housekeeping.linen.low_stock_alert.v1
              housekeeping.inspection.*.v1            housekeeping.shift.staffing_gap_detected.v1
              housekeeping.lost_item.*.v1             housekeeping.checklist.template_updated.v1
6. Hotel-specific design notes
- Small/medium properties (10–80 rooms, 2–6 housekeepers/shift) dominate our target market. The service does not assume per-staff smartphones; the desktop board is the source of truth.
- Routing optimization matters operationally: walking from floor 4 to floor 1 and back to floor 4 wastes 5–8 minutes per swap. The router groups tasks by floor/wing.
- Linen is sometimes issued per task (towels, sheets counted out at the linen closet). We track issued + returned counts; mismatches feed analytics but do not block completion.
- Pashto/Dari/Persian language is the working language for housekeeping in many target properties. We persist `locale_hint` so notifications and printed checklists go out in the right language.
- Configurable checklists per tenant (turnover vs deep clean vs maintenance check vs post-renovation). Checklists are versioned; once a task is assigned, the checklist version is frozen.
- Check-in priority queue. When the front desk has a guest waiting and the room is not yet `ready`, the desk can press "needs now": the task is bumped to `priority=urgent` and an `escalated` event is published if it doesn't move within 5 minutes.
- Offline operation is common. Some staff areas have no Wi-Fi. The desktop allows status flips and task completion against local SQLite; sync resolves conflicts per the policy table in `SYNC_CONTRACT.md`.
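The "needs now" rule in the check-in priority queue note above can be sketched as a pure check that a periodic worker might run. The `BoardTask` shape and field names are hypothetical, and "doesn't move" is interpreted here as "no task progress recorded since the bump", which is an assumption.

```typescript
type Priority = "normal" | "urgent";

// Hypothetical task shape for illustration only.
interface BoardTask {
  id: string;
  priority: Priority;
  urgentSince: Date | null; // when the desk pressed "needs now"
  lastProgressAt: Date;     // last recorded lifecycle change on the task
}

const ESCALATION_AFTER_MS = 5 * 60_000; // 5 minutes, per the design note

/** Bumps the task to urgent; pressing "needs now" again keeps the original bump time. */
function markNeedsNow(task: BoardTask, now: Date): BoardTask {
  return { ...task, priority: "urgent", urgentSince: task.urgentSince ?? now };
}

/** True when an urgent task has not progressed for 5 minutes since the bump. */
function shouldEscalate(task: BoardTask, now: Date): boolean {
  if (task.priority !== "urgent" || task.urgentSince === null) return false;
  const movedSinceBump = task.lastProgressAt.getTime() > task.urgentSince.getTime();
  const waitedMs = now.getTime() - task.urgentSince.getTime();
  return !movedSinceBump && waitedMs >= ESCALATION_AFTER_MS;
}
```

A caller would publish the `escalated` event through the outbox when `shouldEscalate` returns true, and would also need to record that the escalation was sent so it is not repeated on the next tick.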
7. Storage at a glance
- Cloud SQL Postgres (regional, HA): shared schema with `tenant_id` RLS on every table.
- Monthly partitioning on `housekeeping_tasks` (by `created_at`); partition pruning enforced on hot reads.
- Outbox in the same Postgres instance (ACID with the writes); a sidecar relay drains to Pub/Sub.
- Inbox for idempotent consumption of upstream events (dedupe key `(topic, message_id)`).
- No long-term object storage owned here: inspection photos are uploaded directly to the central GCS via signed URLs minted by `media-service`.
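To illustrate the inbox idea from the list above, here is an in-memory sketch of idempotent consumption keyed on `(topic, message_id)`. The class and function names are hypothetical, and a `Set` stands in for the Postgres inbox table; in the real service the claim and the handler's writes would presumably share one database transaction.

```typescript
interface PushedEvent {
  topic: string;
  messageId: string;
  payload: unknown;
}

/** In-memory stand-in for the inbox table (assumption: unique on topic + message_id). */
class InMemoryInbox {
  private readonly seen = new Set<string>();

  private static key(topic: string, messageId: string): string {
    return JSON.stringify([topic, messageId]); // unambiguous composite key
  }

  /** Returns true only the first time a (topic, message_id) pair is claimed. */
  claim(topic: string, messageId: string): boolean {
    const k = InMemoryInbox.key(topic, messageId);
    if (this.seen.has(k)) return false;
    this.seen.add(k);
    return true;
  }

  /** Undoes a claim, mimicking a transaction rollback. */
  release(topic: string, messageId: string): void {
    this.seen.delete(InMemoryInbox.key(topic, messageId));
  }
}

/** Runs the handler at most once per (topic, message_id); redeliveries are no-ops. */
function consumeOnce(
  inbox: InMemoryInbox,
  event: PushedEvent,
  handler: (e: PushedEvent) => void,
): "processed" | "duplicate" {
  if (!inbox.claim(event.topic, event.messageId)) return "duplicate";
  try {
    handler(event);
    return "processed";
  } catch (err) {
    inbox.release(event.topic, event.messageId); // allow a later retry
    throw err;
  }
}
```

Because Pub/Sub delivery is at-least-once, a redelivered `reservation.checked_out.v1` message would return `"duplicate"` here instead of creating a second turnover task.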
8. APIs
REST under /api/v1/housekeeping/*. Versioning, errors (RFC 7807 with MELMASTOON.HOUSEKEEPING.* codes), pagination, idempotency — see API_CONTRACTS.md.
Internal event-handler endpoints under /internal/events/*, authenticated by verifying the OIDC token that Pub/Sub attaches to each push request.
Sync endpoints under /sync/v1/*, callable only from authenticated Electron desktop sessions; details in SYNC_CONTRACT.md.
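As a rough sketch of the error convention mentioned above, an RFC 7807 problem body could carry the service code as an extension member. The `type`, `title`, `status`, `detail`, and `instance` members are defined by RFC 7807; the extension member name `code`, the helper name, and the example reason `TASK_NOT_FOUND` are assumptions, and the actual catalogue is in `API_CONTRACTS.md`.

```typescript
// Standard RFC 7807 members plus one assumed extension member ("code").
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  code: string;
}

const PROBLEM_CONTENT_TYPE = "application/problem+json"; // media type from RFC 7807
const CODE_NAMESPACE = "MELMASTOON.HOUSEKEEPING";

/** Builds a problem body whose code lives under MELMASTOON.HOUSEKEEPING.*. */
function housekeepingProblem(
  status: number,
  reason: string, // hypothetical suffix, e.g. "TASK_NOT_FOUND"
  title: string,
  detail?: string,
): ProblemDetails {
  if (!/^[A-Z][A-Z0-9_]*$/.test(reason)) {
    throw new Error(`invalid error reason: ${reason}`);
  }
  const body: ProblemDetails = {
    type: "about:blank", // RFC 7807 default when no specific type URI is defined
    title,
    status,
    code: `${CODE_NAMESPACE}.${reason}`,
  };
  if (detail !== undefined) body.detail = detail;
  return body;
}
```

A handler would serialize the returned object as JSON and send it with the `PROBLEM_CONTENT_TYPE` header and the matching HTTP status.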
9. Events
20 published topics, 9 consumed topics. Full schemas, JSON examples, retention, and DLQ policy in EVENT_SCHEMAS.md.
10. Non-functional requirements
| NFR | Target | Verified by |
|---|---|---|
| Turnover task auto-create latency | < 2 s p95 (event in → task row + outbox row committed) | turnover-saga.spec.ts, k6 weekly |
| Board read p99 | < 250 ms for 200 active rooms | k6 weekly + Datadog SLO |
| Board write p99 (drag-drop reassign) | < 400 ms incl. outbox commit | k6 weekly |
| Availability | 99.9% monthly | SLO budget in slo.yaml |
| Tenant isolation | RLS verified on every table | tenant-isolation.spec.ts |
| Cold-start (Cloud Run) | < 1.5 s (min instances = 2) | deploy gate |
| Replicas | 2..N (autoscale on CPU + concurrency) | cloudrun.yaml |
11. Operational topology
- Hot path service: Cloud Run, min=2, max=20 (autoscale on concurrency=80), region `asia-south1` primary, `asia-southeast1` warm-standby (DR).
- Shift-staffing-gap worker: Cloud Run Job, cron every 60 s.
- Lost-and-found auto-dispose worker: Cloud Run Job, cron daily at 03:00 tenant TZ (per-tenant fan-out).
- Outbox relay: sidecar in the hot-path container (lightweight goroutine-style poller in Node).
Detail in DEPLOYMENT_TOPOLOGY.md.
12. Where to go next
| You want to know… | Read |
|---|---|
| Domain types and invariants | DOMAIN_MODEL.md |
| Use-cases and orchestration | APPLICATION_LOGIC.md |
| REST contracts | API_CONTRACTS.md |
| Event schemas | EVENT_SCHEMAS.md |
| Postgres DDL | DATA_MODEL.md |
| Desktop sync | SYNC_CONTRACT.md |
| AI routing port | AI_INTEGRATION.md |
| AuthZ + RLS | SECURITY_MODEL.md |
| Logs / metrics / traces | OBSERVABILITY.md |
| Tests | TESTING_STRATEGY.md |
| Cloud Run topology | DEPLOYMENT_TOPOLOGY.md |
| Failure scenarios | FAILURE_MODES.md |
| Local dev | LOCAL_DEV_SETUP.md |
| Readiness checklist | SERVICE_READINESS.md |
| Open risks | SERVICE_RISK_REGISTER.md |
| Migration policy | MIGRATION_PLAN.md |
| Public summary | ../../docs/03-microservices/housekeeping-service.md |