Skip to main content

maintenance-service · APPLICATION_LOGIC

Application layer = ports (interfaces), use cases (commands), workers (preventive cron, SLA scanner, reminder), inbox handlers (Pub/Sub push), and mappers. Composed in bootstrap/ with NestJS DI; the layer itself stays framework-agnostic.

1. Ports (interfaces the domain depends on)

// repositories
export interface WorkOrderRepository {
findById(id: WorkOrderId, tenantId: TenantId): Promise<WorkOrder | null>;
findOpenByAssetCategory(assetId: AssetId, category: CategoryCode, tenantId: TenantId): Promise<WorkOrder | null>;
list(filter: WorkOrderListFilter, tenantId: TenantId): Promise<Page<WorkOrder>>;
save(wo: WorkOrder): Promise<void>; // OCC-aware; throws OccConflict
appendOutbox(events: DomainEvent[], correlationId: ULID): Promise<void>; // same tx as save
markCausedRoomBlock(id: WorkOrderId): Promise<void>;
scanSlaBreachCandidates(now: string, batch: number): AsyncIterable<WorkOrder>;
}

export interface AssetRepository {
findById(id: AssetId, tenantId: TenantId): Promise<Asset | null>;
findByExternalRef(ref: string, tenantId: TenantId): Promise<Asset | null>;
upsertOnHealthAlert(input: AssetUpsertInput): Promise<Asset>;
save(asset: Asset): Promise<void>;
list(filter: AssetListFilter, tenantId: TenantId): Promise<Page<Asset>>;
}

export interface PartRepository {
findById(id: PartId, tenantId: TenantId): Promise<Part | null>;
decrementOnHand(id: PartId, qty: number, tenantId: TenantId): Promise<Part>; // throws PART_OUT_OF_STOCK
save(p: Part): Promise<void>;
}

export interface VendorRepository {
findById(id: VendorId, tenantId: TenantId): Promise<Vendor | null>;
list(filter: VendorListFilter, tenantId: TenantId): Promise<Page<Vendor>>;
save(v: Vendor): Promise<void>;
}

export interface PreventiveScheduleRepository {
findDueBefore(at: string, batch: number, tenantId?: TenantId): AsyncIterable<PreventiveSchedule>;
findById(id: PreventiveScheduleId, tenantId: TenantId): Promise<PreventiveSchedule | null>;
save(s: PreventiveSchedule): Promise<void>;
// Idempotency table for "fired this hour-bucket already"
recordFire(id: PreventiveScheduleId, dueAtBucketHour: string): Promise<{ inserted: boolean }>;
}

export interface InboxRepository {
alreadyProcessed(messageId: string): Promise<boolean>;
markProcessed(messageId: string, eventType: string, correlationId: ULID): Promise<void>;
}

// platform clients
export interface PropertyClient {
// we only ASK; property-service decides via event
publishRoomBlockRequest(input: RoomBlockRequest, correlationId: ULID): Promise<void>; // wraps outbox publish
publishRoomReleaseRequest(input: RoomReleaseRequest, correlationId: ULID): Promise<void>;
}

export interface ReservationClient {
findActiveOverlapping(roomId: RoomId, window: { from: string; to: string }, tenantId: TenantId)
: Promise<readonly ReservationProjection[]>;
}

export interface NotificationClient {
sendVendorAssignment(vendor: Vendor, wo: WorkOrder, correlationId: ULID): Promise<void>;
sendEscalation(target: EscalationTarget, wo: WorkOrder, reason: string, correlationId: ULID): Promise<void>;
sendPreventiveDueDigest(staff: UserId, drafts: WorkOrderId[], correlationId: ULID): Promise<void>;
}

export interface AIClient {
suggestSeverity(input: { title: string; description: string; assetClass?: AssetClass }): Promise<{ severity: WorkOrderSeverity; confidence: number; provenance: AIProvenance }>;
classifyCategory(input: { title: string; description: string }): Promise<{ category: CategoryCode; confidence: number; provenance: AIProvenance }>;
forecastAssetHealth(asset: Asset, recentEvents: AssetEventSignal[]): Promise<{ healthIndex: number; provenance: AIProvenance }>;
}

export interface IamClient {
hasRole(userId: UserId, tenantId: TenantId, roles: readonly Role[]): Promise<boolean>;
}

export interface IdGen { workOrder(): WorkOrderId; task(): MaintenanceTaskId; asset(): AssetId; usage(): PartUsageId; vendor(): VendorId; schedule(): PreventiveScheduleId; }
export interface Clock { nowIso(): string; }
export interface Logger { info(msg: string, fields?: object): void; warn(msg: string, fields?: object): void; error(msg: string, fields?: object): void; }

All adapters live in infrastructure/. Domain functions take only value inputs, never these ports. The use cases hold the orchestration.

2. Use cases (commands)

2.1 CreateWorkOrderUseCase

Inputs: tenantId, propertyId, roomId?, assetId?, category (or AI-classify), severity (or AI-suggest), title, description, source, originRef?, actor.

Steps:

  1. Authorize: actor must have role staff (or be system for auto-paths).
  2. If category missing → call AIClient.classifyCategory; if confidence ≥ 0.8 accept, else require human.
  3. If severity missing → call AIClient.suggestSeverity; record AIProvenance (humanAccepted = false).
  4. Look up Asset/Room context to validate target.
  5. Check invariant #4 (one open WO per (asset, category)): if hit, append a comment to existing one and return its ID.
  6. Compute slaTimer from tenant settings × (category, severity).
  7. domain.createWorkOrder(...) → returns new WorkOrder + WorkOrderCreated event.
  8. domain.evaluateAutoOOO(...) — if room-blocking applies, append WorkOrderRoomBlocked event and set causedRoomBlock = true.
  9. If causedRoomBlock and roomId set → call ReservationClient.findActiveOverlapping; if any → append WorkOrderRelocationRequired.
  10. Persist via WorkOrderRepository.save + appendOutbox in one transaction.
  11. Return { id, status, version, severity, category, aiProvenance? }.

Cross-cutting: entire function wrapped in OTel span maintenance.create_work_order; structured log includes tenant.id, work_order.id, source, severity, causedRoomBlock.

2.2 AssignWorkOrderUseCase

Inputs: id, assignee ({ kind: 'staff', userId } | { kind: 'vendor', vendorId }), actor, version.

Steps:

  1. Load WO with OCC version check.
  2. If vendor assignee:
    • Load vendor; reject if !active.
    • If channelPreference.primary = 'call_only' and tenant policy requires automated notify, fail with MELMASTOON.MAINTENANCE.VENDOR_CHANNEL_MISMATCH unless caller passed acknowledgeManualNotify=true.
  3. domain.assignWorkOrder(...).
  4. Persist + outbox WorkOrderAssigned (and VendorAssigned if vendor).
  5. Best-effort dispatch: enqueue notification via NotificationClient.sendVendorAssignment; failure is logged but not blocking (the outbox event still fires; notification-service has its own retries).

2.3 StartWorkOrderUseCase / BlockWorkOrderUseCase / ResumeWorkOrderUseCase

Standard guarded transitions; each persists via OCC and appends one event. Block requires non-empty reason ∈ { 'awaiting_part', 'awaiting_vendor', 'awaiting_approval', 'guest_in_room', 'other' } and optional eta ISO.

2.4 ResolveWorkOrderUseCase

Inputs: id, resolutionNote, costLines[], partsUsed[], actor, version.

Steps:

  1. Load WO; verify status = in_progress.
  2. For each partsUsed[i]:
    • PartRepository.decrementOnHand(partId, qty) — atomic; if PART_OUT_OF_STOCK is raised, abort (caller must update parts or change qty).
    • Compute unitCost, totalCost from Part.lastUnitCost.
  3. Validate costLines.currency ≡ tenant base currency (#5).
  4. domain.resolveWorkOrder(...) → returns updated WO and WorkOrderResolved event with costRollup.
  5. If causedRoomBlock is true, do not auto-release — wait for verify. (Verification is the GM's promise to staff; we do not silently un-OOO a room.)
  6. Persist + outbox.

2.5 VerifyWorkOrderUseCase

Inputs: id, actor, note?.

Steps:

  1. Authorize: IamClient.hasRole(actor, tenantId, ['gm','owner']). Reject with MELMASTOON.IAM.AUTHZ_DENIED otherwise.
  2. Load WO; verify status = resolved.
  3. domain.verifyWorkOrder(...).
  4. If causedRoomBlock true → publish PropertyClient.publishRoomReleaseRequest (outbox). Property-service flips back; housekeeping-service re-queues clean.
  5. If WO originated from preventive_schedule → load schedule, call nextPreventiveDueAt, persist, publish PreventiveCompleted.
  6. Outbox WorkOrderVerified.

2.6 CancelWorkOrderUseCase

Requires reason (free text ≤ 280). Allowed from any non-terminal. If causedRoomBlock, publish room-release.

2.7 RecordVendorAcknowledgementUseCase

Inputs: id, channel ∈ {phone, whatsapp, sms, in_person}, note, actor.

Allows staff to log "vendor confirmed they'll come at 5pm". This does not transition status; it just adds a structured field on the WO that's surfaced in the BFF and in vendor.assigned.v1 augmentation events. Useful when the vendor never replies via a digital channel.

2.8 RecordVendorInvoiceUseCase

Inputs: id, amount, invoiceNumber, issuedAt, dueAt, fileRef?, actor.

Persists invoice on WO; publishes VendorInvoiceRecorded. billing-service consumes it and creates a payable in its own ledger; on success, it publishes billing.vendor_invoice.posted.v1 which we consume to flip postedToFolio = true.

2.9 EscalateWorkOrderUseCase

Manual or worker-driven. Escalation chain comes from tenant settings (e.g., default → supervisor → gm → owner). Each hop emits WorkOrderEscalated and dispatches notification.

2.10 CreatePreventiveScheduleUseCase / UpdateScheduleUseCase / TriggerScheduleNowUseCase / CompleteScheduleUseCase

CRUD + manual fire. TriggerScheduleNow calls the scheduler worker's materialiseDraft function inline.

2.11 RegisterAssetUseCase / UpdateAssetUseCase / RecordAssetHealthUpdateUseCase

Standard CRUD plus a health-data-point endpoint. Health updates may be human-entered or AI-derived (provenance tracked).

2.12 CreateVendorUseCase / UpdateVendorUseCase

CRUD. Validation: at least one of phoneE164 / email / whatsappE164 present.

2.13 RecordPartUsageStandaloneUseCase

For staff who want to log a part used without a WO context (e.g., scrap during install). Publishes PartUsageRecorded with workOrderId = null.

3. Workers

3.1 PreventiveSchedulerWorker (Cloud Scheduler → Cloud Run, every 60 s)

Loop:

  1. PreventiveScheduleRepository.findDueBefore(now, batch=200) (per tenant for fairness).
  2. For each schedule:
    • recordFire(scheduleId, hourBucket(dueAtIso)). If inserted=false, skip (idempotency #10).
    • materialiseDraft(schedule):
      • Build CreateWorkOrderInput with source='preventive_schedule', originRef=scheduleId, defaultSeverity, defaultAssignee?.
      • Invoke CreateWorkOrderUseCase (no AI calls — schedule already has category).
    • Publish PreventiveDue (always) and WorkOrderCreated (from inner use case).
  3. Update nextDueAt projected from cadence.

3.2 SlaBreachScannerWorker (every 60 s)

for await (const wo of repo.scanSlaBreachCandidates(now, batch=500)) {
const next = domain.evaluateSlaBreach(wo, now);
if (next.events.includes('WorkOrderSlaBreached')) {
await repo.save(next.wo);
await repo.appendOutbox([slaBreachedEvent], correlationId);
await maybeAutoEscalate(next.wo); // policy-driven
}
}

3.3 VendorReminderWorker (every 5 min)

For WOs with assignee.kind='vendor', status assigned, and no vendorAcknowledgement after tenantSettings.vendorReminderMinutes (default 30):

  1. Re-send via the vendor's preferred channel.
  2. After N reminders without ack → auto-escalate.

3.4 AssetHealthForecasterWorker (hourly)

Selects assets with recent signals (run-hours updates, lock battery telemetry, repeat WOs). Calls AIClient.forecastAssetHealth; updates Asset.healthIndex; emits AssetHealthChanged if delta ≥ 5 points.

4. Inbox handlers (Pub/Sub push)

SubscriptionSource eventHandlerIdempotency
mnt.in.housekeeping.maintenance_requiredmelmastoon.housekeeping.room.maintenance_required.v1Auto-create WO with category from flag tag, source='housekeeping_flag', originRef=flagIdmessageId in inbox
mnt.in.lock.health_alertmelmastoon.lock_integration.device.health_alert.v1Upsert Asset for the device; auto-create WO category=lock; severity from battery%/onlineinbox + (deviceId, alertCode, dayBucket) natural dedupe
mnt.in.property.room_taken_ooomelmastoon.property.room.taken_out_of_order.v1Find any active WOs on the room and link OOO source-of-truth chaininbox
mnt.in.property.room_releasedmelmastoon.property.room.returned_to_service.v1Sanity check: warn if any WO still expects the room blockedinbox
mnt.in.staff.shift_startedmelmastoon.staff.shift.started.v1Refresh in-memory technician roster cachenon-side-effecting; inbox optional
mnt.in.tenant.settings_changedmelmastoon.tenant.settings.changed.v1Refresh SLA targets & escalation rules in cacheinbox
mnt.in.reservation.checked_inmelmastoon.reservation.checked_in.v1Re-evaluate active WOs on the room for in-stay impact (escalate if severity high)inbox
mnt.in.billing.vendor_invoice_postedmelmastoon.billing.vendor_invoice.posted.v1Set WorkOrder.vendorInvoice.postedToFolio = trueinbox

All handlers must:

  • read the Pub/Sub message.id and check InboxRepository.alreadyProcessed;
  • run their write inside one transaction with markProcessed;
  • never throw on duplicate (return 200 OK so Pub/Sub stops retrying).

5. Sagas

maintenance-service is not a saga orchestrator; the room-OOO and relocation flows are choreographies:

  • Room-OOO choreography: we publish work_order.room_blocked.v1; property-service decides; if accepted, it publishes room.taken_out_of_order.v1, which we consume back to confirm. If property-service rejects (room already in conflicting state), it publishes room.block_rejected.v1 which we consume to set causedRoomBlock=false and notify the staff.
  • Relocation choreography: we publish work_order.relocation_required.v1; reservation-service runs its own room_change saga and on completion publishes reservation.modified.v1 with kind=room_change. We do not block on it.
  • Vendor invoice → folio choreography: we publish vendor.invoice_recorded.v1; billing-service posts to ledger and publishes billing.vendor_invoice.posted.v1; we mark postedToFolio=true.

6. Concurrency and OCC

Every command that mutates a WorkOrder requires the caller's version. The repository implementation uses UPDATE … WHERE id = ? AND version = ? and throws OccConflict mapped to MELMASTOON.SYS.OCC_CONFLICT. The BFF retries up to 2× with re-fetch.

Asset and Schedule mutations are also OCC-protected.

7. Authorization checks (matrix)

Use caseRequired role(s)
CreateWorkOrderUseCasestaff (any role on the property)
AssignWorkOrderUseCasestaff_supervisor or gm or owner
Start/Block/Resume/Resolveassignee themselves or staff_supervisor+
VerifyWorkOrderUseCasegm or owner
CancelWorkOrderUseCasestaff_supervisor+
EscalateWorkOrderUseCasestaff_supervisor+ or system
RecordVendorInvoiceUseCasestaff_supervisor+ or accounting
Create/UpdatePreventiveScheduleUseCasegm or owner
Create/UpdateVendorUseCasegm or owner or accounting
Register/UpdateAssetUseCasestaff_supervisor+

All checks delegate to IamClient.hasRole which is multi-tenant-aware.

8. Mappers

  • domain → API DTO in mappers/api.ts
  • domain → event payload in mappers/events.ts (envelope filled by outbox relay)
  • db row → domain in mappers/persistence.ts

Mappers are pure and tested with golden snapshots.

9. Anti-patterns (will not pass review)

  • ❌ Calling property-service REST to flip room status (must go through events).
  • ❌ Reaching into reservation-service DB to find overlapping reservations (must use the projection client, which itself is event-fed).
  • ❌ Sending notifications synchronously inside the create-WO transaction (use the outbox event consumed by notification-service).
  • ❌ Auto-OOO without a severity gate.
  • ❌ Storing decimal money — must be bigint micro-units.
  • ❌ Throwing exceptions for business-rule violations (return Result).
  • ❌ Reading the system clock inside domain functions — pass now: string in.