maintenance-service · AI_INTEGRATION

All AI calls go through ai-orchestrator-service. This service never imports a model SDK directly. Capabilities are declarative (registered with the orchestrator) and audited. Provenance is mandatory.

1. Capability registry (this service)

| Capability key | Purpose | Class | Latency budget | HITL |
| --- | --- | --- | --- | --- |
| maintenance.severity-suggestion.v1 | Suggest WorkOrderSeverity from title + description (+ asset class) | small LLM, tenant-tunable | 800 ms p95 | always (staff confirms) |
| maintenance.category-classify.v1 | Classify CategoryCode from title + description | classifier (small LLM with constrained output) | 600 ms p95 | conditional (auto-accept if confidence ≥ 0.85; else HITL) |
| maintenance.root-cause-hint.v1 | Produce a 1-3 sentence root-cause hint for a resolved WO from notes + history | summariser | 1.5 s p95 | none (informational) |
| maintenance.asset-health-forecast.v1 | Produce next healthIndex (0..100) for an asset given recent signals | small regression model + LLM rationale | 2 s p95 | none (read-only signal) |
| maintenance.vendor-message-draft.v1 | Draft a WhatsApp/SMS message for a vendor based on WO context + tenant tone | LLM | 1 s p95 | always (staff edits then sends) |
| maintenance.preventive-due-digest.v1 | Compose a daily digest message for staff with what's due today | LLM | 2 s p95 | none (informational) |

All capability keys are ULID-versioned in the orchestrator's registry; bumping the model bumps the version.

2. AIProvenance envelope

Every AI-touched record carries provenance (also published in events when relevant):

export interface AIProvenance {
  readonly capability: string;      // e.g. "maintenance.severity-suggestion.v1"
  readonly model: string;           // e.g. "vertex/text-bison@002"
  readonly score: number;           // 0..1, capability-defined semantics
  readonly redactedInput: boolean;  // true if PII redaction was applied
  readonly humanAccepted: boolean;
  readonly correlationId: ULID;
  readonly producedAt: string;      // ISO-8601 UTC
  readonly costMicroUsd?: number;   // orchestrator returns this
}

Stored on WorkOrder.aiProvenance (jsonb array) and on Asset.healthIndex updates (single struct).
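
Before an envelope is stored, a cheap sanity check on it may be useful. The following is a minimal sketch, not the service's actual validation: `validateProvenance` is a hypothetical helper, `ULID` is stubbed as a string alias, and the ".vN" suffix rule is inferred from the keys in the registry table.

```typescript
// Local stand-in for the ULID type used by the interface (assumption: a string alias).
type ULID = string;

interface AIProvenance {
  readonly capability: string;
  readonly model: string;
  readonly score: number;
  readonly redactedInput: boolean;
  readonly humanAccepted: boolean;
  readonly correlationId: ULID;
  readonly producedAt: string;
  readonly costMicroUsd?: number;
}

// Hypothetical helper: returns a list of problems; an empty list means the envelope passed.
function validateProvenance(p: AIProvenance): string[] {
  const problems: string[] = [];
  // Assumption: capability keys end in ".v<number>", as in the registry table.
  if (!/\.v\d+$/.test(p.capability)) problems.push("capability key has no version suffix");
  // Written as a negated range check so that NaN is rejected too.
  if (!(p.score >= 0 && p.score <= 1)) problems.push("score is outside 0..1");
  // Assumption: "ISO-8601 UTC" means a parseable timestamp ending in "Z".
  if (isNaN(Date.parse(p.producedAt)) || !/Z$/.test(p.producedAt)) {
    problems.push("producedAt is not an ISO-8601 UTC timestamp");
  }
  if (p.costMicroUsd !== undefined && p.costMicroUsd < 0) problems.push("costMicroUsd is negative");
  return problems;
}

const sampleProvenance: AIProvenance = {
  capability: "maintenance.severity-suggestion.v1",
  model: "vertex/text-bison@002",
  score: 0.74,
  redactedInput: true,
  humanAccepted: true,
  correlationId: "01HXY...", // elided in this document; kept as-is
  producedAt: "2026-04-22T14:03:21Z",
  costMicroUsd: 230,
};
```

Returning a list of problems rather than throwing lets the caller decide whether to log and drop the envelope or to reject the write.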

3. Capability flows

3.1 maintenance.severity-suggestion.v1

Trigger: CreateWorkOrderUseCase when severity not provided OR when aiAssist.severitySuggest = true.

Input:

interface SeveritySuggestionInput {
  tenantId: TenantId;
  title: string;
  description: string;
  assetClass?: AssetClass;
  category?: CategoryCode;
  hasGuestImpact: boolean; // computed: source = guest_complaint OR overlapping reservation
}

Output:

interface SeveritySuggestionOutput {
  severity: WorkOrderSeverity;
  confidence: number; // 0..1
  rationale: string;  // 1-2 sentences
  provenance: AIProvenance;
}

Use: Pre-fill the BFF form. The suggestion is never auto-accepted; staff must click confirm. The severity staff confirm (with humanAccepted = true) is persisted on the WO.

Privacy: No PII (no guest names) is sent. Description is redacted by the orchestrator's PII filter before model call (redactedInput = true).
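
The trigger condition and the computed hasGuestImpact flag can be sketched as two small predicates. This is an illustration only: `CreateWorkOrderCommand` and its `overlapsReservation` field are hypothetical shapes, since the real use-case input is not shown in this document.

```typescript
// Hypothetical, trimmed-down command shape (assumption, not the real use-case input).
interface CreateWorkOrderCommand {
  severity?: string;
  source: string;               // e.g. "guest_complaint"
  overlapsReservation: boolean; // assumed to be resolved by the caller beforehand
  aiAssist?: { severitySuggest?: boolean };
}

// Trigger rule: call the capability when severity is missing OR the assist flag is set.
function shouldSuggestSeverity(cmd: CreateWorkOrderCommand): boolean {
  const assistRequested = cmd.aiAssist !== undefined && cmd.aiAssist.severitySuggest === true;
  return cmd.severity === undefined || assistRequested;
}

// hasGuestImpact: source = guest_complaint OR overlapping reservation.
function computeHasGuestImpact(cmd: CreateWorkOrderCommand): boolean {
  return cmd.source === "guest_complaint" || cmd.overlapsReservation;
}
```

Note that a staff-provided severity does not suppress the call when aiAssist.severitySuggest is true; the suggestion still only pre-fills the form.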

3.2 maintenance.category-classify.v1

Trigger: auto-create paths from housekeeping.room.maintenance_required.v1 when the upstream tag is ambiguous, and from CreateWorkOrderUseCase when category not provided.

Input:

interface CategoryClassifyInput {
  tenantId: TenantId;
  title: string;
  description: string;
  housekeepingTags?: readonly string[];
}

Output:

interface CategoryClassifyOutput {
  category: CategoryCode;
  confidence: number;
  alternatives: ReadonlyArray<{ category: CategoryCode; confidence: number }>;
  provenance: AIProvenance;
}

Decision rule: if confidence ≥ 0.85, auto-accept and persist with humanAccepted = false. Else mark the WO category = 'other' with the suggestion stored in provenance, and the BFF prompts staff to choose. Either way, downstream re-classification is allowed.
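
The decision rule can be written as a pure function. The sketch below assumes `CategoryCode` is a string and uses a hypothetical `CategoryDecision` shape to carry the result; the real persistence model may differ.

```typescript
type CategoryCode = string; // stand-in for the domain type (assumption)

interface CategoryClassifyResult {
  category: CategoryCode;
  confidence: number;
}

// Hypothetical result shape for illustration.
interface CategoryDecision {
  category: CategoryCode;           // what gets written to the WO
  humanAccepted: boolean;           // false in both branches: no human has confirmed yet
  needsStaffChoice: boolean;        // true when the BFF should prompt staff
  suggestedCategory?: CategoryCode; // kept alongside provenance when not auto-accepted
}

const AUTO_ACCEPT_THRESHOLD = 0.85;

function decideCategory(out: CategoryClassifyResult): CategoryDecision {
  if (out.confidence >= AUTO_ACCEPT_THRESHOLD) {
    // Auto-accept: persist the model's category without human confirmation.
    return { category: out.category, humanAccepted: false, needsStaffChoice: false };
  }
  // Below threshold: park the WO under 'other' and keep the suggestion for the prompt.
  return {
    category: "other",
    humanAccepted: false,
    needsStaffChoice: true,
    suggestedCategory: out.category,
  };
}
```

The comparison is inclusive, so a confidence of exactly 0.85 is auto-accepted.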

3.3 maintenance.root-cause-hint.v1

Trigger: WorkOrderResolved event (post-commit, async via outbox-driven worker). Reads the WO + last 5 resolved WOs on the same assetId to find a likely pattern.

Output: stored on WorkOrder.aiProvenance and surfaced as a tooltip in the BFF "history" tab. Never blocks anything.

3.4 maintenance.asset-health-forecast.v1

Trigger: AssetHealthForecasterWorker (hourly). Selects assets with: (a) recent run-hours updates, (b) recent lock health alerts, (c) ≥ 2 WOs in last 30 days, or (d) explicit force flag from BFF.

Input: asset snapshot + last 30 days of WorkOrder events on it + last 30 days of health_changed.v1 events.

Output: new healthIndex (0..100). If |delta| ≥ 5 we publish AssetHealthChanged with source = ai_forecaster and the provenance.
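
A rough sketch of the selection criteria and the publish threshold follows. The field names, the event payload shape, and the decision to discard out-of-range forecasts are assumptions for illustration; what counts as "recent" is left to the caller, and provenance is omitted for brevity.

```typescript
// Hypothetical per-asset signals; how "recent" is defined is not specified here.
interface ForecastSignals {
  recentRunHoursUpdate: boolean;  // criterion (a)
  recentLockHealthAlert: boolean; // criterion (b)
  workOrdersLast30Days: number;   // criterion (c)
  forceFlag: boolean;             // criterion (d)
}

// An asset is selected when any one of the four criteria holds.
function isForecastCandidate(s: ForecastSignals): boolean {
  return s.recentRunHoursUpdate || s.recentLockHealthAlert || s.workOrdersLast30Days >= 2 || s.forceFlag;
}

// Hypothetical event payload (provenance left out of this sketch).
interface AssetHealthChanged {
  assetId: string;
  previousHealthIndex: number;
  nextHealthIndex: number;
  source: "ai_forecaster";
}

const PUBLISH_DELTA = 5;

// Returns the event to publish, or null when nothing should be published.
function healthChangeToPublish(assetId: string, previous: number, next: number): AssetHealthChanged | null {
  // Assumption: a forecast outside 0..100 is treated as invalid and dropped.
  if (next < 0 || next > 100) return null;
  if (Math.abs(next - previous) < PUBLISH_DELTA) return null;
  return { assetId, previousHealthIndex: previous, nextHealthIndex: next, source: "ai_forecaster" };
}
```

Because the rule uses the absolute delta, both improvements and degradations of five points or more produce an event.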

3.5 maintenance.vendor-message-draft.v1

Trigger: BFF "Notify vendor" action.

Input: WO summary + vendor display name + tenant tone preset (e.g., "formal-dari", "informal-pashto", "neutral-english").

Output: draft message, ≤ 480 chars (SMS-safe). Staff edits and sends; orchestrator can also produce a Pashto/Dari translation pair for review.
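
A length guard on the returned draft might look like the sketch below. `checkDraftLength` is a hypothetical helper, and counting characters with `String.length` (UTF-16 code units) is an assumption; the document does not say how the 480-character limit is measured.

```typescript
const SMS_SAFE_MAX_CHARS = 480;

interface DraftLengthCheck {
  ok: boolean;
  overBy: number; // 0 when the draft fits
}

// Hypothetical helper: reports whether a draft fits the SMS-safe limit.
function checkDraftLength(draft: string): DraftLengthCheck {
  // Assumption: length is measured in UTF-16 code units.
  const excess = draft.length - SMS_SAFE_MAX_CHARS;
  return excess > 0 ? { ok: false, overBy: excess } : { ok: true, overBy: 0 };
}
```

Running the same check again after staff edits, just before sending, would catch drafts that grew past the limit during editing.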

3.6 maintenance.preventive-due-digest.v1

Trigger: Cron 06:00 tenant-local. Pulls all PreventiveSchedule due in next 24 hours; LLM composes a digest grouped by category.

Output: WhatsApp/SMS-friendly message sent to GM and on-duty supervisor.
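
The grouping step that feeds the LLM can be sketched without any model call. The `DueItem` shape is hypothetical, and excluding items that are already overdue is an assumption made for this example.

```typescript
// Hypothetical, flattened view of a PreventiveSchedule entry.
interface DueItem {
  title: string;
  category: string;
  dueAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Groups the titles of items due within the next 24 hours by category.
function groupDueByCategory(items: DueItem[], now: Date): { [category: string]: string[] } {
  const start = now.getTime();
  const horizon = start + DAY_MS;
  const groups: { [category: string]: string[] } = {};
  for (const item of items) {
    const due = item.dueAt.getTime();
    // Assumption: overdue items are handled elsewhere and are skipped here.
    if (due < start || due > horizon) continue;
    groups[item.category] = (groups[item.category] || []).concat(item.title);
  }
  return groups;
}
```

Passing pre-grouped data to the model keeps the prompt small and makes the digest content auditable independently of the generated wording.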

4. HITL (Human-in-the-loop) policy

| Capability | HITL? | Why |
| --- | --- | --- |
| severity-suggestion | always | Severity drives auto-OOO and re-accommodation; too costly to auto-accept |
| category-classify | conditional ≥ 0.85 | Mis-classification is recoverable; automation gain large |
| root-cause-hint | none | Read-only signal, no operational consequence |
| asset-health-forecast | none | Just a soft index; deterministic policies don't trigger off it alone |
| vendor-message-draft | always | Outbound communication; legal & reputational risk |
| preventive-due-digest | none | Informational broadcast |

When HITL is required, the orchestrator's response includes a humanAcceptanceToken, which the BFF must submit back when the user confirms. The AI value is persisted with humanAccepted = true only after that token has been received.
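
The token gate can be illustrated with a small guard. Everything here is a sketch under assumptions: the shapes are hypothetical, and plain string equality stands in for whatever verification the orchestrator actually performs on the token.

```typescript
// Hypothetical shape of a suggestion awaiting confirmation.
interface PendingSuggestion {
  value: string;
  humanAcceptanceToken: string; // issued by the orchestrator with the suggestion
}

interface ConfirmedValue {
  value: string;
  humanAccepted: true;
}

// Refuses to mark a value as human-accepted unless the BFF sent the matching token back.
function confirmSuggestion(pending: PendingSuggestion, submittedToken: string | undefined): ConfirmedValue {
  // Assumption: equality check only; real token verification is not described here.
  if (submittedToken === undefined || submittedToken !== pending.humanAcceptanceToken) {
    throw new Error("humanAcceptanceToken missing or mismatched; refusing to persist");
  }
  return { value: pending.value, humanAccepted: true };
}
```

Typing `humanAccepted` as the literal `true` means a confirmed value cannot be constructed through this path without passing the check.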

5. Cost controls

  • Per-tenant monthly budget (set in tenant-service settings) for ai_orchestrator.maintenance.*. Budget remaining returned in every orchestrator response.
  • Soft cap: when at 80% budget, auto-disable vendor-message-draft (highest cost-per-call) and notify GM.
  • Hard cap: at 100%, all capabilities fail-soft (return null suggestion + warning); WOs still flow.
  • Per-call timeout 3 s; on timeout we proceed without the suggestion (fail-soft).
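
The soft and hard caps above reduce to a small gate evaluated before each call. This is a sketch: `budgetGate` and the `usedFraction` input are hypothetical, and the GM notification and the 3 s timeout are left out.

```typescript
type BudgetGate = "run" | "disabled" | "fail-soft";

const SOFT_CAP = 0.8;
const HARD_CAP = 1.0;
const SOFT_CAP_DISABLED_CAPABILITY = "maintenance.vendor-message-draft.v1";

// usedFraction: share of the tenant's monthly budget already spent (assumed input, 0..1+).
function budgetGate(capability: string, usedFraction: number): BudgetGate {
  // Hard cap: every capability returns a null suggestion plus a warning; WOs still flow.
  if (usedFraction >= HARD_CAP) return "fail-soft";
  // Soft cap: only the highest cost-per-call capability is switched off.
  if (usedFraction >= SOFT_CAP && capability === SOFT_CAP_DISABLED_CAPABILITY) return "disabled";
  return "run";
}
```

Checking the hard cap first keeps the result unambiguous for vendor-message-draft once the budget is exhausted.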

6. Privacy & redaction

  • PII filter (orchestrator-side): guest names, room numbers (in some markets), phone numbers stripped from descriptions before model call.
  • Vendor PII (phone, name) is not sent to the model unless explicitly used in vendor-message-draft (where it's required).
  • All inputs/outputs logged to audit-service with correlationId and 90-day retention.
  • Models hosted in Vertex AI in europe-west1 (data residency for EU tenants); for non-EU tenants, us-central1 is permitted.

7. Audit trail

Each AI call produces an entry in audit-service:

{
  "service": "maintenance-service",
  "capability": "maintenance.severity-suggestion.v1",
  "model": "vertex/text-bison@002",
  "tenantId": "tnt_...",
  "subjectId": "mnt_...",
  "redactedInput": true,
  "score": 0.74,
  "humanAccepted": true,
  "decisionDurationMs": 540,
  "costMicroUsd": 230,
  "correlationId": "01HXY...",
  "producedAt": "2026-04-22T14:03:21Z"
}

8. Failure & degradation

| Failure | Behaviour |
| --- | --- |
| Orchestrator 5xx | Fail-soft: WO is created without AI assist; provenance carries { humanAccepted: true, score: 0, model: 'fallback-none' } |
| Orchestrator timeout | Fail-soft as above |
| Tenant out of budget | vendor-message-draft disabled; classifier and forecaster still run on cheap models if available; severity suggest disabled |
| Model returns invalid output (unknown enum) | Discard, log, do not persist |
| Redaction fails | Reject the call; do not retry without redaction |
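
Three of the rows above (5xx, timeout, invalid enum) can be sketched as one handler for the severity capability. The severity values, the `OrchestratorResult` union, and the `SeverityOutcome` shape are all hypothetical; the actual WorkOrderSeverity enum is not listed in this document.

```typescript
// Hypothetical enum values, for illustration only.
const KNOWN_SEVERITIES = ["low", "medium", "high", "critical"] as const;
type WorkOrderSeverity = typeof KNOWN_SEVERITIES[number];

// Hypothetical summary of what came back from the orchestrator.
type OrchestratorResult =
  | { kind: "ok"; rawSeverity: string }
  | { kind: "http_5xx" }
  | { kind: "timeout" };

interface FallbackProvenance {
  humanAccepted: true;
  score: 0;
  model: "fallback-none";
}

interface SeverityOutcome {
  suggestion: WorkOrderSeverity | null; // null: proceed without AI assist
  fallback: FallbackProvenance | null;  // set only on 5xx or timeout
  discardedInvalid: boolean;            // true: log it and persist nothing
}

// Returns the matching enum member, or null when the model produced an unknown value.
function parseSeverity(raw: string): WorkOrderSeverity | null {
  for (const candidate of KNOWN_SEVERITIES) {
    if (candidate === raw) return candidate;
  }
  return null;
}

function handleOrchestratorResult(result: OrchestratorResult): SeverityOutcome {
  if (result.kind === "ok") {
    const parsed = parseSeverity(result.rawSeverity);
    return { suggestion: parsed, fallback: null, discardedInvalid: parsed === null };
  }
  // 5xx and timeout share the same fail-soft path.
  return {
    suggestion: null,
    fallback: { humanAccepted: true, score: 0, model: "fallback-none" },
    discardedInvalid: false,
  };
}
```

In every branch the work order is still created; only the presence of the suggestion and its provenance changes.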

9. Reproducibility & rollback

  • Every persisted suggestion stores model and capability version. We can replay decisions by re-running the same input against the same model in the orchestrator's "replay" mode.
  • A capability version bump never alters past WOs; new WOs use the new version. This is captured in BigQuery for cohort analysis.

10. Out of scope (v1)

  • Any AI that decides without human acceptance for severity, OOO, or vendor outbound.
  • Image/photo analysis for damage classification (planned v2; gated on a privacy review).
  • Voice-to-text on technician phone notes (planned v2; pending vendor evaluation).