AI_INTEGRATION — bff-backoffice-service

Sibling: API_CONTRACTS · APPLICATION_LOGIC · DOMAIN_MODEL · SECURITY_MODEL

Cross-cutting: 08 AI Architecture · ADR-0003 §6 AI inference placement

1. Posture

This BFF makes no direct LLM calls and runs no on-device inference. It is an orchestration layer between the Electron desktop's AI surfaces (suggestion inbox, anomaly badges, "ask Ghasi" prompt) and two upstream AI runtimes:

  1. Cloud AI: ai-orchestrator-service, for cloud-hosted models (planning, summarization, what-if forecasting).
  2. Edge AI: ONNX Runtime Node inside the desktop's Electron main process, for housekeeping reorder, anomaly heuristics, demand smoothing, and image quality scoring (per ADR-0003 §6).

The BFF's role is to: list AI suggestions from the cloud orchestrator, record operator decisions with full audit, render AI provenance to the operator UI, and emit decision telemetry. Edge inference results bypass the BFF — they are written to the desktop's local outbox and replayed via the standard sync stream.

2. Capabilities surfaced (cloud)

| Capability | Trigger | Upstream call | Decision UX |
|---|---|---|---|
| Overbooking warning | Inventory + reservation drift | ai-orchestrator-service.fetchSuggestions | Operator can override; decision logged |
| Rate-change suggestion | Pricing engine + occupancy curve | same | Decision logged + reason captured |
| Housekeeping reorder | Arrival pattern + housekeeping queue | same | One-click apply; modified delta possible |
| Maintenance priority | Work order + room arrival overlap | same | Operator confirms or overrides |
| Guest special handling | Guest profile + reservation | same | Suggestion only; no autonomous action |
| Staffing | Forecast + roster | same | Advisory; logs decision |
| Audit anomaly | Operator activity + comparison | same | High-severity; full audit |

All suggestions are advisory; nothing autonomously mutates a domain aggregate. Acceptance triggers a normal mutation proxy via /reservations/*, /housekeeping/*, etc., still owned by the operator.

3. Capabilities surfaced (edge)

Edge inference is not routed through this BFF. The desktop main process invokes ONNX Runtime Node directly. The BFF sees the result only when the desktop subsequently:

  • Fires a normal mutation proxy whose decision included edge AI input (e.g., an accepted housekeeping reorder), or
  • Replays the local ai.inference.local.completed.v1 event on next sync.

The BFF therefore renders edge-AI provenance via the same provenance envelope (with modelClass: 'edge') but plays no orchestration role.
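As a rough sketch of how one envelope shape could serve both runtimes: the field names below follow the provenance list in section 10, while the `ProvenanceEnvelope` type, the `'cloud'` literal, the optional `promptVersion`, and the `provenanceLabel` helper are assumptions made for illustration, not the service's actual types.

```typescript
// Hypothetical shape; field names follow the provenance list in section 10.
type ModelClass = 'cloud' | 'edge'; // 'cloud' is an assumed counterpart to 'edge'

interface ProvenanceEnvelope {
  model: string;
  modelVersion: string;
  promptVersion?: string; // assumed optional: an edge heuristic may have no prompt
  modelClass: ModelClass;
  signatureFingerprint: string;
}

// Illustrative helper: builds the badge text shown next to a suggestion.
function provenanceLabel(p: ProvenanceEnvelope): string {
  const origin = p.modelClass === 'edge' ? 'on-device' : 'cloud';
  return `Suggested by Ghasi AI (${p.model} ${p.modelVersion}, ${origin})`;
}
```

Because the renderer branches only on `modelClass`, edge results replayed through sync need no separate UI path.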

4. Calling pattern (cloud)

class FetchAISuggestionsUseCase {
  async execute(ctx: SessionContext, propertyId: PropertyId, filter: AiFilter): Promise<AiSuggestionListVm> {
    // The inbox is cached per (tenant, property); see section 8.
    const cacheKey = `ai:inbox:${ctx.tenantId}:${propertyId}`;
    const cached = await this.cache.read(cacheKey);
    if (cached && !filter.bypassCache) return cached;

    // Only identifiers and filter fields go upstream; no operator or guest PII (section 7).
    const upstream = await this.aiOrchestrator.fetchSuggestions({
      tenantId: ctx.tenantId,
      propertyId,
      operatorRole: ctx.session.primaryRole(),
      categoryFilter: filter.category,
      limit: filter.limit ?? 20,
    });

    // Map the upstream payload to the view model and cache it for 60 s.
    const vm = composeAiInboxVm(upstream);
    await this.cache.writeWithTtl(cacheKey, vm, 60);
    return vm;
  }
}

The BFF does not mint prompts, choose models, or post-process model output. Those concerns live in ai-orchestrator-service.

5. Telemetry annotations

Every event published by this BFF for an action that involved AI carries an aiInfluence envelope:

"aiInfluence": {
  "suggestionsViewed": ["sg_..."],
  "suggestionsAccepted": ["sg_..."],
  "suggestionsRejected": [],
  "edgeInferenceUsed": false
}

This lets downstream analytics compare AI-influenced decisions with operator-only decisions when measuring uplift.
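One way the envelope could be assembled from the operator's interactions is sketched below. The `AiInfluence` field names come from the envelope above; the `SuggestionOutcome` type, the `buildAiInfluence` function, and the rule that an accepted or rejected suggestion also counts as viewed are assumptions.

```typescript
interface AiInfluence {
  suggestionsViewed: string[];
  suggestionsAccepted: string[];
  suggestionsRejected: string[];
  edgeInferenceUsed: boolean;
}

// Hypothetical input: what the operator did with each suggestion during this action.
type SuggestionOutcome = 'viewed' | 'accepted' | 'rejected';

function buildAiInfluence(
  outcomes: Array<{ id: string; outcome: SuggestionOutcome }>,
  edgeInferenceUsed: boolean,
): AiInfluence {
  const envelope: AiInfluence = {
    suggestionsViewed: [],
    suggestionsAccepted: [],
    suggestionsRejected: [],
    edgeInferenceUsed,
  };
  for (const { id, outcome } of outcomes) {
    // Assumption: a suggestion the operator decided on was necessarily viewed.
    envelope.suggestionsViewed.push(id);
    if (outcome === 'accepted') envelope.suggestionsAccepted.push(id);
    if (outcome === 'rejected') envelope.suggestionsRejected.push(id);
  }
  return envelope;
}
```

An action with no AI involvement would yield three empty arrays and `edgeInferenceUsed: false`, which gives analytics an explicit operator-only baseline.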

6. Failure handling

  • AI orchestrator down → AI surfaces hidden in dashboard + workbench; operator sees no banner; aiAvailable: false on bootstrap response.
  • AI orchestrator slow → 800 ms deadline; partial composition; suggestions fall through with cached-stale tag.
  • Edge inference failure → owned by desktop; never surfaces here.
  • Decision recording failure → return 503 + retry; idempotency key absorbs duplicates.
  • Notify-orchestrator-of-decision failure → recorded locally; orchestrator reconciles via inbox sync.

The default UX rule: silent degradation. AI is advisory; missing AI never blocks operator work.
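The first two rules above can be sketched as a deadline wrapper with a stale fallback. This is a minimal illustration, not the BFF's actual code: `settleWithin`, `fetchInbox`, the `InboxResult` shape, and the `stale` field (standing in for the cached-stale tag) are hypothetical names; only the 800 ms default and the `aiAvailable` flag come from the text.

```typescript
type Outcome<T> = { kind: 'ok'; value: T } | { kind: 'timeout' } | { kind: 'error' };

interface InboxResult<T> {
  items: T[];
  aiAvailable: boolean;
  stale: boolean; // stands in for the "cached-stale" tag
}

// Resolves with the work's result, or with 'timeout' once `ms` milliseconds have elapsed.
function settleWithin<T>(work: Promise<T>, ms: number): Promise<Outcome<T>> {
  return new Promise<Outcome<T>>((resolve) => {
    const timer = setTimeout(() => resolve({ kind: 'timeout' }), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve({ kind: 'ok', value }); },
      () => { clearTimeout(timer); resolve({ kind: 'error' }); },
    );
  });
}

async function fetchInbox<T>(
  fetchUpstream: () => Promise<T[]>,
  lastKnown: T[] | undefined,
  deadlineMs: number = 800,
): Promise<InboxResult<T>> {
  const outcome = await settleWithin(fetchUpstream(), deadlineMs);
  if (outcome.kind === 'ok') {
    return { items: outcome.value, aiAvailable: true, stale: false };
  }
  if (outcome.kind === 'timeout' && lastKnown !== undefined) {
    // Slow orchestrator: serve the cached copy, tagged as stale.
    return { items: lastKnown, aiAvailable: true, stale: true };
  }
  // Orchestrator down (or slow with nothing cached): hide AI surfaces silently.
  return { items: [], aiAvailable: false, stale: false };
}
```

The function never rejects, which matches the silent-degradation rule: callers compose the rest of the response regardless of the AI outcome.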

7. PII handling

The BFF never sends raw PII to the orchestrator. The fetch request body carries:

  • tenantId, propertyId, operatorRole
  • category, severity, limit
  • lastSeenAt (for delta polling)
  • never operator name / email / phone; never guest names; never folio details

The orchestrator's prompt assembly is its concern; it has its own PII rules per 08 AI Architecture §5.
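A simple way to enforce the field list above is to copy allowlisted fields explicitly instead of spreading a context object. The sketch below assumes a hypothetical `FetchSuggestionsRequest` type and `toFetchRequest` helper; the field names are the ones listed in this section, and their types are guesses.

```typescript
interface FetchSuggestionsRequest {
  tenantId: string;
  propertyId: string;
  operatorRole: string;
  category?: string;
  severity?: string;
  limit: number;
  lastSeenAt?: string; // assumed to be an ISO-8601 timestamp
}

// Copies only the allowlisted fields, so any extra properties on `source`
// (operator email, guest names, folio details) cannot reach the orchestrator.
function toFetchRequest(source: FetchSuggestionsRequest): FetchSuggestionsRequest {
  const { tenantId, propertyId, operatorRole, category, severity, limit, lastSeenAt } = source;
  return { tenantId, propertyId, operatorRole, category, severity, limit, lastSeenAt };
}
```

Explicit copying means a new field added to the session context stays out of the request until someone deliberately adds it to the allowlist.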

Decision payloads include notes (free-text). We truncate at 500 chars before persisting and at 200 chars before exporting to BigQuery. We never send notes back to the orchestrator (orchestrator does not need them for next-suggestion ranking).
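The two truncation limits can be expressed as a small helper. This is a sketch: `prepareNotes` and the constant names are hypothetical, and it assumes "chars" means UTF-16 code units, which is what `String.prototype.slice` counts.

```typescript
const PERSIST_MAX_CHARS = 500;
const EXPORT_MAX_CHARS = 200;

// Returns the note as it would be persisted and as it would be exported.
// Assumption: lengths are measured in UTF-16 code units.
function prepareNotes(notes: string): { persisted: string; exported: string } {
  const persisted = notes.slice(0, PERSIST_MAX_CHARS);
  // The export copy is derived from the persisted copy, so it can never contain more than what was stored.
  const exported = persisted.slice(0, EXPORT_MAX_CHARS);
  return { persisted, exported };
}
```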

8. Caching AI outputs

| Cache | TTL | Rationale |
|---|---|---|
| AI inbox list per (tenant, property) | 60 s | Suggestions don't change every second |
| Per-suggestion detail | 5 min | Detail is heavier; rarely re-fetched |
| Provenance digest per (model, modelVersion) | 24 h | Used for UI rendering; rarely changes |

Bypass with ?bypassCache=true on GET /ai/suggestions (operator-initiated refresh).
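The read-through behaviour with a bypass can be illustrated with an in-memory stand-in. The method names `read` and `writeWithTtl` mirror the use case in section 4; the `TtlCache` class, the `readThrough` helper, the injectable clock, and the synchronous API are assumptions (the real cache is asynchronous). This sketch skips the cache read on bypass but still writes the fresh value back.

```typescript
class TtlCache<V> {
  private entries: { [key: string]: { value: V; expiresAt: number } } = {};

  // The clock is injectable so expiry can be tested without waiting.
  constructor(private readonly now: () => number = () => Date.now()) {}

  read(key: string): V | undefined {
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      delete this.entries[key]; // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  writeWithTtl(key: string, value: V, ttlSec: number): void {
    this.entries[key] = { value, expiresAt: this.now() + ttlSec * 1000 };
  }
}

// Returns the cached value unless it is missing, expired, or bypassed; refreshes the cache on a load.
function readThrough<V>(
  cache: TtlCache<V>,
  key: string,
  ttlSec: number,
  bypassCache: boolean,
  load: () => V,
): V {
  if (!bypassCache) {
    const hit = cache.read(key);
    if (hit !== undefined) return hit;
  }
  const fresh = load();
  cache.writeWithTtl(key, fresh, ttlSec);
  return fresh;
}
```

Writing back on bypass means an operator-initiated refresh also benefits the next non-bypass reader of the same (tenant, property) key.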

9. Feature flags

| Flag | Default | Purpose |
|---|---|---|
| ai.surfaces.enabled[<tenantId>] | true | Per-tenant kill switch |
| ai.suggestion.categories.enabled | per-tenant list | Limit visible categories |
| ai.cache.ttlSec | 60 | TTL override |
| ai.transport.preferred | sse | Push vs polling for AI inbox updates |
| ai.decision.requireMfa[<category>] | false | Force step-up on certain decisions |

Flags are loaded from the bff-backoffice-flags Memorystore key and refreshed every 30 s.
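How the flag defaults might be applied when deciding what to show is sketched below. The `AiFlags` shape, both helper functions, and the category identifiers used in examples are hypothetical; the sketch also assumes that a tenant with no category list sees every category.

```typescript
interface AiFlags {
  surfacesEnabled: { [tenantId: string]: boolean };    // ai.surfaces.enabled[<tenantId>]
  categoriesEnabled: { [tenantId: string]: string[] }; // ai.suggestion.categories.enabled
  requireMfa: { [category: string]: boolean };         // ai.decision.requireMfa[<category>]
}

// A category is visible unless the tenant is switched off or the category is not on the tenant's list.
function isCategoryVisible(flags: AiFlags, tenantId: string, category: string): boolean {
  if (flags.surfacesEnabled[tenantId] === false) return false; // default: true
  const allowed = flags.categoriesEnabled[tenantId];
  // Assumption: no list configured means no restriction.
  return allowed === undefined || allowed.indexOf(category) !== -1;
}

// Step-up authentication is required only when the flag is explicitly true.
function requiresStepUp(flags: AiFlags, category: string): boolean {
  return flags.requireMfa[category] === true; // default: false
}
```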

10. Compliance

  • Provenance recorded on every suggestion: model, modelVersion, promptVersion, modelClass, signatureFingerprint. Stored 7 years in ai_decision_log.
  • Operator can always override; no autonomous mutations.
  • Sharia-compliance flag passes through to orchestrator via complianceProfile; orchestrator filters suggestions accordingly.
  • Audit lake export retained for 7 years; satisfies regulatory queries.
  • AI usage disclosure in operator-facing UI (small "Suggested by Ghasi AI" badge with model + provenance link).

11. Performance targets

| Metric | Target |
|---|---|
| GET /ai/suggestions p95 (cached) | < 50 ms |
| GET /ai/suggestions p95 (composed) | < 500 ms |
| POST /ai/suggestions/{id}/decide p95 | < 200 ms (decision recorded; orchestrator notify async) |
| SSE ai.new push latency p95 | < 200 ms from orchestrator emit |

12. Sharia / regional compliance

For tenants flagged complianceProfile = sharia:

  • Pricing change suggestions filtered to remove interest-bearing payment plan recommendations.
  • Loyalty point suggestions filtered to remove gambling-style mechanics.
  • Image-quality suggestions skip review of any imagery flagged as religious-sensitive.
  • Filter applied at orchestrator; the BFF only enforces by passing complianceProfile and refusing to render unfiltered suggestions if the orchestrator response lacks the complianceFiltered: true attestation.
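The attestation check in the last bullet could look roughly like this. The `complianceFiltered` attribute and the `sharia` profile value come from the text; the `OrchestratorResponse` shape, the `renderableSuggestions` helper, and the choice to return an empty list (rather than raise an error) are assumptions in keeping with the silent-degradation rule in section 6.

```typescript
interface OrchestratorResponse<T> {
  suggestions: T[];
  complianceFiltered?: boolean; // attestation set by the orchestrator after filtering
}

// Returns the suggestions that may be rendered for a tenant with the given compliance profile.
function renderableSuggestions<T>(
  complianceProfile: string | undefined,
  response: OrchestratorResponse<T>,
): T[] {
  if (complianceProfile === 'sharia' && response.complianceFiltered !== true) {
    // No attestation: refuse to render rather than risk showing unfiltered suggestions.
    return [];
  }
  return response.suggestions;
}
```

The check requires the attestation to be exactly `true`, so a missing or malformed attribute fails closed.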