AI_INTEGRATION — bff-backoffice-service
Sibling: API_CONTRACTS · APPLICATION_LOGIC · DOMAIN_MODEL · SECURITY_MODEL
Cross-cutting: 08 AI Architecture · ADR-0003 §6 AI inference placement
1. Posture
This BFF makes no direct LLM calls and runs no on-device inference. It is an orchestration layer between the Electron desktop's AI surfaces (suggestion inbox, anomaly badges, "ask Ghasi" prompt) and two upstream AI runtimes:
- Cloud AI: ai-orchestrator-service for cloud-hosted models (planning, summarization, what-if forecasting).
- Edge AI: ONNX Runtime Node inside the desktop's Electron main process, for housekeeping reorder, anomaly heuristics, demand smoothing, image quality scoring (per ADR-0003 §6).
The BFF's role is to: list AI suggestions from the cloud orchestrator, record operator decisions with full audit, render AI provenance to the operator UI, and emit decision telemetry. Edge inference results bypass the BFF — they are written to the desktop's local outbox and replayed via the standard sync stream.
2. Capabilities surfaced (cloud)
| Capability | Trigger | Upstream call | Decision UX |
|---|---|---|---|
| Overbooking warning | Inventory + reservation drift | ai-orchestrator-service.fetchSuggestions | Operator can override; decision logged |
| Rate-change suggestion | Pricing engine + occupancy curve | same | Decision logged + reason captured |
| Housekeeping reorder | Arrival pattern + housekeeping queue | same | One-click apply; modified delta possible |
| Maintenance priority | Work order + room arrival overlap | same | Operator confirms or overrides |
| Guest special handling | Guest profile + reservation | same | Suggestion only; no autonomous action |
| Staffing | Forecast + roster | same | Advisory; logs decision |
| Audit anomaly | Operator activity + comparison | same | High-severity; full audit |
All suggestions are advisory; nothing autonomously mutates a domain aggregate. Acceptance triggers a normal mutation proxy via /reservations/*, /housekeeping/*, etc., still owned by the operator.
3. Capabilities surfaced (edge)
Edge inference is not routed through this BFF. The desktop main process invokes ONNX Runtime Node directly. The BFF sees the result only when the desktop subsequently:
- Fires a normal mutation proxy whose decision included edge AI input (e.g., an accepted housekeeping reorder), or
- Replays the local ai.inference.local.completed.v1 event on next sync.
The BFF therefore renders edge-AI provenance via the same provenance envelope (with modelClass: 'edge') but plays no orchestration role.
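As an illustration, the shared envelope could be typed as follows. This is a sketch under assumptions: the field names follow the provenance fields listed under Compliance, but the type name ProvenanceEnvelope, the 'cloud' value (assumed counterpart to 'edge'), the helper provenanceBadge, and all sample values are hypothetical.

```typescript
// Hypothetical shape; field names follow the provenance list in section 10.
type ModelClass = "cloud" | "edge"; // "cloud" is an assumed counterpart to "edge"

interface ProvenanceEnvelope {
  model: string;
  modelVersion: string;
  promptVersion: string;
  modelClass: ModelClass;
  signatureFingerprint: string;
}

// Builds the operator-facing badge label; one code path serves both runtimes.
function provenanceBadge(p: ProvenanceEnvelope): string {
  const origin = p.modelClass === "edge" ? "on-device" : "cloud";
  return `Suggested by Ghasi AI (${p.model} ${p.modelVersion}, ${origin})`;
}

// Placeholder values only; these are not real model names or fingerprints.
const edgeExample: ProvenanceEnvelope = {
  model: "housekeeping-reorder",
  modelVersion: "1.0.0",
  promptVersion: "n/a",
  modelClass: "edge",
  signatureFingerprint: "sha256:placeholder",
};
const edgeBadge = provenanceBadge(edgeExample);
```

Because rendering depends only on the envelope, the BFF needs no knowledge of how the edge result was produced.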
4. Calling pattern (cloud)
class FetchAISuggestionsUseCase {
  // this.cache and this.aiOrchestrator are injected collaborators (wiring omitted here).
  async execute(ctx: SessionContext, propertyId: PropertyId, filter: AiFilter): Promise<AiSuggestionListVm> {
    // The inbox is cached per (tenant, property).
    const cacheKey = `ai:inbox:${ctx.tenantId}:${propertyId}`;
    const cached = await this.cache.read(cacheKey);
    if (cached && !filter.bypassCache) return cached;

    const upstream = await this.aiOrchestrator.fetchSuggestions({
      tenantId: ctx.tenantId,
      propertyId,
      operatorRole: ctx.session.primaryRole(),
      categoryFilter: filter.category,
      limit: filter.limit ?? 20,
    });

    // Shape the upstream response for the UI, then cache it for 60 s.
    const vm = composeAiInboxVm(upstream);
    await this.cache.writeWithTtl(cacheKey, vm, 60);
    return vm;
  }
}
The BFF does not mint prompts, choose models, or post-process model output. Those concerns live in ai-orchestrator-service.
5. Telemetry annotations
Every event published by this BFF for an action that involved AI carries an aiInfluence envelope:
"aiInfluence": {
"suggestionsViewed": ["sg_..."],
"suggestionsAccepted": ["sg_..."],
"suggestionsRejected": [],
"edgeInferenceUsed": false
}
This lets downstream analytics compare AI-influenced decisions with operator-only decisions when measuring uplift.
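One way the envelope might be assembled from operator interactions is sketched below. Only the four envelope fields come from this document; SuggestionInteraction, buildAiInfluence, and the rule that an accepted or rejected suggestion also counts as viewed are assumptions made for illustration.

```typescript
// Envelope fields are taken from the JSON above; everything else is illustrative.
interface AiInfluence {
  suggestionsViewed: string[];
  suggestionsAccepted: string[];
  suggestionsRejected: string[];
  edgeInferenceUsed: boolean;
}

// Hypothetical record of what the operator did with one suggestion.
interface SuggestionInteraction {
  suggestionId: string;
  outcome: "viewed" | "accepted" | "rejected";
}

// Appends an id only if it is not already present, preserving first-seen order.
function addUnique(list: string[], id: string): void {
  if (list.indexOf(id) === -1) list.push(id);
}

function buildAiInfluence(
  interactions: SuggestionInteraction[],
  edgeInferenceUsed: boolean,
): AiInfluence {
  const envelope: AiInfluence = {
    suggestionsViewed: [],
    suggestionsAccepted: [],
    suggestionsRejected: [],
    edgeInferenceUsed,
  };
  for (const { suggestionId, outcome } of interactions) {
    // Assumption: a decided suggestion was necessarily viewed first.
    addUnique(envelope.suggestionsViewed, suggestionId);
    if (outcome === "accepted") addUnique(envelope.suggestionsAccepted, suggestionId);
    if (outcome === "rejected") addUnique(envelope.suggestionsRejected, suggestionId);
  }
  return envelope;
}
```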
6. Failure handling
- AI orchestrator down → AI surfaces hidden in dashboard + workbench; operator sees no banner; aiAvailable: false on bootstrap response.
- AI orchestrator slow → 800 ms deadline; partial composition; suggestions fall through with cached-stale tag.
- Edge inference failure → owned by desktop; never surfaces here.
- Decision recording failure → return 503 + retry; idempotency key absorbs duplicates.
- Notify-orchestrator-of-decision failure → recorded locally; orchestrator reconciles via inbox sync.
The default UX rule: silent degradation. AI is advisory; missing AI never blocks operator work.
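The first two rules above can be expressed as a pure mapping from the orchestrator outcome to what the UI receives. This is a sketch, not the service's code: OrchestratorOutcome, InboxState, and resolveInboxState are hypothetical names, and returning an empty list with aiAvailable still true when a timeout finds nothing cached is an assumption.

```typescript
// Hypothetical outcome of one orchestrator call, as seen by the BFF.
type OrchestratorOutcome<T> =
  | { kind: "ok"; suggestions: T[] }
  | { kind: "timeout" } // exceeded the 800 ms deadline
  | { kind: "down" };

interface InboxState<T> {
  aiAvailable: boolean;
  suggestions: T[];
  stale: boolean; // true when served with the cached-stale tag
}

function resolveInboxState<T>(
  outcome: OrchestratorOutcome<T>,
  cached: T[] | undefined,
): InboxState<T> {
  switch (outcome.kind) {
    case "ok":
      return { aiAvailable: true, suggestions: outcome.suggestions, stale: false };
    case "timeout":
      // Slow orchestrator: fall through to whatever is cached, tagged stale.
      return { aiAvailable: true, suggestions: cached ?? [], stale: cached !== undefined };
    case "down":
      // Orchestrator down: hide AI surfaces and show no error banner.
      return { aiAvailable: false, suggestions: [], stale: false };
  }
}
```

Keeping this decision free of I/O makes the silent-degradation rule straightforward to unit test.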
7. PII handling
The BFF never sends raw PII to the orchestrator. The fetch request body carries:
- tenantId, propertyId, operatorRole
- category, severity, limit
- lastSeenAt (for delta polling)
- never operator name / email / phone; never guest names; never folio details
The orchestrator's prompt assembly is its concern; it has its own PII rules per 08 AI Architecture §5.
Decision payloads include notes (free-text). We truncate at 500 chars before persisting and at 200 chars before exporting to BigQuery. We never send notes back to the orchestrator (orchestrator does not need them for next-suggestion ranking).
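The allow-list and the two truncation limits might be enforced along these lines. Treat it as a sketch: OperatorSession, InboxFilter, toFetchRequest, and the two notes helpers are hypothetical names, and the default limit of 20 is borrowed from the calling-pattern snippet in section 4.

```typescript
// Allow-listed request fields, as enumerated in the list above.
interface SuggestionFetchRequest {
  tenantId: string;
  propertyId: string;
  operatorRole: string;
  category?: string;
  severity?: string;
  limit: number;
  lastSeenAt?: string; // timestamp format is an assumption
}

// Hypothetical session shape that also carries PII the orchestrator must not see.
interface OperatorSession {
  tenantId: string;
  propertyId: string;
  operatorRole: string;
  operatorName: string;
  operatorEmail: string;
}

interface InboxFilter {
  category?: string;
  severity?: string;
  limit?: number;
  lastSeenAt?: string;
}

// Copies fields one by one, so anything not named here can never leak upstream.
function toFetchRequest(session: OperatorSession, filter: InboxFilter): SuggestionFetchRequest {
  return {
    tenantId: session.tenantId,
    propertyId: session.propertyId,
    operatorRole: session.operatorRole,
    category: filter.category,
    severity: filter.severity,
    limit: filter.limit ?? 20,
    lastSeenAt: filter.lastSeenAt,
  };
}

const NOTES_PERSIST_MAX = 500; // chars kept when persisting a decision
const NOTES_EXPORT_MAX = 200; // chars kept in the BigQuery export

// slice counts UTF-16 code units, so a cut can land inside a surrogate pair.
function notesForPersistence(notes: string): string {
  return notes.slice(0, NOTES_PERSIST_MAX);
}

function notesForExport(notes: string): string {
  return notes.slice(0, NOTES_EXPORT_MAX);
}
```

Explicit field copying is chosen over object spread here so that adding a field to the session type cannot silently widen what is sent upstream.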
8. Caching AI outputs
| Cache | TTL | Rationale |
|---|---|---|
| AI inbox list per (tenant, property) | 60 s | Suggestions don't change every second |
| Per-suggestion detail | 5 min | Detail is heavier; rarely re-fetched |
| Provenance digest per (model, modelVersion) | 24 h | Used for UI rendering; rarely changes |
Bypass with ?bypassCache=true on GET /ai/suggestions (operator-initiated refresh).
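The TTL table and the bypass switch can be modeled with a small in-memory cache. This is only an illustration: the service's actual cache backend is not described in this section, and TTL_SEC, TtlCache, and the injectable clock are hypothetical.

```typescript
// TTLs from the table above, in seconds.
const TTL_SEC = {
  inboxList: 60,
  suggestionDetail: 5 * 60,
  provenanceDigest: 24 * 60 * 60,
} as const;

interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// Minimal TTL cache; the clock is injected so expiry can be tested deterministically.
class TtlCache<V> {
  private entries: { [key: string]: CacheEntry<V> } = {};
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  write(key: string, value: V, ttlSec: number): void {
    this.entries[key] = { value, expiresAt: this.now() + ttlSec * 1000 };
  }

  // bypassCache mirrors ?bypassCache=true: the caller is forced to refetch.
  read(key: string, bypassCache = false): V | undefined {
    if (bypassCache) return undefined;
    const entry = this.entries[key];
    if (!entry || entry.expiresAt <= this.now()) return undefined;
    return entry.value;
  }
}
```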
9. Feature flags
| Flag | Default | Purpose |
|---|---|---|
| ai.surfaces.enabled[<tenantId>] | true | Per-tenant kill switch |
| ai.suggestion.categories.enabled | per-tenant list | Limit visible categories |
| ai.cache.ttlSec | 60 | TTL override |
| ai.transport.preferred | sse | Push vs polling for AI inbox updates |
| ai.decision.requireMfa[<category>] | false | Force step-up on certain decisions |
Flags loaded from bff-backoffice-flags Memorystore key with 30 s refresh.
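Resolving these flags for one tenant, with the defaults from the table, could look like the sketch below. The storage shape (RawFlags), the 'polling' transport value, and the 'all' sentinel used when a tenant has no category list are assumptions; only the flag names and defaults come from the table.

```typescript
// Raw flag document as it might be stored; this shape is an assumption.
interface RawFlags {
  "ai.surfaces.enabled"?: { [tenantId: string]: boolean };
  "ai.suggestion.categories.enabled"?: { [tenantId: string]: string[] };
  "ai.cache.ttlSec"?: number;
  "ai.transport.preferred"?: "sse" | "polling";
  "ai.decision.requireMfa"?: { [category: string]: boolean };
}

interface TenantAiFlags {
  surfacesEnabled: boolean;
  visibleCategories: string[] | "all";
  cacheTtlSec: number;
  transport: "sse" | "polling";
  requireMfa: (category: string) => boolean;
}

// Applies the table's defaults wherever a flag or a per-tenant entry is missing.
function resolveFlags(raw: RawFlags, tenantId: string): TenantAiFlags {
  const mfaByCategory = raw["ai.decision.requireMfa"] ?? {};
  return {
    surfacesEnabled: raw["ai.surfaces.enabled"]?.[tenantId] ?? true,
    visibleCategories: raw["ai.suggestion.categories.enabled"]?.[tenantId] ?? "all",
    cacheTtlSec: raw["ai.cache.ttlSec"] ?? 60,
    transport: raw["ai.transport.preferred"] ?? "sse",
    requireMfa: (category) => mfaByCategory[category] ?? false,
  };
}
```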
10. Compliance
- Provenance recorded on every suggestion: model, modelVersion, promptVersion, modelClass, signatureFingerprint. Stored 7 years in ai_decision_log.
- Operator can always override; no autonomous mutations.
- Sharia-compliance flag passes through to orchestrator via complianceProfile; orchestrator filters suggestions accordingly.
- Audit lake export 7 y; satisfies regulatory queries.
- AI usage disclosure in operator-facing UI (small "Suggested by Ghasi AI" badge with model + provenance link).
11. Performance targets
| Metric | Target |
|---|---|
| GET /ai/suggestions p95 (cached) | < 50 ms |
| GET /ai/suggestions p95 (composed) | < 500 ms |
| POST /ai/suggestions/{id}/decide p95 | < 200 ms (decision recorded; orchestrator notify async) |
| SSE ai.new push latency p95 | < 200 ms from orchestrator emit |
12. Sharia / regional compliance
For tenants flagged complianceProfile = sharia:
- Pricing change suggestions filtered to remove interest-bearing payment plan recommendations.
- Loyalty point suggestions filtered to remove gambling-style mechanics.
- Image-quality suggestions skip review of any imagery flagged as religious-sensitive.
- Filter applied at orchestrator; the BFF only enforces by passing complianceProfile and refusing to render unfiltered suggestions if the orchestrator response lacks the complianceFiltered: true attestation.
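The attestation check in the last bullet might reduce to a guard like this one. It is a sketch under assumptions: OrchestratorResponse and renderableSuggestions are hypothetical names, and returning an empty list (rather than raising an error) is a choice made here to match the silent-degradation rule in section 6.

```typescript
// Hypothetical response shape; complianceFiltered is the orchestrator's attestation.
interface OrchestratorResponse<T> {
  suggestions: T[];
  complianceFiltered?: boolean;
}

// Returns only the suggestions that may be rendered for this tenant.
function renderableSuggestions<T>(
  complianceProfile: string | undefined,
  response: OrchestratorResponse<T>,
): T[] {
  const needsAttestation = complianceProfile === "sharia";
  if (needsAttestation && response.complianceFiltered !== true) {
    // Refuse to render suggestions that were not attested as filtered.
    return [];
  }
  return response.suggestions;
}
```

The guard does no filtering of its own, which keeps the filtering logic owned by the orchestrator as described above.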
13. Cross-links
- services/ai-orchestrator-service/ — upstream AI runtime
- docs/08-ai-architecture.md — platform AI architecture
- ADR-0003 §6 — edge inference placement
- SECURITY_MODEL — MFA gates on AI-driven decisions
- API_CONTRACTS §13–14 — AI suggestion endpoints