
AI_INTEGRATION — bff-consumer-service

Sibling: APPLICATION_LOGIC · API_CONTRACTS · SECURITY_MODEL

Cross-cutting: 08 AI Architecture

Posture. This BFF performs no direct AI calls in Phase 1. AI personalisation for the consumer surface (semantic ranking, LLM-driven query rewrite, cluster-bias detection) is performed inside search-aggregation-service and ai-orchestrator-service, and the results are projected into the cross-tenant index this BFF reads. We pass through the AIProvenance we receive; we never invent AI artifacts.

1. Where AI lives in this surface

| Capability | Owner | This BFF's role |
| --- | --- | --- |
| Semantic ranking of search results (pgvector + LLM-judged relevance) | search-aggregation-service | We send (query, locale, geo) and receive a ranked list with aiProvenance per row when ranking is AI-influenced |
| Query rewrite ("hotel near big mosque" → geo + amenity expansion) | ai-orchestrator-service (called by search-aggregation-service) | We forward the rewritten interpretation back to the client as a queryRewriteHint for transparency |
| Auto-translation of property descriptions | ai-orchestrator-service (translation invoked by theme-config-service / property-service) | We receive translated strings already; we surface the translationProvenance field but never re-translate |
| Bot detection (LLM-augmented in Phase 2) | This BFF (Phase 1 = rules + reCAPTCHA; Phase 2 = ai-orchestrator routing) | Phase 1: rules-only |
| Personalised wishlist suggestions | Deferred to Phase 2 | n/a Phase 1 |

2. Provenance pass-through

Every AI-influenced response field carries an aiProvenance block per 08. The BFF's responsibility is:

  1. Preserve aiProvenance on every nested object that came from an AI-touching upstream.
  2. Strip aiProvenance only when explicitly required by privacy rules (none currently apply for the consumer surface).
  3. Never synthesise an aiProvenance block locally.

interface AIProvenance {
  modelId: string;               // e.g. 'vertex/gemini-1.5-flash'
  promptTemplateId: string;      // e.g. 'pmt_search_rerank_v3'
  invokedAt: ISODateTime;
  decisionId?: DecisionId;       // dec_<ulid> when HITL gated
  routedThrough: 'cloud' | 'edge';
  costMicroUsd?: number;
  safetyVerdict?: 'allow' | 'review' | 'block';
}

A ListingCardVM may carry aiProvenance on the name.localized.<locale> (translation), the rank position (semantic re-rank), or the description field (auto-summary). The BFF copies these fields verbatim from upstream.
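
The pass-through rule can be sketched as a mapper that copies each field together with whatever provenance the upstream attached, and attaches nothing of its own. This is a minimal illustration under stated assumptions: `UpstreamListing`, `Provenanced`, `copyField`, and `toListingCardVM` are hypothetical names, the `ListingCardVM` shape is reduced to two fields, and `ISODateTime` / `DecisionId` are stubbed as strings.

```typescript
// Stand-ins for platform types defined elsewhere (assumptions).
type ISODateTime = string;
type DecisionId = string;

interface AIProvenance {
  modelId: string;
  promptTemplateId: string;
  invokedAt: ISODateTime;
  decisionId?: DecisionId;
  routedThrough: 'cloud' | 'edge';
  costMicroUsd?: number;
  safetyVerdict?: 'allow' | 'review' | 'block';
}

// Hypothetical wrapper: a value plus the provenance the upstream attached, if any.
interface Provenanced<T> {
  value: T;
  aiProvenance?: AIProvenance;
}

// Hypothetical, simplified upstream row and view model.
interface UpstreamListing {
  id: string;
  description: Provenanced<string>;
  rank: Provenanced<number>;
}

interface ListingCardVM {
  id: string;
  description: Provenanced<string>;
  rank: Provenanced<number>;
}

// Copy the value and, only if the upstream supplied one, its provenance.
// No provenance block is ever created here.
function copyField<T>(field: Provenanced<T>): Provenanced<T> {
  const out: Provenanced<T> = { value: field.value };
  if (field.aiProvenance !== undefined) {
    out.aiProvenance = { ...field.aiProvenance };
  }
  return out;
}

function toListingCardVM(row: UpstreamListing): ListingCardVM {
  return {
    id: row.id,
    description: copyField(row.description),
    rank: copyField(row.rank),
  };
}
```

A field that arrives without provenance leaves without it, so the absence of `aiProvenance` on a view-model field reliably means the upstream did not mark it as AI-influenced.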

3. Phase 2 plans (informational)

3.1 LLM-augmented bot detection

When Phase 2 lands, the BotDetector will optionally route ambiguous cases (score ∈ [0.5, 0.85)) to ai-orchestrator-service via the BotJudgePort for an LLM verdict. Routing rules:

  • Cloud Vertex AI for the primary path; ONNX Runtime edge fallback when the Cloud Run instance has EDGE_AI_AVAILABLE=true.
  • Strict budget cap: ≤ 1 LLM call per session per minute.
  • HITL: not required (the LLM is advisory; the rules engine retains final say).
  • Provenance recorded in bot_score_log.signals as kind='llm-judge' with aiProvenance block.

interface BotJudgePort {
  judge(input: BotJudgeInput, ctx: RequestContext): Promise<{
    verdict: 'human' | 'suspect' | 'bot';
    confidence: number;
    reasoning: string;
    aiProvenance: AIProvenance;
  }>;
}
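
The routing gate described above (ambiguous score band plus the per-session budget cap) might look roughly like the following. This is a sketch, not the planned implementation: `isAmbiguous`, `SessionLlmBudget`, and `shouldAskLlm` are hypothetical names, the in-memory map is an assumption (a multi-instance deployment would need shared state), and "≤ 1 call per session per minute" is interpreted here as a minimum 60-second gap between calls.

```typescript
const AMBIGUOUS_MIN = 0.5;      // inclusive lower bound of the ambiguous band
const AMBIGUOUS_MAX = 0.85;     // exclusive upper bound
const MIN_INTERVAL_MS = 60000;  // assumed reading of "1 call per session per minute"

// True when the rules-engine score falls in [0.5, 0.85).
function isAmbiguous(score: number): boolean {
  return score >= AMBIGUOUS_MIN && score < AMBIGUOUS_MAX;
}

// Hypothetical in-memory limiter keyed by session id.
class SessionLlmBudget {
  private readonly lastCallAt = new Map<string, number>();

  // Returns true and records the call if the session has not used
  // its allowance within the last minute; otherwise returns false.
  tryAcquire(sessionId: string, nowMs: number): boolean {
    const last = this.lastCallAt.get(sessionId);
    if (last !== undefined && nowMs - last < MIN_INTERVAL_MS) {
      return false;
    }
    this.lastCallAt.set(sessionId, nowMs);
    return true;
  }
}

// The LLM is consulted only for ambiguous scores and only within budget.
// Unambiguous scores never consume budget because of short-circuit evaluation.
function shouldAskLlm(
  score: number,
  sessionId: string,
  budget: SessionLlmBudget,
  nowMs: number,
): boolean {
  return isAmbiguous(score) && budget.tryAcquire(sessionId, nowMs);
}
```

When `shouldAskLlm` returns false, the rules-engine result stands on its own, which matches the advisory role the LLM verdict has in this design.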

3.2 Personalised wishlist suggestions

When Phase 2 introduces consumer accounts, an ai-orchestrator-service job will produce WishlistSuggestion[] per user. The BFF will project these into a new /wishlist/suggestions endpoint, again purely as pass-through.

A future Phase 3 capability — conversational refinement on top of /search — will require streaming via SSE and is explicitly not in Phase 1 scope. When it lands, the BFF will own the SSE channel but delegate the LLM turn to ai-orchestrator-service.

4. Moderation + safety

When AI-touched fields (translations, summaries) flow through this BFF, the upstream service has already applied the platform moderation policy via ai-orchestrator-service. The BFF performs:

  • Schema validation: reject any AI-touched field whose aiProvenance.safetyVerdict === 'block' and fall back to the un-translated default value.
  • Length sanity: cap pass-through translated name, description, policies at the same byte limits as the canonical strings (e.g., name ≤ 256 bytes).
  • Locale match: ensure the translated localized.<locale> key matches a locale the user requested or their fallback chain.
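
The three checks above could be combined into a single resolver along these lines. It is a sketch under assumptions: `TranslatedField`, `capBytes`, and `resolveName` are hypothetical names, "cap" is interpreted as truncation on a code-point boundary (the source does not say whether over-long strings are truncated or rejected), and the 'review' fallback follows the rule stated in section 5.

```typescript
type SafetyVerdict = 'allow' | 'review' | 'block';

// Hypothetical shape of a translated string as received from upstream.
interface TranslatedField {
  locale: string;
  value: string;
  safetyVerdict?: SafetyVerdict;
}

// Number of bytes a single Unicode code point occupies in UTF-8.
function utf8Size(codePoint: number): number {
  if (codePoint <= 0x7f) return 1;
  if (codePoint <= 0x7ff) return 2;
  if (codePoint <= 0xffff) return 3;
  return 4;
}

// Truncate so the UTF-8 encoding never exceeds maxBytes,
// cutting only between code points (never mid-character).
function capBytes(text: string, maxBytes: number): string {
  let out = '';
  let used = 0;
  for (const ch of Array.from(text)) {
    const size = utf8Size(ch.codePointAt(0) ?? 0);
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}

// Pick the string to show: the translation if it passes every check,
// otherwise the canonical (un-translated) value.
function resolveName(
  translated: TranslatedField | undefined,
  canonical: string,
  acceptedLocales: string[], // requested locale followed by its fallback chain
  maxBytes: number = 256,
): string {
  if (translated === undefined) return canonical;
  const verdict = translated.safetyVerdict;
  if (verdict === 'block' || verdict === 'review') return canonical;
  if (acceptedLocales.indexOf(translated.locale) === -1) return canonical;
  return capBytes(translated.value, maxBytes);
}
```

Counting bytes rather than characters matters here because the limit is stated in bytes: a name of 200 accented or non-Latin characters can exceed 256 bytes in UTF-8.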

5. HITL surfaces

This BFF surfaces no HITL approval to consumers. Every AI artifact reaching the consumer surface has been auto-approved or human-approved upstream. An aiProvenance.safetyVerdict of 'review' indicates that upstream HITL review is still in progress; the BFF treats such values as not yet visible and falls back to the non-AI version.

6. Cost + budget

The BFF's own AI cost is zero in Phase 1. Phase 2 budgets:

| Capability | Budget cap (per day per environment) | Enforcer |
| --- | --- | --- |
| LLM-augmented bot detection | $25 USD | ai-orchestrator-service token bucket |
| Wishlist suggestions | $5 USD | Same |

Exceeded budgets fall back gracefully (rules-only bot detection; no suggestions surfaced). An alert pages the on-call engineer when more than 80% of the daily budget has been consumed by 10 a.m. local time.
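
As a worked illustration of the alert rule and the fallback, the two conditions can be written as small predicates. Enforcement itself belongs to the ai-orchestrator-service token bucket, so this is only a sketch of the arithmetic; `BudgetSnapshot`, `shouldPageOnCall`, and `botDetectionMode` are hypothetical names.

```typescript
// Hypothetical snapshot of one capability's daily budget.
interface BudgetSnapshot {
  capUsd: number;      // e.g. 25 for LLM-augmented bot detection
  consumedUsd: number; // spend so far today
  localHour: number;   // 0-23, local time of the environment
}

// Page when more than 80% of the cap is already spent before 10 a.m.
// For the $25 cap this means spend above $20.
function shouldPageOnCall(s: BudgetSnapshot): boolean {
  return s.localHour < 10 && s.consumedUsd / s.capUsd > 0.8;
}

// Once the cap is reached, bot detection degrades to rules only.
function botDetectionMode(s: BudgetSnapshot): 'rules+llm' | 'rules-only' {
  return s.consumedUsd >= s.capUsd ? 'rules-only' : 'rules+llm';
}
```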

7. Observability

  • Every AI pass-through field carries the upstream decisionId so analysts can trace from a search-result re-rank back to the AI call in ai-orchestrator-service.
  • We emit an OpenTelemetry span attribute ai.upstream.invoked=true on responses where any nested object had aiProvenance. This makes it easy to filter "AI-touched responses" in Cloud Trace.
  • We do not double-count cost: only the originating service is the cost-attributed service for an AI call.
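
Setting the span attribute requires knowing whether any nested object in the response carries aiProvenance. One way to sketch that is a recursive walk over the response body. `hasAiProvenance` and `tagSpan` are hypothetical helper names, and `SpanLike` is a minimal structural stand-in for a tracing span (the OpenTelemetry JavaScript `Span` exposes a compatible `setAttribute(key, value)` method); the walk assumes a plain JSON-like body with no cycles.

```typescript
// Minimal structural stand-in for a tracing span (assumption).
interface SpanLike {
  setAttribute(key: string, value: boolean): unknown;
}

// True if the value, or anything nested inside it, has an aiProvenance property.
function hasAiProvenance(node: unknown): boolean {
  if (Array.isArray(node)) {
    return node.some((item) => hasAiProvenance(item));
  }
  if (node !== null && typeof node === 'object') {
    const record = node as Record<string, unknown>;
    const own = record['aiProvenance'];
    if (own !== undefined && own !== null) return true;
    return Object.keys(record).some((key) => hasAiProvenance(record[key]));
  }
  return false;
}

// Mark the span only when the response is AI-touched.
function tagSpan(span: SpanLike, responseBody: unknown): void {
  if (hasAiProvenance(responseBody)) {
    span.setAttribute('ai.upstream.invoked', true);
  }
}
```

Leaving the attribute unset on untouched responses, rather than writing `false`, keeps the filter in the trace backend a simple presence check.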

8. Audit + compliance

Pass-through AI responses inherit the audit position of the upstream service. No new audit entries are written by this BFF for AI artifacts. The audit-service join key is decisionId; the BFF's contribution is only that it surfaced the artifact to a guestSessionId. That linkage lives in the MetaPageView row, where the page's AI-touched fields are summarised in an aiTouchedFields[] JSON column (Phase 2; not present in Phase 1).