SERVICE_OVERVIEW — bff-consumer-service
Bundle index: SERVICE_OVERVIEW · DOMAIN_MODEL · APPLICATION_LOGIC · API_CONTRACTS · EVENT_SCHEMAS · DATA_MODEL · SYNC_CONTRACT · AI_INTEGRATION · SECURITY_MODEL · OBSERVABILITY · TESTING_STRATEGY · DEPLOYMENT_TOPOLOGY · FAILURE_MODES · LOCAL_DEV_SETUP · SERVICE_READINESS · SERVICE_RISK_REGISTER · MIGRATION_PLAN
Strategic anchors: 02 Enterprise Architecture §5 · 04 Event-Driven Architecture · 05 API Design §9.1 · 06 Data Models · 07 Security/Compliance/Tenancy · Standards · NAMING · Standards · ERROR_CODES
1. Purpose
bff-consumer-service is the Backend-for-Frontend that powers the consumer meta layer of Ghasi Melmastoon — the cross-tenant Trivago-like discovery surface delivered by the Next.js web app (@ghasi/app-web-meta) and the React Native consumer mobile app (@ghasi/app-mobile-consumer). It exists to answer a single, narrow architectural question:
Where does anonymous traffic land, get composed, get cached, get rate-limited, and get handed off into a tenant booking flow — without ever leaking anything more than published, non-PII, cross-tenant search-projection fields?
Three properties make this BFF necessary and irreducible:
- Anonymity boundary. The consumer surface is the only Melmastoon entry point that does not assume a tenant context, does not assume a JWT, and does not speak to authenticated services. Folding it into another BFF would either pollute the anonymous shape with staff fields or pollute staff/booking shapes with cross-tenant facets.
- Cross-tenant composition. Listings, map pins, and brand peeks span tenants by design. The platform forbids cross-tenant reads anywhere except search-aggregation-service and a few elevated paths (02 §6.3). This BFF is the only consumer-facing surface allowed to fan out across that projection.
- Cache and stampede economics. Marketing campaigns, SEO surges, and viral traffic create a 10× spike profile that would melt internal services if proxied raw. This BFF concentrates cache, single-flight, and bot mitigation in a single tier so internal services see steady-state load.
This service owns no domain state, performs no domain mutations, and emits no domain events. Its only writes are session blobs in Memorystore, telemetry rows in its tiny Postgres outbox, anonymous wishlist mirrors, signed-handoff replay-protection records, and bot-score logs.
2. Bounded context
Context name: BFF · Meta / Discovery
Domain class: Supporting (the differentiator is not in being a BFF; it is in the cache discipline, bot mitigation, signed-handoff trust boundary, and the conversion-funnel telemetry shape)
Ubiquitous language: GuestSession, SearchSession, RecentlyViewed, Wishlist, BookingHandoff, MetaPageView, ConversionFunnelEvent, LocalePreference, CurrencyPreference, BrandPeek, ListingCardVM (view-model), HotelDetailVM, MapPinVM, FacetCatalog, HandoffToken, BotScore, StampedeLock, RatePreviewSnapshot.
What is in:
- View-model composition for /search, /search/map, /hotels/{id}, /hotels/{id}/availability, /wishlist, /handoff, /session.
- Anonymous GuestSession lifecycle, capped recently-viewed and wishlist, locale and currency preference state.
- HMAC-signed BookingHandoff token mint + replay-protection ledger.
- Conversion-funnel telemetry emission via the per-service outbox (no domain event emission).
- Memorystore-backed cache, single-flight (stampede) protection, Cloud-CDN cache-control headers.
- Bot detection (UA, fingerprint, cadence, behavioural) and CAPTCHA hand-off.
- Rate limiting (per-IP, per-cookie, per-fingerprint) integrated with Cloud Armor at the edge.
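The rate-limiting item above names three keys (IP, cookie, fingerprint), and the responsibilities section describes the mechanism as a token bucket per key. The following is a minimal in-process sketch of that idea, not the service's implementation: the class names, the shared capacity and refill rate, and the rule that a denied request spends no tokens are all assumptions made for illustration. A real deployment would keep bucket state in Memorystore rather than in a Python dict.

```python
class TokenBucket:
    """Token bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.updated_at = now

    def refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now


class RequestLimiter:
    """Admits a request only if the IP, cookie, and fingerprint buckets all hold a token."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity  # hypothetical: same limits for all three keys
        self.refill_per_sec = refill_per_sec
        self._buckets = {}

    def allow(self, now: float, ip: str, cookie: str, fingerprint: str) -> bool:
        buckets = []
        for key in (("ip", ip), ("cookie", cookie), ("fp", fingerprint)):
            bucket = self._buckets.get(key)
            if bucket is None:
                bucket = TokenBucket(self.capacity, self.refill_per_sec, now)
                self._buckets[key] = bucket
            bucket.refill(now)
            buckets.append(bucket)
        if any(bucket.tokens < 1.0 for bucket in buckets):
            return False  # assumption: a denied request spends no tokens
        for bucket in buckets:
            bucket.tokens -= 1.0
        return True
```

Passing `now` explicitly keeps the sketch deterministic under test; production code would read a monotonic clock instead.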
What is out:
- Cross-tenant reads from authoritative services (reservation-service, inventory-service, pricing-service write side, billing-service). Forbidden: we read only from the search-aggregation-service projection, plus narrow read-only pricing-service /quotes/preview and property-service /properties/{id} calls.
- Authenticated user state. Phase 1 is anonymous-only. Phase 2 may upgrade to iam-service consumer accounts; that surface and its state will be added behind a feature flag, not woven into the anonymous path.
- The booking flow. When the guest hits "Book", we mint a handoff and redirect to bff-tenant-booking-service. We never compose room/rate/payment screens.
- The tenant brand definition. We display a brand peek; the source of truth lives in theme-config-service.
- Email / SMS / push. Anonymous consumers receive no transactional notifications from this BFF.
3. Aggregates owned
| Aggregate | Cardinality | Purpose | Identity prefix | Storage |
|---|---|---|---|---|
| GuestSession | 1 per gms_ cookie | Anonymous session blob (locale, currency, recently viewed, wishlist refs) | gms_ | Memorystore (Redis) |
| SearchSession | 1 per active query | Active query + filter context | srs_ | Memorystore (Redis), TTL 1 h |
| Wishlist (mirror) | 1 per session | Anonymous wishlist mirror | wsh_ | Postgres wishlist_anonymous |
| BookingHandoff | short-lived | Signed handoff record, single-use | bhd_ | Postgres handoff_replay_log (TTL 30 min) |
| MetaPageView | append-only | Page-view ledger | mpv_ | Postgres outbox (analytics outbox) |
| ConversionFunnelEvent | append-only | Funnel-step ledger | cfe_ | Postgres outbox (analytics outbox) |
| BotScore | append-only | Bot-detector verdicts | (composite) | Postgres bot_score_log, 7-day TTL |
| BrandPeek (cache) | per tenant | Logo + primary color + slug | (composite) | Memorystore (Redis), TTL 15 min |
GuestSession.tenantId is always null — this BFF is cross-tenant by design. tenantId is non-null only on a transient BookingHandoff row (carries the target tenant) and on a BrandPeek row (cache key includes tenantId).
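To make the identity prefixes in the table concrete, here is a small sketch of minting a prefixed identifier. Only the gms_<ulid> shape is stated in this document; applying the same ULID body to the other prefixes is an assumption, as are the function names and the uppercase Crockford Base32 casing. The encoder follows the public ULID layout (48-bit millisecond timestamp followed by 80 random bits, 26 characters).

```python
import os
import time

CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Prefixes copied from the aggregates table; aggregates with composite keys are omitted.
ID_PREFIXES = {
    "GuestSession": "gms",
    "SearchSession": "srs",
    "Wishlist": "wsh",
    "BookingHandoff": "bhd",
    "MetaPageView": "mpv",
    "ConversionFunnelEvent": "cfe",
}


def new_ulid(now_ms=None, randomness=None) -> str:
    """Encode a 48-bit timestamp plus 80 random bits as 26 Crockford Base32 characters."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    rnd = os.urandom(10) if randomness is None else randomness
    if len(rnd) != 10 or not 0 <= ts < 2 ** 48:
        raise ValueError("timestamp must fit in 48 bits and randomness must be 10 bytes")
    value = (ts << 80) | int.from_bytes(rnd, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD32[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))


def mint_id(aggregate: str, now_ms=None, randomness=None) -> str:
    """Hypothetical helper: '<prefix>_<ulid>' for one of the owned aggregates."""
    return f"{ID_PREFIXES[aggregate]}_{new_ulid(now_ms, randomness)}"
```

Because the timestamp occupies the most significant bits, identifiers minted later sort after earlier ones, which is convenient for append-only ledgers such as MetaPageView.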
4. Responsibilities (numbered)
1. Anonymous session bootstrap. On first request, mint a gms_<ulid> cookie (HttpOnly, Secure, SameSite=Lax, 30-day TTL), persist GuestSession in Memorystore, emit melmastoon.bff.consumer.session.started.v1.
2. Search composition. Translate (geo, dates, occupancy, filters, sortKey, locale, currency) into a search-aggregation-service query. Enrich the top-N results with cheapest-rate snapshots from pricing-service's read-only preview endpoint. Enrich brand peek (logo + primary color) from theme-config-service. Return a flat list-view model.
3. Map composition. Same query in bounding-box mode. Returns up to 250 lightweight pins; only the cursor-targeted pin carries a rate snapshot.
4. Hotel detail composition. Parallel fanout to property-service (rooms/amenities/photos/policies), search-aggregation-service (popularity + review summary), pricing-service (cheapest rate + 7-day calendar preview), theme-config-service (brand peek). Compose, cache, return.
5. Light availability. /hotels/{id}/availability?from&to reads the lightweight projection from search-aggregation-service (already pre-aggregated for the meta layer). The BFF never queries inventory-service directly.
6. Wishlist management. Cookie-keyed add/remove/list, capped at 100 entries, stored in Memorystore session blob and mirrored to wishlist_anonymous (Postgres) so a future authenticated upgrade can merge.
7. Handoff minting. HMAC-sign a BookingHandoff token containing target tenantId, propertyId, dates, occupancy, locale, currency, sourceCampaign, expiresAt; record in handoff_replay_log; return the redirect URL targeting https://{tenantSlug}.melmastoon.ghasi.io/book?h=<token>.
8. Telemetry emission. Every funnel step (session.started, search.executed, click.recorded, handoff.initiated, wishlist.added/removed, locale.changed, currency.changed, bot_suspected) is appended to the analytics outbox and drained to Pub/Sub. Sampling per EVENT_SCHEMAS.md.
9. Bot mitigation. UA pattern matching, fingerprint hashing, request-cadence buckets (token bucket per cookie + IP + fingerprint), suspicious-pattern CAPTCHA challenge via Cloud reCAPTCHA Enterprise, soft-deny with MELMASTOON.BFF.CONSUMER.SUSPECTED_BOT.
10. Locale + currency propagation. Read Accept-Language and X-Currency headers (or session preference); propagate to search-aggregation-service and pricing-service queries; never override an explicit user preference.
11. Cache + stampede control. Memorystore-keyed cache for search results (TTL 60 s), hotel detail (TTL 5 min), brand peek (TTL 15 min), facet catalog (TTL 1 h). Single-flight lock via Redis SET NX EX per cache key with 5 s lock TTL; followers wait up to 4 s for the leader to populate, then fall back to direct fetch with circuit-breaker awareness.
12. Cache invalidation. Subscribe to melmastoon.theme.published.v1, melmastoon.search_aggregation.listing.indexed.v1, melmastoon.tenant.suspended.v1. Invalidate the corresponding Memorystore keys.
13. Edge CDN integration. For stable /facets and /hotels/{id} responses, set Cache-Control: public, max-age=15, s-maxage=300, stale-while-revalidate=60 and Vary: Accept-Language, X-Currency. Cloud CDN absorbs ~90% of cold reads.
14. Cross-region degradation. When a property's region is far from the request region, mark the listing card with crossRegionDelivery: true so the client can show an "international booking" badge; this is purely a presentation hint, not an authorization decision.
15. Marketing-campaign mode. bff.consumer.campaign_mode.v1 toggle (configurable via Firebase Remote Config) raises /search cache TTL to 5 min and lowers /handoff rate-limit slightly to absorb the spike.
16. Tenant suspension awareness. When melmastoon.tenant.suspended.v1 fires, the BFF immediately drops the tenant from search results (cache-bust + soft-block list), and /handoff to a suspended tenant returns MELMASTOON.BFF.CONSUMER.TENANT_SUSPENDED.
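The cache + stampede control item above describes a leader/follower protocol: one caller takes a 5 s lock with SET NX EX and populates the cache, while followers poll for up to 4 s before falling back to a direct fetch. A minimal sketch of that flow follows. It is an illustration under stated assumptions: InMemoryStore is a hypothetical stand-in for Redis so the example runs without a server, the "lock:" key prefix and 50 ms poll interval are invented, and the circuit-breaker check on the fallback path is omitted.

```python
import time
import uuid

LOCK_TTL_S = 5       # lock TTL from the responsibilities list
FOLLOWER_WAIT_S = 4  # follower wait budget from the responsibilities list


class InMemoryStore:
    """Hypothetical stand-in for Redis supporting GET, SET with NX/EX, and DELETE."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at <= self._clock():
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex, nx=False):
        if nx and self.get(key) is not None:
            return False  # NX: refuse to overwrite a live key
        self._data[key] = (value, self._clock() + ex)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def get_or_load(store, key, ttl_s, loader, sleep=time.sleep,
                clock=time.monotonic, poll_s=0.05):
    cached = store.get(key)
    if cached is not None:
        return cached
    lock_key = f"lock:{key}"      # assumed key layout
    lock_token = uuid.uuid4().hex
    if store.set(lock_key, lock_token, ex=LOCK_TTL_S, nx=True):
        try:                      # leader: fetch once and populate the cache
            value = loader()
            store.set(key, value, ex=ttl_s)
            return value
        finally:
            # Release only our own lock; with real Redis this check-and-delete
            # would need to be atomic (for example a small Lua script).
            if store.get(lock_key) == lock_token:
                store.delete(lock_key)
    deadline = clock() + FOLLOWER_WAIT_S
    while clock() < deadline:     # follower: wait for the leader's result
        sleep(poll_s)
        cached = store.get(key)
        if cached is not None:
            return cached
    return loader()               # fallback: direct fetch (circuit breaker omitted)
```

The injected `sleep` and `clock` parameters exist only so the follower path can be exercised deterministically; they are not part of the described design.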
5. Upstream / downstream context map
                        ┌────────────────────────────────────┐
                        │ search-aggregation-service         │
                        │ (cross-tenant projection)          │
                        │ ranked listings, map pins,         │
                        │ light availability, facets         │
                        └─────────────┬──────────────────────┘
                                      │ REST (read)
┌───────────────────┐                 │
│ pricing-service   │ ────────────────┤ (read-only /quotes/preview, no write)
│ (read-only quote  │                 │
│  preview endpoint)│                 │
└───────────────────┘                 │
                                      │
┌───────────────────┐                 │
│ property-service  │ ────────────────┤ (hotel detail, rooms, amenities, media URLs)
└───────────────────┘                 │
                                      │
┌───────────────────┐                 │
│ theme-config-svc  │ ────────────────┤ (BrandPeek: logo URL + primary color)
└───────────────────┘                 │
                                      ▼
                   ┌─────────────────────────────────────┐
                   │        bff-consumer-service         │
                   │ ┌──────────────┐  ┌───────────────┐ │
                   │ │ Composition  │→ │ Memorystore   │ │
                   │ │ orchestrator │  │ cache + flight│ │
                   │ └──────┬───────┘  └───────────────┘ │
                   │        │                            │
                   │        ▼                            │
                   │ ┌──────────────┐  ┌───────────────┐ │
                   │ │ Handoff      │→ │ Postgres      │ │
                   │ │ minter (HMAC)│  │ (outbox +     │ │
                   │ └──────────────┘  │  replay log)  │ │
                   │                   └─────┬─────────┘ │
                   │                         │           │
                   │ ┌──────────────┐        │           │
                   │ │ Bot detector │        │           │
                   │ └──────────────┘        ▼           │
                   │                  ┌─────────────┐    │
                   │                  │ Pub/Sub     │    │
                   │                  │ (telemetry) │    │
                   │                  └──────┬──────┘    │
                   └─────────────────────────────────────┘
                                             │
                                             ▼
                       ┌──────────────────────────────────────────┐
                       │ analytics-service · audit-service        │
                       │ bff-tenant-booking-service               │
                       │ (consumes signed handoff token at        │
                       │  /bff/tenant-booking/v1/bootstrap?h=)    │
                       └──────────────────────────────────────────┘
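The context map shows four upstream services feeding the composition orchestrator, and the hotel detail responsibility describes that call as a parallel fanout. The sketch below shows one way such a fanout could look with asyncio. The function name, the per-call timeout, and the choice to return a partial view-model with a "degraded" list when an upstream fails are assumptions for illustration; this document does not specify how partial failures are handled.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Fetcher = Callable[[str], Awaitable[Any]]


async def compose_hotel_detail(hotel_id: str,
                               fetchers: Dict[str, Fetcher],
                               timeout_s: float = 0.4) -> Dict[str, Any]:
    """Call every upstream concurrently and merge the results into one view-model."""
    names = list(fetchers)
    results = await asyncio.gather(
        *(asyncio.wait_for(fetchers[name](hotel_id), timeout_s) for name in names),
        return_exceptions=True,  # collect failures instead of aborting the whole fanout
    )
    view_model: Dict[str, Any] = {"hotelId": hotel_id, "degraded": []}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            # Assumption: a failed or slow upstream yields an empty section, not an error page.
            view_model[name] = None
            view_model["degraded"].append(name)
        else:
            view_model[name] = result
    return view_model
```

Total latency is bounded by the slowest upstream (or the timeout) rather than the sum of all four calls, which is what makes a sub-second p95 on the detail endpoint plausible.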
6. Key decisions
| Decision | Rationale | Alternatives considered |
|---|---|---|
| No own domain database | The BFF is a stateless composition tier; persisting domain state would duplicate truth and require sync | A read-side DB was considered; rejected because Memorystore + analytics outbox is enough |
| Memorystore (Redis) for sessions | 30-day session TTL, sub-ms reads, native single-flight via SET NX EX | Firestore (rejected: too slow for hot-path); in-process LRU (rejected: doesn't survive Cloud Run scale-to-zero) |
| Postgres for analytics outbox + handoff log | Need a transactional outbox for durable at-least-once delivery to Pub/Sub (the outbox pattern alone cannot give exactly-once, so consumers should deduplicate); need a small relational store for replay protection and bot-score logs | Pub/Sub direct (rejected: cannot guarantee outbox semantics); Firestore outbox (rejected: less mature ordering) |
| HMAC-signed handoff token | Cryptographically tamper-evident; stateless verification on the receiving BFF; secret rotation supported | JWT (rejected: too heavy for one-shot redirect; HMAC-SHA256 envelope is ~120 bytes); server-side opaque token + DB lookup (rejected: extra round trip on the receiving side) |
| Cross-tenant data only via search-aggregation-service | Enforces the platform's only legitimate cross-tenant read path | Direct fanout to property-service per tenant (rejected: violates tenancy model and creates N+1 latency) |
| Cloud CDN in front of Cloud Run | Absorbs the 10× campaign-spike profile | App-tier cache only (rejected: doesn't scale at the edge) |
| Telemetry events under melmastoon.bff.consumer.* | Distinguishes BFF telemetry from domain events at subject-prefix level so analytics consumers can filter cleanly | Re-using melmastoon.analytics.* (rejected: muddles event ownership) |
| Anonymous-only in Phase 1 | Reduces auth scope; aligns with meta-layer use case | Authenticated consumer accounts in Phase 1 (rejected: deferred to Phase 2 to keep MVP small) |
| Postgres tenant_id is null on every owned row except transient BookingHandoff rows | Cross-tenant by design; RLS not applicable | Per-tenant partitioning (rejected: meaningless at this layer) |
| Idempotency on /handoff | Same Idempotency-Key returns same signed token; protects against double-mint on retry | Free-form mint per request (rejected: causes replay-log bloat) |
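The HMAC-signed handoff decision can be illustrated with a short mint/verify pair. This is a sketch only: the envelope layout used here (base64url JSON payload, a dot, base64url HMAC-SHA256 signature) is an assumption and is likely larger than the ~120-byte envelope mentioned in the table, and the function names are hypothetical. The claim names come from the handoff minting responsibility. Secret rotation, the replay-ledger check, and Idempotency-Key handling are deliberately left out.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url_encode stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_handoff_token(claims: dict, secret: bytes) -> str:
    """Serialize the claims deterministically and append an HMAC-SHA256 signature."""
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64url_encode(payload)}.{_b64url_encode(signature)}"


def verify_handoff_token(token: str, secret: bytes, now_epoch_s: int) -> dict:
    """Return the claims if the signature matches and the token has not expired."""
    try:
        payload_part, signature_part = token.split(".")
        payload = _b64url_decode(payload_part)
        signature = _b64url_decode(signature_part)
    except ValueError as exc:
        raise ValueError("malformed handoff token") from exc
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        raise ValueError("bad signature")
    claims = json.loads(payload)
    if claims["expiresAt"] <= now_epoch_s:
        raise ValueError("handoff token expired")
    return claims
```

Verification needs only the shared secret, which matches the table's rationale of stateless verification on the receiving BFF. Single-use enforcement would still require a lookup against the replay log.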
7. Service Level Objectives (SLOs)
| SLI | SLO | Measurement |
|---|---|---|
| /search p95 latency (warm) | < 600 ms | Cloud Trace + Cloud Monitoring; per-region |
| /search p99 latency (warm) | < 1100 ms | same |
| /hotels/{id} p95 latency | < 500 ms | same |
| /handoff p99 latency | < 250 ms | same |
| Availability (rolling 28 d) | 99.9% | Successful 2xx + 304 over total 2xx/3xx/4xx/5xx (excludes 401/403 client misuse) |
| Cache hit ratio /search | ≥ 60% | Memorystore stats |
| Bot false-positive rate | < 0.5% | Sampled human-review of bot_suspected.v1 |
| Telemetry event delivery | 99.99% within 60 s | Outbox lag + Pub/Sub ack latency |
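As a worked example of the availability row, the sketch below computes the SLI from per-status request counts and converts the 99.9% target into an error budget. It reads the measurement column literally (2xx plus 304 count as good; 401 and 403 are dropped from the denominator), and the function names and sample counts are hypothetical.

```python
def availability(status_counts: dict) -> float:
    """Good = 2xx + 304; total = all 2xx-5xx responses except 401 and 403."""
    good = sum(n for status, n in status_counts.items()
               if 200 <= status <= 299 or status == 304)
    total = sum(n for status, n in status_counts.items()
                if 200 <= status <= 599 and status not in (401, 403))
    return good / total if total else 1.0


def error_budget_minutes(slo: float, window_days: int) -> float:
    """Minutes of full outage the SLO tolerates over the rolling window."""
    return (1.0 - slo) * window_days * 24 * 60
```

For the stated target, a 28-day window contains 28 × 24 × 60 = 40,320 minutes, so a 99.9% SLO leaves a budget of roughly 40.3 minutes of complete unavailability.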
8. Capacity sizing (Phase 1)
| Resource | Steady state | Campaign peak (10×) |
|---|---|---|
| Cloud Run instances | 3 (min) | 30 (max) |
| Memorystore (Redis) | 1 GB working set | 4 GB |
| Postgres (Cloud SQL shared) | 50 IOPS, 1 vCPU | 300 IOPS, 2 vCPU |
| Pub/Sub topic publish QPS | 50 / s | 500 / s |
| Cloud CDN cache hit ratio | 75% | 90% (longer TTLs in campaign mode) |
| Egress bandwidth | 50 MB/s | 500 MB/s |
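The campaign-peak column relies on longer TTLs in campaign mode. A small sketch of how the TTL values listed in the responsibilities section could be expressed as configuration follows. The dictionary keys and function names are hypothetical; the numbers are the ones stated in this document (search 60 s, hotel detail 5 min, brand peek 15 min, facet catalog 1 h, with /search raised to 5 min in campaign mode), along with the Cache-Control and Vary values given for CDN-cacheable responses.

```python
# Memorystore TTLs in seconds, as listed under cache + stampede control.
BASE_TTL_S = {
    "search": 60,
    "hotel_detail": 300,
    "brand_peek": 900,
    "facet_catalog": 3600,
}

# Campaign mode raises only the /search TTL, per the marketing-campaign item.
CAMPAIGN_TTL_S = {"search": 300}


def cache_ttl_seconds(resource: str, campaign_mode: bool = False) -> int:
    """Return the cache TTL for a resource, applying the campaign override if active."""
    if campaign_mode and resource in CAMPAIGN_TTL_S:
        return CAMPAIGN_TTL_S[resource]
    return BASE_TTL_S[resource]


def cdn_headers() -> dict:
    """Response headers for stable /facets and /hotels/{id} responses."""
    return {
        "Cache-Control": "public, max-age=15, s-maxage=300, stale-while-revalidate=60",
        "Vary": "Accept-Language, X-Currency",
    }
```

Keeping the override in a separate table means toggling campaign mode never mutates the steady-state values.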
9. What success looks like
The BFF is succeeding when: (a) marketing-campaign spikes do not cause measurable load on search-aggregation-service or property-service; (b) the funnel-conversion dashboard in analytics-service is stable enough to drive product decisions; (c) zero handoff-token tampering incidents are detected; (d) zero PII leaks across tenants are detected; (e) tenant_id never appears on a row owned by this service except on BookingHandoff and BrandPeek cache keys.
The BFF is failing when: (a) consumers see stale prices > 60 s past the pricing-service ground truth; (b) bot traffic creates measurable cost amplification on internal services; (c) handoff tokens replay successfully on the receiving BFF; (d) telemetry-event lag exceeds 5 minutes; (e) the cache-hit ratio drops below 40% during normal load.