
search-aggregation-service

Bounded Context: Discovery (Core) · Owner: Discovery · Phase: 1 · Storage: Cloud SQL Postgres 15 + PostGIS (canonical projection) + OpenSearch 2.x (full-text + geo + facet read index) + Memorystore Redis 7 (hot result cache) + pgvector (semantic ranking, Phase 2+) · Bundle: services/search-aggregation-service/

search-aggregation-service owns the cross-tenant materialized search index that powers the consumer meta-search layer (Trivago/Booking-like): a guest searches a city + dates + occupancy and sees hotels from many tenants ranked by price, distance, popularity, and rating. It maintains a hybrid Postgres + OpenSearch projection of Property (denormalized), per-night Inventory availability hints (per-date counts, no per-room PII), and RatePlan cheapest-rate snapshots per (property, date, currency). It is the only service in the platform authorized to query across tenants, and only over the explicit subset of fields marked cross_tenant_searchable: true in DATA_MODEL §3. PII, payment data, key credentials, lock device serials, financial ledgers, and any non-published property are never indexed.

It is strictly read-only on the cross-tenant data model: it consumes events from upstream services (property-service, pricing-service, inventory-service, tenant-service) and projects them into its own write-side. It never calls upstream services on the read path. Booking is never completed here — selecting a hotel deep-links the consumer into the matching tenant's booking surface served by bff-tenant-booking-service.


Purpose

  • Sub-second consumer meta-search across all published properties of all tenants.
  • Hybrid index: Postgres for ACID source-of-truth projection; OpenSearch for full-text + geo + facets; Redis for hot result caches; pgvector for semantic re-ranking (Phase 2+).
  • Per-locale full-text matching (Pashto, Dari, Persian, Tajik, English, Arabic, Urdu, Russian) using language-aware analyzers.
  • Per-region surfacing — first launch restricts results to AF / TJ / IR markets via region pinning; expand globally as tenant.region opens up.
  • Boost rules and sponsored placement (Phase 3+) via BoostRule and SponsoredRanking aggregates.
  • Click-stream + query analytics signals fed back to analytics-service for ranking iteration.

Key responsibilities

  • Event-driven projection refresh from property-service, pricing-service, inventory-service, tenant-service. Last-write-wins on occurredAt with vector-clock tie-break.
  • OpenSearch index lifecycle: index templates, hot/warm rollover, ILM, reindex on mapping change, blue/green index swap with alias melmastoon-search-current.
  • Search Query API for bff-consumer-service: text + filters (dates, occupancy, price band, amenities, star rating) + sort (price asc, distance, popularity, rating) + pagination (cursor) + map-friendly bbox queries.
  • Hotel detail aggregate: a denormalized read model that returns the full property card (multi-locale name/description, hero photo, amenities, cheapest rate snapshot, available rooms hint) in a single call.
  • Cache layer (Memorystore) with key patterns srh:q:<sha256(canonical-query)> (60 s TTL) and srh:detail:<propertyId>:<currency>:<dateRange> (300 s TTL).
  • Currency-aware rate display using FX-snapshots published by pricing-service (fx_snapshot.updated.v1).
  • Tenant purge on tenant.deleted.v1 — full cascade purge from Postgres, OpenSearch, and cache.
  • Index rebuild from event replay: POST /api/v1/search/index:rebuild triggers a side-by-side reindex and atomic alias swap.
  • Analytics signals: melmastoon.search.query.executed.v1 (sampled at 1 % of anonymous and 100 % of authenticated queries) and melmastoon.search.click.recorded.v1.
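
The cache key patterns listed above can be sketched as follows. This is a minimal illustration, not the service's implementation: the canonicalization rule (JSON with sorted keys), the query field names, and the date-range encoding in the detail key are all assumptions; only the srh:q: and srh:detail: prefixes and the TTLs come from the text.

```python
import hashlib
import json

QUERY_TTL_SECONDS = 60     # srh:q:* TTL from the text
DETAIL_TTL_SECONDS = 300   # srh:detail:* TTL from the text


def canonical_query(query: dict) -> str:
    """Serialize deterministically (assumed rule): sorted keys, no extra whitespace."""
    return json.dumps(query, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def query_cache_key(query: dict) -> str:
    """Build srh:q:<sha256(canonical-query)>."""
    digest = hashlib.sha256(canonical_query(query).encode("utf-8")).hexdigest()
    return f"srh:q:{digest}"


def detail_cache_key(property_id: str, currency: str, check_in: str, check_out: str) -> str:
    """Build srh:detail:<propertyId>:<currency>:<dateRange>.

    The "check_in..check_out" encoding of <dateRange> is hypothetical.
    """
    return f"srh:detail:{property_id}:{currency}:{check_in}..{check_out}"
```

Two queries that differ only in key order map to the same key under this rule. List-valued filters (for example amenities) would also need a stable order before hashing, which this sketch does not enforce.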

Owned aggregates (high-level)

  • HotelIndexEntry (denormalized property card, one per published property)
  • RateSnapshot (cheapest-rate per (propertyId, date, currency))
  • AvailabilityHint (per-date roomsAvailable count, never per-room IDs)
  • AmenityIndex (canonical amenity → property bitmap)
  • LocationIndex (PostGIS geometry + geohash)
  • SearchQuery (logged for analytics, anonymized after 30 d)
  • ClickEvent
  • BoostRule (operator-configurable, Phase 3+)
  • SponsoredRanking (Phase 3+)
  • IndexBuild (control aggregate for rebuilds)

Detailed model: services/search-aggregation-service/DOMAIN_MODEL.md.
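
To make the RateSnapshot key concrete, here is a small sketch of folding candidate rates into one cheapest snapshot per (propertyId, date, currency). The dataclass fields and the input tuple shape are assumptions for illustration; the authoritative model is the one in DOMAIN_MODEL.md.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class RateSnapshot:
    # Field names are hypothetical; only the (propertyId, date, currency) key is from the text.
    property_id: str
    stay_date: date
    currency: str
    cheapest_amount: Decimal


def cheapest_snapshots(rates):
    """Keep the lowest amount per (property_id, stay_date, currency).

    rates: iterable of (property_id, stay_date, currency, amount) tuples.
    """
    best = {}
    for property_id, stay_date, currency, amount in rates:
        key = (property_id, stay_date, currency)
        if key not in best or amount < best[key].cheapest_amount:
            best[key] = RateSnapshot(property_id, stay_date, currency, amount)
    return best
```

Amounts in different currencies are never compared with each other here, which matches the per-currency key.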


Public APIs (selection)

POST /api/v1/search/queries # execute a meta-search query
GET /api/v1/search/hotels/{propertyId} # hotel detail aggregate (cross-tenant safe projection)
GET /api/v1/search/suggest # autocomplete (city, hotel name)
GET /api/v1/search/facets # available facets for current query
POST /api/v1/search/clicks # record a click-through to tenant booking
POST /api/v1/search/boost-rules # operator: create boost rule (admin)
POST /api/v1/search/boost-rules/{id}:activate
POST /api/v1/search/index:rebuild # admin: full reindex from event archive
GET /api/v1/search/index/health # index alias, doc count, freshness
GET /internal/projection/changes # internal: change stream consumed by analytics fan-out

Full contracts in services/search-aggregation-service/API_CONTRACTS.md.
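
The rebuild endpoint is described as a side-by-side reindex followed by an atomic alias swap. In OpenSearch, one way to get that atomicity is a single _aliases request that removes the alias from the old index and adds it to the new one. The sketch below only builds the request body; the versioned index names in the usage note are hypothetical, and whether the service performs the swap exactly this way is an assumption.

```python
ALIAS = "melmastoon-search-current"  # alias name from the text


def alias_swap_body(old_index: str, new_index: str, alias: str = ALIAS) -> dict:
    """Body for POST /_aliases: both actions are applied in one atomic operation."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

For example, alias_swap_body("melmastoon-search-v7", "melmastoon-search-v8") would move readers from the old build to the new one without a window in which the alias resolves to nothing.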


Top events published

melmastoon.search.query.executed.v1 (sampled, regulated retention)
melmastoon.search.click.recorded.v1
melmastoon.search.projection.updated.v1
melmastoon.search.projection.failed.v1
melmastoon.search.boost_rule.created.v1
melmastoon.search.boost_rule.activated.v1
melmastoon.search.index.rebuilt.v1
melmastoon.search.index.health_alert.v1
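
The sampling applied to melmastoon.search.query.executed.v1 (1 % of anonymous queries, 100 % of authenticated ones) could be made deterministic by hashing a query identifier into buckets. This is a sketch under assumptions: the query_id parameter and the hash-bucket approach are illustrative and not taken from the service.

```python
import hashlib

# Rates expressed in basis points (1/10,000) to avoid float rounding.
ANONYMOUS_SAMPLE_BP = 100        # 1 %
AUTHENTICATED_SAMPLE_BP = 10_000  # 100 %


def should_emit_query_event(query_id: str, authenticated: bool) -> bool:
    """Decide whether to publish the query-executed event for this query."""
    rate_bp = AUTHENTICATED_SAMPLE_BP if authenticated else ANONYMOUS_SAMPLE_BP
    digest = hashlib.sha256(query_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < rate_bp
```

Hashing rather than calling a random generator means a retried query gets the same decision, so a retry cannot turn an unsampled query into a sampled one.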

Top events consumed

  • melmastoon.property.created.v1, .updated.v1, .published.v1, .unpublished.v1, .deleted.v1
  • melmastoon.property.room_type.updated.v1
  • melmastoon.property.amenity_set.updated.v1
  • melmastoon.property.photo.added.v1, melmastoon.property.photo.removed.v1
  • melmastoon.pricing.rate_plan.updated.v1, melmastoon.pricing.rate_plan.published.v1
  • melmastoon.pricing.fx_snapshot.updated.v1
  • melmastoon.inventory.allocation.confirmed.v1, .released.v1
  • melmastoon.inventory.block.created.v1, .released.v1
  • melmastoon.tenant.deleted.v1 (cascade purge)
  • melmastoon.tenant.region_changed.v1 (recompute region pinning)
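
Because these events can arrive out of order, the projector resolves conflicts last-write-wins on occurredAt with a vector-clock tie-break (see Key responsibilities). A minimal sketch of that decision follows. The Version shape, the per-node counter representation, and the final fallback on event id for concurrent clocks are assumptions; the text does not say how concurrent clocks are resolved.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Version:
    occurred_at: datetime
    clock: dict = field(default_factory=dict)  # node id -> counter (assumed shape)
    event_id: str = ""                         # hypothetical last-resort tie-break


def dominates(a: dict, b: dict) -> bool:
    """True if clock a is >= b on every node and strictly greater on at least one."""
    nodes = set(a) | set(b)
    return (all(a.get(n, 0) >= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) > b.get(n, 0) for n in nodes))


def should_apply(incoming: Version, stored: Optional[Version]) -> bool:
    """Return True if the incoming event should overwrite the stored projection."""
    if stored is None:
        return True
    if incoming.occurred_at != stored.occurred_at:
        return incoming.occurred_at > stored.occurred_at
    if dominates(incoming.clock, stored.clock):
        return True
    if dominates(stored.clock, incoming.clock):
        return False
    # Concurrent or identical clocks: assumed deterministic fallback.
    return incoming.event_id > stored.event_id
```

A redelivered event (same timestamp, clock, and id) is rejected, which keeps the projection idempotent under at-least-once delivery.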

Upstream / downstream

  • Upstream: property-service (catalog truth), pricing-service (rate snapshots + FX), inventory-service (per-date hints), tenant-service (residency + deletion), file-storage-service (signed URL refs for hero photos).
  • Downstream: bff-consumer-service (query + detail + suggest + clicks); analytics-service (query/click stream); ai-orchestrator-service (Phase 2+, semantic re-ranking embeddings).

Cross-tenant posture (the one place it matters)

  • Tenant isolation is inverted here: the read model is intentionally cross-tenant. Defense moves to field-level allow-listing at projection time. The projector reads upstream events for any tenant but writes only fields with cross_tenant_searchable: true. PII fields, financial ledgers, lock secrets, key credentials, and unpublished properties are never persisted in the index — the projector schema literally has no column for them.
  • Outgoing API responses are public-safe by construction. There is no "internal" search API; bff-consumer-service is the only caller.
  • A nightly cross-tenant exposure auditor scans the projection for any column that should not be there (regex against forbidden field names) and any document referencing an unpublished or deleted property; results post to security on-call.
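
The two defenses above (allow-listing at projection time, then a nightly name-based audit) can be sketched as follows. Both the allow-list contents and the forbidden-name pattern are hypothetical placeholders; the real allow-list is whatever DATA_MODEL §3 marks cross_tenant_searchable: true, and the auditor's actual regex is not given in the text.

```python
import re

# Hypothetical allow-list standing in for the cross_tenant_searchable: true fields.
SEARCHABLE_FIELDS = frozenset({"propertyId", "name", "city", "starRating", "amenities", "geo"})

# Hypothetical forbidden-name pattern for the nightly exposure auditor.
FORBIDDEN_NAME = re.compile(
    r"(email|phone|passport|card_?number|iban|ledger|lock_?serial|key_?credential)",
    re.IGNORECASE,
)


def project(event_payload: dict) -> dict:
    """Keep only allow-listed fields; anything else is dropped rather than masked."""
    return {k: v for k, v in event_payload.items() if k in SEARCHABLE_FIELDS}


def audit_field_names(field_names) -> list:
    """Return the field names that match the forbidden pattern, sorted."""
    return sorted(name for name in field_names if FORBIDDEN_NAME.search(name))
```

The allow-list is the primary control; the auditor is a backstop for the case where a forbidden column is added to the projection schema by mistake.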

Non-functional requirements

  • Search query p95: 300 ms end-to-end (cache hit ≤ 30 ms, cache miss ≤ 350 ms with OpenSearch on warm shards).
  • Hotel detail p95: 120 ms.
  • Projection freshness p95: 5 s from upstream event to indexed document.
  • Availability: 99.95 % monthly (consumer-facing surface).
  • Pagination capped at 200 results per query (configurable per region); cursor TTL 5 min.
  • OpenSearch outage: fall back to Postgres-only search (degraded ranking, no fuzzy matching) within 1 s; emit index.health_alert.v1.
  • Cross-tenant leak: exactly 0 indexed documents may contain PII, financial, or unpublished-property fields. Audited nightly.
  • Cost cap: per-region monthly OpenSearch + Memorystore budget alarmed at 80 %.
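
The OpenSearch-outage requirement above can be illustrated with a timeout-guarded call that falls back to a Postgres-only search. This is a sketch under assumptions: the three callables are hypothetical stand-ins for the real clients, and treating "within 1 s" as a 1 s timeout on the OpenSearch call is one possible reading of the requirement.

```python
import asyncio

OPENSEARCH_BUDGET_SECONDS = 1.0  # assumed reading of "within 1 s"


async def search_with_fallback(query, opensearch_search, postgres_search,
                               emit_health_alert, budget=OPENSEARCH_BUDGET_SECONDS):
    """Try OpenSearch first; on timeout or connection failure, degrade to Postgres."""
    try:
        hits = await asyncio.wait_for(opensearch_search(query), timeout=budget)
        return {"hits": hits, "degraded": False}
    except (asyncio.TimeoutError, ConnectionError) as exc:
        # Stand-in for publishing index.health_alert.v1.
        await emit_health_alert(reason=type(exc).__name__)
        hits = await postgres_search(query)
        return {"hits": hits, "degraded": True}
```

Returning a degraded flag lets the caller signal reduced ranking quality (no fuzzy matching) instead of failing the request.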

Detailed bundle: services/search-aggregation-service/.