search-aggregation-service — SYNC_CONTRACT
Companion: SERVICE_OVERVIEW · APPLICATION_LOGIC · API_CONTRACTS · EVENT_SCHEMAS · DATA_MODEL · ../../docs/architecture/ADR-0003-electron-offline-first-desktop.md
1. Posture: NO Electron sync surface
search-aggregation-service is a cloud-only meta-search service. It is consumed by:
- `bff-consumer-service` (web/PWA, anonymous traffic)
- `bff-tenant-marketing-service` (server-side, optional)
- internal operator tooling for boost rules and index health
It is not consumed by any Electron desktop client. There is therefore no:
- offline-first replication ledger,
- LWW-with-vector-clock client diff,
- `melmastoon.<aggregate>.synced.v1` event,
- desktop-resident SQLite mirror,
- `sync_state` table or `change_log` table on this service.
The desktop frontoffice and backoffice clients (per ADR-0003) do not need cross-tenant search; they search within their own tenant via property-service and reservation-service. Cross-tenant search is exclusively a public, read-only, web capability.
This document exists for two reasons:
- To make the "no sync" decision explicit and auditable, so a future engineer doesn't accidentally introduce a stale local search index on the desktop client.
- To document the internal projection sync: how `search-aggregation-service` keeps its own Postgres+OpenSearch+Redis replicas of upstream data consistent, since this is a comparable convergence problem even though it has no Electron surface.
2. Internal projection convergence (the sync that does exist here)
The service maintains three replicas of an upstream truth:

    property-service / pricing-service / inventory-service / tenant-service
                                       │
                     Pub/Sub (per-aggregate ordering keys)
                                       │
                                       ▼
             ┌──────────────────────────────────────────┐
             │ search-aggregation-service application   │
             │ (consumers + ProjectionAllowListPolicy)  │
             └────────────┬─────────────────────────────┘
                          │ single transaction
             ┌────────────┴─────────────┐
             ▼                          ▼
    ┌─────────────────┐        ┌─────────────────┐
    │ Postgres `search│        │ Outbox row      │
    │ .hotel_index_   │        │ (projection.    │
    │ entries` + …    │        │ updated.v1)     │
    └────────┬────────┘        └────────┬────────┘
             │     outbox publisher     │
             ▼                          ▼
    ┌─────────────────┐        ┌─────────────────┐
    │ OpenSearch      │        │ Memorystore     │
    │ (mirror writer) │        │ cache invalid.  │
    └─────────────────┘        └─────────────────┘
2.1 Convergence guarantees
| Property | Guarantee | How |
|---|---|---|
| At-least-once consume | yes | Pub/Sub default + inbox dedup (event_id unique) |
| Per-aggregate ordering | yes | Pub/Sub ordering key = propertyId for property/pricing/inventory/projection topics |
| Out-of-order safety across producers | yes | Vector-clock guard: a slice is rejected as stale when the stored vc_<service> >= incoming.vc_<service> |
| Atomic local commit | yes | One DB tx for inbox.processed_at + projection write + outbox row |
| OpenSearch ↔ Postgres convergence | eventual, ≤ 2 s p95 | OpenSearch writer lags Postgres outbox by one publish cycle; a recovery job replays from outbox.published_at IS NULL plus a 5-minute scan-for-drift |
| Cache ↔ Postgres convergence | best-effort, immediate or ≤ 60 s TTL | Direct invalidation on projection.updated.v1; TTL fallback if invalidator drops |
| Total drift bound | < 30 s p95 (event → result) | SLO Freshness in SERVICE_OVERVIEW.md |
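The "inbox dedup" and "atomic local commit" rows can be illustrated with a minimal sketch. It uses an in-memory SQLite database as a stand-in for Postgres, and the table layouts, the `apply_event` helper, and the outbox payload shape are assumptions made for illustration, not the service's actual schema. The point it shows is that the inbox insert, the projection write, and the outbox row either all commit or all roll back, and that a redelivered `event_id` is rejected by the unique constraint before anything else is written.

```python
import json
import sqlite3

# Hypothetical, simplified tables; the real schema lives in DATA_MODEL.
SCHEMA = """
CREATE TABLE inbox (event_id TEXT PRIMARY KEY, processed_at TEXT NOT NULL);
CREATE TABLE hotel_index_entries (property_id TEXT PRIMARY KEY, payload TEXT NOT NULL);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    topic TEXT NOT NULL,
    payload TEXT NOT NULL,
    published_at TEXT
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def apply_event(conn: sqlite3.Connection, event_id: str, property_id: str, payload: dict) -> bool:
    """Apply one event in a single transaction.

    Returns False when the event_id was already processed (at-least-once redelivery).
    """
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute(
                "INSERT INTO inbox (event_id, processed_at) VALUES (?, datetime('now'))",
                (event_id,),
            )
            conn.execute(
                "INSERT OR REPLACE INTO hotel_index_entries (property_id, payload) VALUES (?, ?)",
                (property_id, json.dumps(payload)),
            )
            conn.execute(
                "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                ("projection.updated.v1", json.dumps({"propertyId": property_id})),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate event_id: the whole transaction was rolled back, nothing was written.
        return False
```

Because the outbox row is written in the same transaction, a publisher that scans for rows with `published_at IS NULL` can never observe a projection change without its matching event, or the reverse.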
2.2 Conflict resolution
Concurrent writes to a HotelIndexEntry happen when two upstream services emit events about the same propertyId simultaneously. Each consumer only writes its own slice:
| Slice | Owner consumer | Vector-clock column |
|---|---|---|
| Identity, geo, amenities, languages, hero, region, status | PropertyEventConsumer | vc_property_service |
| priceFromBaseMicro, freeCancellation, payAtProperty, RateSnapshot | PricingEventConsumer | vc_pricing_service |
| roomsAvailable, AvailabilityHint | InventoryEventConsumer | vc_inventory_service |
| popularityScore7d/28d, boostMultiplier, freshnessBoost, qualityScore | this service (commands + jobs) | none (internal fields) |
| Tenant cascade (status='suppressed' then delete) | TenantEventConsumer | n/a |
There is no field that is jointly written by two consumers. Conflicts are therefore reduced to "is this incoming event newer than what I last applied for my own slice?" — answered by the vector-clock column.
When the consumer sees incoming.vectorClock < stored.vc_<service>, it:
- logs `projection.skipped_stale`,
- records `inbox.result = 'dropped_stale'`,
- emits no `projection.updated.v1` event.
When incoming.vectorClock > stored.vc_<service>, the slice is overwritten and the vector clock advances.
When ==, the consumer treats it as duplicate; the inbox unique constraint on event_id already short-circuits this path.
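The three cases above can be sketched as a small decision function. This is an illustration under assumptions: it treats each `vc_<service>` value as a single monotonically increasing integer (which is what the `<`, `>`, `==` comparisons suggest), and the `Outcome` enum, the `apply_slice` helper, and the `'applied'` and `'duplicate'` result strings are hypothetical names. Only `'dropped_stale'` comes from the text.

```python
from enum import Enum


class Outcome(Enum):
    APPLIED = "applied"              # hypothetical label
    DROPPED_STALE = "dropped_stale"  # matches inbox.result in the text
    DUPLICATE = "duplicate"          # hypothetical label


def compare_clock(stored_vc: int, incoming_vc: int) -> Outcome:
    """Decide what to do with an incoming event for one slice."""
    if incoming_vc < stored_vc:
        return Outcome.DROPPED_STALE
    if incoming_vc > stored_vc:
        return Outcome.APPLIED
    return Outcome.DUPLICATE


def apply_slice(entry: dict, vc_column: str, incoming_vc: int, fields: dict) -> Outcome:
    """Overwrite one consumer's slice of an entry if the incoming event is newer.

    `entry` stands in for a HotelIndexEntry row; only the caller's own
    fields and its own clock column are touched.
    """
    outcome = compare_clock(entry.get(vc_column, 0), incoming_vc)
    if outcome is Outcome.APPLIED:
        entry.update(fields)
        entry[vc_column] = incoming_vc
    return outcome
```

Since every consumer passes only its own `vc_column` and its own fields, a pricing event can never regress the property slice, which is why no cross-consumer merge logic is needed.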
2.3 Recovery & rebuild
Three recovery mechanisms:
- Outbox publisher catch-up: runs continuously; handles transient Pub/Sub publish failures.
- Drift sweep: every 5 minutes, picks 1 000 `hotel_index_entries` rows updated in the last hour and re-emits a `projection.updated.v1` if the OpenSearch document hash differs.
- Full reindex: `IndexBuild` orchestration consumes a BigQuery archive of canonical events from `since_ts`, replays them into a fresh OpenSearch index `melmastoon-search-v<n>-<region>`, then atomically swaps the `melmastoon-search-current` alias. See APPLICATION_LOGIC § StartIndexRebuild and DEPLOYMENT_TOPOLOGY § index swap runbook.
Postgres is always the canonical projection — OpenSearch and Redis can be wiped and rebuilt at any time without data loss.
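The drift sweep's hash comparison could look roughly like the sketch below. The source does not say how the document hash is computed, so the canonical-JSON SHA-256 used here is an assumption, as are the `document_hash` and `find_drifted` names and the plain dictionaries standing in for Postgres rows and OpenSearch-side hashes.

```python
import hashlib
import json


def document_hash(doc: dict) -> str:
    """Hash a document via canonical JSON so key order does not affect the result."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drifted(pg_rows: dict, os_hashes: dict) -> list:
    """Return property ids whose OpenSearch hash is missing or differs from Postgres.

    pg_rows:   propertyId -> projection row (Postgres is canonical)
    os_hashes: propertyId -> hash stored alongside the OpenSearch document
    """
    return sorted(
        property_id
        for property_id, row in pg_rows.items()
        if os_hashes.get(property_id) != document_hash(row)
    )
```

Each id returned would then be re-emitted as a `projection.updated.v1`, so the mirror writer repairs OpenSearch from the Postgres row rather than the other way round.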
3. Read-side cache contract for bff-consumer-service
Although there is no client-side mirror, bff-consumer-service may cache responses. The contract:
- Every search response carries `Cache-Control: public, max-age=60, stale-while-revalidate=120` for anonymous, non-personalized queries.
- Every hotel-detail response carries `Cache-Control: public, max-age=300, stale-while-revalidate=600`.
- Personalized responses (with `X-User-Bucket` set on a recommendation route, future) carry `Cache-Control: private, max-age=30`.
- Surrogate keys: `Surrogate-Key: hotel:<propertyId> hotel:<propertyId>:<currency>` so a CDN can purge per-property on `projection.updated.v1` if `bff-consumer-service` chooses to wire the webhook.
search-aggregation-service itself does not push to the CDN. It exposes GET /internal/v1/projection/changes (see API_CONTRACTS.md) so any cache layer can pull recent change keys and purge accordingly.
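As an illustration of the header contract in this section, the sketch below maps a response kind to its headers. The header values are taken from the list above; the `route` labels (`"search"`, `"hotel-detail"`), the function names, and the choice to decide personalization with a boolean flag are hypothetical simplifications.

```python
def cache_control(route: str, personalized: bool = False) -> str:
    """Return the Cache-Control value for a response kind.

    `route` is a hypothetical label: "search" or "hotel-detail".
    """
    if personalized:
        # Personalized responses must never be stored by shared caches.
        return "private, max-age=30"
    if route == "hotel-detail":
        return "public, max-age=300, stale-while-revalidate=600"
    if route == "search":
        return "public, max-age=60, stale-while-revalidate=120"
    raise ValueError(f"unknown route: {route!r}")


def surrogate_key(property_id: str, currency: str) -> str:
    """Build the Surrogate-Key value: one per-property key and one per-property-per-currency key."""
    return f"hotel:{property_id} hotel:{property_id}:{currency}"
```

For example, `surrogate_key("p-42", "EUR")` yields `hotel:p-42 hotel:p-42:EUR`, so a CDN can purge either every variant of a property or just one currency variant.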
4. Forbidden patterns (will fail review)
- A new Electron client that mirrors `hotel_index_entries` to a local SQLite database; use the existing search API instead. The desktop must not "ship a search engine" with results from other tenants; that violates the cross-tenant boundary rules even though those rows are public, because the desktop client cannot enforce the allow-list contract over time.
- Bidirectional sync (mobile or desktop write back to the index). All writes are event-driven; there is no client-write API.
- A new "sync_state" table for any external client — every reader uses the search API.
- Direct Postgres reads from `bff-consumer-service` against `search.*`. Reads must go through the public REST API to preserve degradation, caching, ranking, and rate-limit semantics.
5. Versioning
This service participates in server-side schema versioning only:
| Versioned thing | How it's versioned | Backward compat |
|---|---|---|
| Public REST API | URL /api/v<N>/… (see API_CONTRACTS.md) | Two major versions served concurrently for at least 90 d |
| Event payloads | event.version integer + topic suffix .v<n> | Topic-major break ⇒ new topic; producers dual-publish ≥ 30 d |
| OpenSearch index template | melmastoon-search-v<n>-<region> index, one alias | Atomic alias swap, instant rollback |
| Postgres schema | expand → backfill → contract, see MIGRATION_PLAN.md | Old code keeps running through expand & backfill phases |
There is no client-side schema to coordinate with; rollouts are pure server orchestration.
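The atomic alias swap in the table above can be sketched as a request body for the OpenSearch `POST /_aliases` endpoint, which applies all listed actions in one atomic step. The helper name and the concrete index names in the usage note are hypothetical; the real runbook is in DEPLOYMENT_TOPOLOGY.

```python
def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build a POST /_aliases body that moves `alias` from old_index to new_index.

    Both actions are applied atomically, so readers never see the alias
    pointing at zero or two indices.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

A swap to a rebuilt index would be `alias_swap_body("melmastoon-search-current", "melmastoon-search-v7-eu", "melmastoon-search-v8-eu")`, and the rollback is the same call with the two index arguments reversed, provided the old index has not been deleted yet.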