Sync Contract

:::info Source
Sourced from services/search-service/SYNC_CONTRACT.md in the documentation repo.
:::

This contract inherits the sync protocol from docs/03-microservices/sync-service.md and specifies only the search-specific behavior.

Search results are mostly read-only on the client. The main sync concerns:

  1. Prefetching search documents needed for offline viewing.
  2. Caching recent queries + recommendations for offline re-execution.
  3. Recording feedback events while offline.

1. What Syncs

| Resource | Direction | Priority | Notes |
| --- | --- | --- | --- |
| Recently viewed SearchableDocument slice | server→client | high | LRU cap: 500 docs/user |
| Last N recommendations | server→client | high | most recent 3 contexts × 20 items |
| Query history (user's own) | server→client + client→server | med | for autocomplete learning |
| Feedback events (click/dismiss/convert) | client→server | high | retried on reconnect |
| Embedding micro-model | server→client | low | only M5+; optional |
| Recent suggestions dictionary (top-K by tenant) | server→client | low | 3k terms per tenant, 50 KB |

Search-service does not sync the full index to clients.

2. Client Data Shape

On the client, search state is represented as:

interface OfflineSearchStore {
  documents: Map<DocId, SearchableDocumentLite>; // up to 500
  recommendations: Map<Context, RecommendationSnapshot>;
  queryHistory: QueryHistoryItem[]; // last 100
  pendingFeedback: FeedbackEvent[]; // retry queue
  suggestDict: TrieIndex; // top-K tenant terms
  lastSyncAt: ISODate;
  localVersion: number;
}

SearchableDocumentLite drops embedding and heavy body content; keeps title, summary, tags, score, updatedAt.
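The projection from a full document to its client-side form can be sketched as follows. This is a minimal illustration, not the service's actual types: the `SearchableDocument` field types and the `toLite` helper are assumptions, `title` is simplified to a string, and `id` is carried along because the store is keyed by document ID.

```typescript
// Hypothetical full document shape; only fields named in this contract are modelled.
interface SearchableDocument {
  id: string;
  title: string; // simplified; assumed to be a plain string here
  summary: string;
  tags: string[];
  score: number;
  updatedAt: string; // ISO 8601 timestamp
  body: string; // heavy content, dropped on the client
  embedding: number[]; // dropped on the client
}

type SearchableDocumentLite = Omit<SearchableDocument, "body" | "embedding">;

// Copy only the lightweight fields so body and embedding never reach the offline store.
function toLite(doc: SearchableDocument): SearchableDocumentLite {
  return {
    id: doc.id,
    title: doc.title,
    summary: doc.summary,
    tags: doc.tags,
    score: doc.score,
    updatedAt: doc.updatedAt,
  };
}
```

Building the lite object field by field, rather than deleting keys from a copy, means a newly added heavy field on the server side stays out of the client store unless it is listed explicitly.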

3. Protocol

Clients use sync-service's generic /sync/pull and /sync/push endpoints. search-service exposes the following sync bundle descriptors:

3.1 Pull bundle search:user-prefetch

Triggered at app start and every 30m.

Request:

{
  "bundles": ["search:user-prefetch"],
  "since": "2026-04-15T08:00:00Z",
  "clientVersion": 42
}

Response (within 30s):

{
  "data": {
    "search:user-prefetch": {
      "documents": [ { "id": "...", "title": {...}, "updatedAt": "..." } ],
      "recommendations": { "home": {...}, "next-step": {...} },
      "suggestDict": { "version": 12, "checksum": "sha256-...", "terms": [...] }
    }
  },
  "meta": {
    "cursor": "eyJzIjoxMH0",
    "nextSyncAfter": "2026-04-15T08:30:00Z"
  }
}
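As a rough sketch of how a client might fold this response into local state: documents are upserted (server wins), recommendations are replaced wholesale, and the cursor is recorded for the next pull. The `LocalState` shape and the `applyPrefetch` name are hypothetical, and replacing dictionary terms only when `version` changes is an assumption, since the contract does not say how the client uses `version` or `checksum`.

```typescript
interface LiteDoc { id: string; updatedAt: string }

interface PrefetchPayload {
  documents: LiteDoc[];
  recommendations: Record<string, unknown>;
  suggestDict: { version: number; checksum: string; terms: string[] };
}

interface PullResponse {
  data: { "search:user-prefetch": PrefetchPayload };
  meta: { cursor: string; nextSyncAfter: string };
}

// Hypothetical local state; a simplified stand-in for OfflineSearchStore.
interface LocalState {
  documents: Map<string, LiteDoc>;
  recommendations: Map<string, unknown>;
  suggestDictVersion: number;
  terms: string[];
  cursor: string | null;
  nextSyncAfter: string | null;
}

function applyPrefetch(state: LocalState, res: PullResponse): LocalState {
  const payload = res.data["search:user-prefetch"];

  // Server wins: incoming documents overwrite any cached copy with the same id.
  const documents = new Map(state.documents);
  for (const doc of payload.documents) documents.set(doc.id, doc);

  // Assumption: the dictionary is only swapped when its version differs.
  const dictChanged = payload.suggestDict.version !== state.suggestDictVersion;

  return {
    documents,
    // Recommendations are replaced wholesale, never merged.
    recommendations: new Map(Object.entries(payload.recommendations)),
    suggestDictVersion: payload.suggestDict.version,
    terms: dictChanged ? payload.suggestDict.terms : state.terms,
    cursor: res.meta.cursor,
    nextSyncAfter: res.meta.nextSyncAfter,
  };
}
```

Returning a new state object instead of mutating the old one keeps a failed or partial apply from leaving the store half-updated.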

3.2 Push bundle search:feedback

{
  "bundle": "search:feedback",
  "clientMutationId": "01HAF...",
  "events": [
    {
      "localId": "local_01HAF...",
      "generationId": "01HAF...",
      "itemId": "catalog:course:...",
      "action": "click",
      "position": 3,
      "recordedAt": "2026-04-15T08:05:00Z"
    }
  ]
}

Server response:

{
  "data": {
    "applied": [
      { "localId": "local_01HAF...", "serverId": "fbk_01HAG..." }
    ],
    "rejected": []
  }
}
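One way a client could reconcile its retry queue against this response is sketched below. The `reconcilePending` name is hypothetical. The shape of `rejected` entries is not specified here, so the sketch assumes each carries a `localId`, and it assumes rejected events are not retried.

```typescript
interface PendingEvent { localId: string }

interface PushResult {
  data: {
    applied: { localId: string; serverId: string }[];
    // Assumed shape: the contract only shows an empty array.
    rejected: { localId: string }[];
  };
}

// Remove every event the server has settled; whatever remains is retried later.
function reconcilePending(pending: PendingEvent[], res: PushResult): PendingEvent[] {
  const settled = new Set<string>();
  for (const a of res.data.applied) settled.add(a.localId);
  for (const r of res.data.rejected) settled.add(r.localId); // assumption: no retry
  return pending.filter((e) => !settled.has(e.localId));
}
```

Events missing from both lists stay queued, which covers a response that was truncated or only partially processed.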

4. Conflict Resolution

  • Documents: server always wins. The client cannot edit them.
  • Recommendations: server wins; client discards local copy on pull.
  • Query history: union merge (both sides keep all); server is source of truth for cross-device view.
  • Feedback events: last-write-wins by recordedAt; duplicates filtered by localId.
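The union merge for query history can be illustrated with a short sketch. The `QueryHistoryItem` fields, the dedupe key (timestamp plus query text), and the `mergeHistory` name are assumptions made for illustration; the cap of 100 comes from the client data shape in section 2.

```typescript
// Assumed shape; the contract does not define QueryHistoryItem's fields.
interface QueryHistoryItem {
  query: string;
  executedAt: string; // ISO 8601 UTC, so string order matches time order
}

// Union of both sides, deduplicated, newest first, trimmed to the client cap.
function mergeHistory(
  local: QueryHistoryItem[],
  remote: QueryHistoryItem[],
  cap = 100,
): QueryHistoryItem[] {
  const byKey = new Map<string, QueryHistoryItem>();
  for (const item of local.concat(remote)) {
    byKey.set(`${item.executedAt}|${item.query}`, item);
  }
  return Array.from(byKey.values())
    .sort((a, b) => (a.executedAt < b.executedAt ? 1 : a.executedAt > b.executedAt ? -1 : 0))
    .slice(0, cap);
}
```

The cap applies only to the client copy in this sketch; the server, as the source of truth for the cross-device view, would keep the full union.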

5. Offline Query Semantics

When offline, a query q is answered as follows:

M5 optional semantic fallback: a ~30 MB distilled embedding model runs on-device, embeds the query, and ranks cached document embeddings by cosine similarity.

Offline results always include meta.offline: true so the UI can badge them.
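The ranking step of the semantic fallback could look roughly like this. It is a sketch under assumptions: the query vector is taken as already produced by the on-device model, and `CachedEmbedding`, `OfflineResult`, `cosine`, and `rankOffline` are hypothetical names, not part of the contract.

```typescript
interface CachedEmbedding { id: string; vector: number[] }

interface OfflineResult {
  data: { id: string; score: number }[];
  meta: { offline: true }; // lets the UI badge offline results
}

// Cosine similarity: dot(a, b) / (|a| * |b|), defined as 0 for a zero vector.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every cached document against the query and return the best matches.
function rankOffline(
  queryVector: number[],
  cached: CachedEmbedding[],
  limit = 10,
): OfflineResult {
  const data = cached
    .map((doc) => ({ id: doc.id, score: cosine(queryVector, doc.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
  return { data, meta: { offline: true } };
}
```

A linear scan is used because the cached set is capped at 500 documents per user, so an approximate index is unlikely to be needed at this size.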

6. Bundle Eviction

The per-user bundle store has a soft cap (100 MB by default). Eviction is LRU, with these exceptions:

  • Pinned docs (user-bookmarked or currently enrolled course pages) are never evicted.
  • Certificates belonging to the user are never evicted.
  • Recommendations are replaced wholesale on every pull.
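The eviction rules can be sketched as a selection function that walks unprotected entries from least to most recently used until the store fits under the cap. The `StoredEntry` shape and `selectEvictions` name are hypothetical, and treating 100 MB as 100 × 1024 × 1024 bytes is an assumption.

```typescript
// Hypothetical per-entry bookkeeping for the bundle store.
interface StoredEntry {
  id: string;
  sizeBytes: number;
  lastAccessedAt: number; // epoch milliseconds
  pinned: boolean; // bookmarked or part of a currently enrolled course
  kind: "document" | "certificate";
}

const DEFAULT_CAP_BYTES = 100 * 1024 * 1024; // assumed interpretation of "100 MB"

// Returns the ids to evict, least recently used first, skipping protected entries.
function selectEvictions(entries: StoredEntry[], capBytes = DEFAULT_CAP_BYTES): string[] {
  let total = entries.reduce((sum, e) => sum + e.sizeBytes, 0);
  const candidates = entries
    .filter((e) => !e.pinned && e.kind !== "certificate")
    .sort((a, b) => a.lastAccessedAt - b.lastAccessedAt);

  const evicted: string[] = [];
  for (const entry of candidates) {
    if (total <= capBytes) break;
    evicted.push(entry.id);
    total -= entry.sizeBytes;
  }
  return evicted;
}
```

Because pinned documents and certificates are exempt, the store can stay above the cap after eviction, which is consistent with the cap being soft.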

7. Security of Offline Data

  • Bundle payloads are encrypted at rest on-device (SQLCipher / IndexedDB + WebCrypto).
  • Documents with visibility = private require device binding: they are encrypted with a key derived from the device keystore and the user session.
  • On logout, the entire offline search store is wiped.
  • On tenant.membership_deactivated.v1 → server publishes sync.bundle.revoked.v1 → client wipes the bundle.

8. Bandwidth Budget

| Bundle | Initial download | Delta (per 30m) | Compression |
| --- | --- | --- | --- |
| user-prefetch | ≤ 2 MB | ≤ 200 KB | gzip + binary msgpack |
| suggestDict | ≤ 50 KB | ≤ 5 KB | delta encoding |

For large-catalog tenants, the bundle is scoped to the user's accessible surface only.

9. Feedback Retry Semantics

  • Client stores feedback events in a durable queue.
  • On reconnect, flushes in batches of 50.
  • Server is idempotent by localId.
  • Events that have been queued for more than 7 days are dropped, and a local warning telemetry event is recorded.
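A flush plan following these rules might be computed as below. The `planFlush` name and `QueuedFeedback` shape are hypothetical, and the sketch uses `recordedAt` as a stand-in for the time an event entered the queue, which is an assumption.

```typescript
interface QueuedFeedback { localId: string; recordedAt: string }

const MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000; // 7-day retention in the queue
const BATCH_SIZE = 50;

// Split the queue into expired events (to drop and report) and batches to send.
function planFlush(
  queue: QueuedFeedback[],
  now: Date,
): { batches: QueuedFeedback[][]; dropped: QueuedFeedback[] } {
  const fresh: QueuedFeedback[] = [];
  const dropped: QueuedFeedback[] = [];
  for (const event of queue) {
    // Assumption: age is measured from recordedAt.
    const ageMs = now.getTime() - Date.parse(event.recordedAt);
    if (ageMs > MAX_AGE_MS) dropped.push(event);
    else fresh.push(event);
  }

  const batches: QueuedFeedback[][] = [];
  for (let i = 0; i < fresh.length; i += BATCH_SIZE) {
    batches.push(fresh.slice(i, i + BATCH_SIZE));
  }
  return { batches, dropped };
}
```

Since the server is idempotent by `localId`, resending a batch whose response was lost is safe; the caller would emit the warning telemetry once per entry in `dropped`.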

10. Checkpointing

Clients track lastSyncAt + cursor. The server rejects a pull with 409 SYNC_CURSOR_INVALID if the cursor predates the retention window; the client then falls back to a full bundle pull.
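The fallback decision can be sketched as a small pure function. Several details are assumptions: the contract does not show how the cursor is sent, so a `cursor` request field is invented for illustration, and a "full bundle pull" is modelled as a request with no `since` or `cursor`. The `nextPullRequest` name and the error shape are hypothetical.

```typescript
interface Checkpoint { lastSyncAt: string; cursor: string }

// Hypothetical request shape; `cursor` is an assumed field.
interface PullRequest { bundles: string[]; since?: string; cursor?: string }

interface SyncError { status: number; code: string }

const PREFETCH_BUNDLE = "search:user-prefetch";

// Decide whether the next pull is incremental or a full bundle pull.
function nextPullRequest(checkpoint: Checkpoint | null, lastError?: SyncError): PullRequest {
  const cursorInvalid =
    lastError !== undefined &&
    lastError.status === 409 &&
    lastError.code === "SYNC_CURSOR_INVALID";

  // No checkpoint yet, or the server rejected the cursor: start from scratch.
  if (checkpoint === null || cursorInvalid) {
    return { bundles: [PREFETCH_BUNDLE] };
  }
  return {
    bundles: [PREFETCH_BUNDLE],
    since: checkpoint.lastSyncAt,
    cursor: checkpoint.cursor,
  };
}
```

After a successful full pull, the client would store the new `meta.cursor` and resume incremental pulls on the next cycle.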

11. Test Vectors

See TESTING_STRATEGY.md §6 for sync E2E test scenarios, including airplane-mode transition, device switch, logout, and tenant removal cases.