Skip to main content

Readiness

:::info Source Sourced from services/search-service/SERVICE_READINESS.md in the documentation repo. :::

Readiness framework inherited from docs/06-traceability-matrix.md. Levels L0–L4.

1. Readiness Levels

LevelMeaning
L0Placeholder — API schema only
L1Walking skeleton — happy path works end-to-end for one scenario
L2Alpha — feature-complete for one tenant, single region, no AI ranker
L3Beta — multi-tenant, multi-region, hybrid search, basic recs
L4Production — SLOs met, chaos-validated, AI ranker + L2R trained, prefetch

2. Milestone Trajectory

MilestoneTarget dateTarget levelScope
M0Q2-Y1contracts + schemas drafted
M1Q3-Y1L1skeleton deployable, single tenant, lexical over courses
M2Q4-Y1L2multi-tenant, taxonomy/facet filters, autocomplete
M3Q1-Y2L3semantic embeddings live, hybrid + basic recs, reindex
M4Q2-Y2L4full hybrid + L2R, offline prefetch, SLOs green
M5Q3-Y2L4+AI query expansion, multi-locale embeddings, on-device fallback

3. L1 Entry Criteria

  • Deployable to dev cluster.
  • /healthz + /readyz respond correctly.
  • One projector wired (catalog course_version.published.v1).
  • One endpoint working end-to-end (GET /search?q=...&type=course).
  • OpenTelemetry traces visible in Jaeger.
  • Unit test coverage ≥ 60%.
  • Service registered in service catalog.

4. L2 Entry Criteria

Extends L1 with:

  • All 7 document types indexed.
  • Taxonomy + facet filters.
  • Autocomplete endpoint.
  • Per-tenant index policy table + seed.
  • Tenant isolation tests pass (negative tests included).
  • Postgres schema frozen with migration story.
  • CI runs contract tests against upstream services.
  • Coverage ≥ 75%.

5. L3 Entry Criteria

Extends L2 with:

  • pgvector integration via ai-gateway.
  • Hybrid query with RRF merge (no L2R yet).
  • Basic recommendations (kNN only).
  • Reindex command working end-to-end.
  • Reindex completion event published.
  • DLQ + alerts hooked.
  • Two regions live (US + EU).
  • Chaos drill F1, F5, F7 executed.
  • Coverage ≥ 80%.

6. L4 Entry Criteria

Extends L3 with:

  • L2R model trained and deployed.
  • NDCG@10 ≥ 0.72 on golden set.
  • Offline prefetch bundle served via sync-service.
  • All SLOs (query p95, indexing lag, availability) sustainably met over 30d.
  • Chaos drills F1–F10 rehearsed.
  • Full audit trail validated by compliance.
  • GDPR erasure flow validated end-to-end.
  • Pen test passed with no HIGH findings.
  • Runbooks reviewed and dated within last quarter.
  • Coverage ≥ 80%; ranking pkg ≥ 90%.

7. L4+ Backlog (M5)

  • LLM-based query expansion (opt-in).
  • Multi-locale embeddings with cross-lingual retrieval.
  • On-device semantic fallback (distilled model).
  • Learn-to-rank retraining loop fully automated.
  • Cross-region marketplace discovery.

8. Operational Readiness Gates

Shared with platform readiness review.

GateCheck
SLO dashboardexists, on-call has access
Alert routingpaged + ticket paths configured in PagerDuty
Runbook indexFAILURE_MODES.md reviewed this quarter
DR drillquarterly; last pass ≤ 90 days
On-call primary + secondaryrotations filled for next 90d
Change budgeterror budget burn < 50% rolling 30d
Dependency mapservice catalog up to date
Capacity planCPU, memory, cost projected through next 90d

9. Slice Tracking

Cross-ref SERVICE_OVERVIEW.md §6:

SlicePhaseStatus
S2 Basic full-textM2planned
S4 Taxonomy filters + autocompleteM3planned
S5 Hybrid + recsM4planned
S6 On-device semantic fallbackM5optional

Track status in the program's ADO board under epic:search-readiness.

10. Definition of Done (per feature)

  • Unit + integration + contract tests green.
  • Docs updated (this folder + /docs/03-microservices/search-service.md if cross-cutting).
  • Telemetry: spans + metrics + logs defined.
  • Error taxonomy extended if new failure class.
  • Feature flag wired (default off) for rollouts.
  • Security review signed off when touching auth/PII/embedding.
  • Runbook entry added when new failure mode.

11. Readiness Review Cadence

ReviewFrequencyAttendees
Engineering readinessbiweeklysearch team + SRE
Cross-service integrationmonthlysearch + catalog + authoring + ai-gateway
Compliance / securityquarterlysecurity + DPO + platform
Executiveper milestoneeng leadership + PM

12. Exit Criteria (sunset)

Not planned within horizon. If ever, minimum:

  • 12-month deprecation notice to consumers.
  • Clients offered alternative discovery mechanism.
  • Cold indices archived to object storage with retrieval plan.
  • Events in SEARCH stream migrated / retired cleanly.