Readiness
:::info Source
Sourced from services/search-service/SERVICE_READINESS.md in the documentation repo.
:::
Readiness framework inherited from docs/06-traceability-matrix.md. Levels L0–L4.
1. Readiness Levels
| Level | Meaning |
|---|---|
| L0 | Placeholder — API schema only |
| L1 | Walking skeleton — happy path works end-to-end for one scenario |
| L2 | Alpha — feature-complete for one tenant, single region, no AI ranker |
| L3 | Beta — multi-tenant, multi-region, hybrid search, basic recs |
| L4 | Production — SLOs met, chaos-validated, AI ranker + L2R trained, prefetch |
2. Milestone Trajectory
| Milestone | Target date | Target level | Scope |
|---|---|---|---|
| M0 | Q2-Y1 | — | contracts + schemas drafted |
| M1 | Q3-Y1 | L1 | skeleton deployable, single tenant, lexical over courses |
| M2 | Q4-Y1 | L2 | multi-tenant, taxonomy/facet filters, autocomplete |
| M3 | Q1-Y2 | L3 | semantic embeddings live, hybrid + basic recs, reindex |
| M4 | Q2-Y2 | L4 | full hybrid + L2R, offline prefetch, SLOs green |
| M5 | Q3-Y2 | L4+ | AI query expansion, multi-locale embeddings, on-device fallback |
3. L1 Entry Criteria
- Deployable to dev cluster.
-
/healthz+/readyzrespond correctly. - One projector wired (catalog
course_version.published.v1). - One endpoint working end-to-end (
GET /search?q=...&type=course). - OpenTelemetry traces visible in Jaeger.
- Unit test coverage ≥ 60%.
- Service registered in service catalog.
4. L2 Entry Criteria
Extends L1 with:
- All 7 document types indexed.
- Taxonomy + facet filters.
- Autocomplete endpoint.
- Per-tenant index policy table + seed.
- Tenant isolation tests pass (negative tests included).
- Postgres schema frozen with migration story.
- CI runs contract tests against upstream services.
- Coverage ≥ 75%.
5. L3 Entry Criteria
Extends L2 with:
- pgvector integration via ai-gateway.
- Hybrid query with RRF merge (no L2R yet).
- Basic recommendations (kNN only).
- Reindex command working end-to-end.
- Reindex completion event published.
- DLQ + alerts hooked.
- Two regions live (US + EU).
- Chaos drill F1, F5, F7 executed.
- Coverage ≥ 80%.
6. L4 Entry Criteria
Extends L3 with:
- L2R model trained and deployed.
- NDCG@10 ≥ 0.72 on golden set.
- Offline prefetch bundle served via sync-service.
- All SLOs (query p95, indexing lag, availability) sustainably met over 30d.
- Chaos drills F1–F10 rehearsed.
- Full audit trail validated by compliance.
- GDPR erasure flow validated end-to-end.
- Pen test passed with no HIGH findings.
- Runbooks reviewed and dated within last quarter.
- Coverage ≥ 80%; ranking pkg ≥ 90%.
7. L4+ Backlog (M5)
- LLM-based query expansion (opt-in).
- Multi-locale embeddings with cross-lingual retrieval.
- On-device semantic fallback (distilled model).
- Learn-to-rank retraining loop fully automated.
- Cross-region marketplace discovery.
8. Operational Readiness Gates
Shared with platform readiness review.
| Gate | Check |
|---|---|
| SLO dashboard | exists, on-call has access |
| Alert routing | paged + ticket paths configured in PagerDuty |
| Runbook index | FAILURE_MODES.md reviewed this quarter |
| DR drill | quarterly; last pass ≤ 90 days |
| On-call primary + secondary | rotations filled for next 90d |
| Change budget | error budget burn < 50% rolling 30d |
| Dependency map | service catalog up to date |
| Capacity plan | CPU, memory, cost projected through next 90d |
9. Slice Tracking
Cross-ref SERVICE_OVERVIEW.md §6:
| Slice | Phase | Status |
|---|---|---|
| S2 Basic full-text | M2 | planned |
| S4 Taxonomy filters + autocomplete | M3 | planned |
| S5 Hybrid + recs | M4 | planned |
| S6 On-device semantic fallback | M5 | optional |
Track status in the program's ADO board under epic:search-readiness.
10. Definition of Done (per feature)
- Unit + integration + contract tests green.
- Docs updated (this folder +
/docs/03-microservices/search-service.mdif cross-cutting). - Telemetry: spans + metrics + logs defined.
- Error taxonomy extended if new failure class.
- Feature flag wired (default off) for rollouts.
- Security review signed off when touching auth/PII/embedding.
- Runbook entry added when new failure mode.
11. Readiness Review Cadence
| Review | Frequency | Attendees |
|---|---|---|
| Engineering readiness | biweekly | search team + SRE |
| Cross-service integration | monthly | search + catalog + authoring + ai-gateway |
| Compliance / security | quarterly | security + DPO + platform |
| Executive | per milestone | eng leadership + PM |
12. Exit Criteria (sunset)
Not planned within horizon. If ever, minimum:
- 12-month deprecation notice to consumers.
- Clients offered alternative discovery mechanism.
- Cold indices archived to object storage with retrieval plan.
- Events in SEARCH stream migrated / retired cleanly.