search-aggregation-service — SERVICE_READINESS
Companion: SERVICE_OVERVIEW · SECURITY_MODEL · OBSERVABILITY · DEPLOYMENT_TOPOLOGY · TESTING_STRATEGY · FAILURE_MODES · SERVICE_RISK_REGISTER · ../../docs/standards/DEFINITION_OF_DONE.md
The readiness checklist is gated by the platform readiness review board (one platform owner + one security reviewer + one SRE). All ✓ items must be ticked and evidence linked before a region launch.
1. Functional readiness
| # | Check | Owner | Status |
|---|---|---|---|
| F-1 | All 17 service-bundle docs published and current | service owner | ✅ |
| F-2 | Domain layer complete: HotelIndexEntry, RateSnapshot, AvailabilityHint, BoostRule, IndexBuild, with all invariants enforced | service owner | ✅ |
| F-3 | All commanded use cases implemented (per APPLICATION_LOGIC § Commands) | service owner | ✅ |
| F-4 | All consumed events have a registered handler (per EVENT_SCHEMAS § Consumed events) | service owner | ✅ |
| F-5 | All published events validate against contracts/asyncapi.yaml in CI | service owner | ✅ |
| F-6 | Public REST API matches contracts/openapi.yaml; CI fails on diff | service owner | ✅ |
| F-7 | Multi-language search verified for ps, fa, tg, ar, ur, en, ru (per region) | search domain expert | ⏳ launch region only |
| F-8 | Geo search verified at boundaries: 0 km radius, 200 km cap, antimeridian (n/a Phase 1), bbox > 250 000 km² rejected | service owner | ✅ |
| F-9 | Currency conversion verified against pricing-service golden FX snapshot | pricing domain expert | ✅ |
| F-10 | Region pinning enforced; cross-region requests return only target-region results when strict=true | service owner | ✅ |
| F-11 | Click recording → popularity recompute closes the loop (24 h cycle) | service owner | ✅ |
| F-12 | Boost-rule lifecycle (draft → active → expired/cancelled) covered with operator UI smoke | platform UX | ⏳ Phase 3 |
| F-13 | Index rebuild from BigQuery archive completes for the launch region within 4 h | SRE | ✅ rehearsed in staging |
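The geo boundaries in F-8 can be illustrated with a small validator. This is a minimal sketch, not the service's implementation: the function names are hypothetical, the Earth is treated as a sphere, and it assumes that a 0 km radius and a radius of exactly 200 km are both accepted (the checklist only says these boundaries are verified). Antimeridian-crossing boxes are rejected outright, since they are marked n/a for Phase 1.

```typescript
// Limits taken from check F-8; all names below are illustrative only.
const MAX_RADIUS_KM = 200;
const MAX_BBOX_AREA_KM2 = 250000;
const EARTH_RADIUS_KM = 6371; // mean Earth radius, spherical approximation

interface BBox {
  south: number; // degrees latitude
  west: number; // degrees longitude
  north: number;
  east: number;
}

function toRad(deg: number): number {
  return (deg * Math.PI) / 180;
}

// Area of a latitude/longitude box on a sphere:
// R^2 * |sin(north) - sin(south)| * |east - west|, angles in radians.
function bboxAreaKm2(b: BBox): number {
  const latTerm = Math.abs(Math.sin(toRad(b.north)) - Math.sin(toRad(b.south)));
  const lonTerm = Math.abs(toRad(b.east) - toRad(b.west));
  return EARTH_RADIUS_KM * EARTH_RADIUS_KM * latTerm * lonTerm;
}

// Assumption: 0 km and exactly 200 km are both valid radii.
function isRadiusAllowed(radiusKm: number): boolean {
  return Number.isFinite(radiusKm) && radiusKm >= 0 && radiusKm <= MAX_RADIUS_KM;
}

function isBboxAllowed(b: BBox): boolean {
  // Boxes that cross the antimeridian (west > east) are out of scope here.
  if (b.west > b.east || b.south > b.north) return false;
  // Only boxes strictly larger than the cap are rejected.
  return bboxAreaKm2(b) <= MAX_BBOX_AREA_KM2;
}
```

Under these assumptions a 1° × 1° box at the equator (roughly 12 000 km²) passes, while a 10° × 10° box (over 1 000 000 km²) is rejected.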
2. Operational readiness
| # | Check | Owner | Status |
|---|---|---|---|
| O-1 | Cloud Run service deployed in primary + secondary region | SRE | ✅ |
| O-2 | Cloud SQL HA enabled + cross-region read replica | SRE | ✅ |
| O-3 | OpenSearch cluster (Aiven) sized per DEPLOYMENT_TOPOLOGY § 1 per region | SRE | ✅ |
| O-4 | Memorystore Redis HA in each region | SRE | ✅ |
| O-5 | Pub/Sub topics + subscriptions provisioned via Terraform; DLQ + retry policy applied | SRE | ✅ |
| O-6 | All dashboards live (Cloud Monitoring + Grafana mirror) | SRE | ✅ |
| O-7 | All SLO burn-rate alerts wired to PagerDuty (search-aggregation rotation) | SRE | ✅ |
| O-8 | All runbooks present in ops/runbooks/search-aggregation-service/ | service owner | ✅ |
| O-9 | DR game day rehearsed within last 90 days | SRE | ⏳ first launch will count |
| O-10 | Synthetic checks running in EU + ASIA | SRE | ✅ |
| O-11 | Cost guardrails configured (Cloud Run, Pub/Sub, OpenSearch storage) with monthly review | platform finance | ✅ |
| O-12 | On-call rotation populated (primary + secondary, follow-the-sun) | engineering manager | ✅ |
| O-13 | /healthz and /readyz differentiated and used by Cloud Run probes | service owner | ✅ |
| O-14 | Outbox publisher advisory lock prevents duplicate publishing across pods (verified via integration test) | service owner | ✅ |
| O-15 | Index swap rehearsed in staging twice without incident | service owner | ✅ |
| O-16 | Tenant cascade purge rehearsed for a synthetic deleted tenant in staging within SLO | security + SRE | ✅ |
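The difference between the two probes in O-13 can be sketched as follows. This is an illustration under stated assumptions, not the service's code: `healthz` and `readyz` are hypothetical function names, and the choice of which dependencies gate readiness (for example Postgres, OpenSearch, Redis) is assumed rather than taken from the source.

```typescript
interface ProbeResult {
  status: 200 | 503;
  failing: string[];
}

// Liveness: the process is up and able to answer. No dependency is consulted,
// so a downstream outage does not cause the instance to be restarted.
function healthz(): ProbeResult {
  return { status: 200, failing: [] };
}

// Readiness: traffic should only arrive when every listed dependency is up.
// The caller supplies the latest known state per dependency (names are hypothetical).
function readyz(dependencyUp: Record<string, boolean>): ProbeResult {
  const failing = Object.keys(dependencyUp).filter((name) => !dependencyUp[name]);
  return { status: failing.length === 0 ? 200 : 503, failing };
}
```

For example, `readyz({ postgres: true, opensearch: false, redis: true })` reports status 503 with `opensearch` listed as failing, while `healthz()` still reports 200.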
3. Security & compliance readiness
| # | Check | Owner | Status |
|---|---|---|---|
| S-1 | Field-level allow-list policy implemented and CI-enforced (L1 type, L2 projection, L3 schema, L4 audit) | service owner + security | ✅ |
| S-2 | tenant-isolation.spec.ts (inverted) green | service owner | ✅ |
| S-3 | outbox.spec.ts and inbox.spec.ts green | service owner | ✅ |
| S-4 | Cross-tenant exposure auditor scheduled nightly | security | ✅ |
| S-5 | Cursor signing key rotation procedure documented + dry-run | security | ✅ |
| S-6 | Cloud Armor WAF + per-IP rate limit configured per SECURITY_MODEL § 7 | security | ✅ |
| S-7 | OPA policies for admin routes deployed and tested | security | ✅ |
| S-8 | All secrets in Google Secret Manager with rotation schedule | security | ✅ |
| S-9 | Container image signed (cosign) and Cloud Run admission policy enforces signature | security | ✅ |
| S-10 | SBOM published and CVE scan green (no critical/high) | security | ✅ |
| S-11 | Audit topics mirrored to audit-service BigQuery + GCS Object Lock | security | ✅ |
| S-12 | Pentest scope items listed in SECURITY_MODEL § 13 covered (annual, due before next launch) | security | ⏳ scheduled |
| S-13 | Privacy review: search_queries retention + anonymization approved | DPO / legal | ✅ |
| S-14 | Sanctions list ingestion path validated (via tenant-service suspension flow) | compliance | ✅ |
| S-15 | All AI calls go through ai-orchestrator-service; direct provider SDKs absent from dependency tree | security | ✅ |
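The projection and audit layers of the allow-list policy in S-1 can be illustrated roughly as below. This is a minimal sketch: the field names, `projectAllowed`, and `findDisallowedFields` are hypothetical and do not reflect the service's real policy or schema.

```typescript
// Hypothetical allow-list; the real field names live in the service's policy.
const PUBLIC_HOTEL_FIELDS = ["id", "name", "city", "minRate"] as const;

// Projection: copy only allow-listed fields; everything else is dropped.
function projectAllowed(
  source: Record<string, unknown>,
  allowed: readonly string[] = PUBLIC_HOTEL_FIELDS,
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of allowed) {
    if (Object.prototype.hasOwnProperty.call(source, field)) {
      out[field] = source[field];
    }
  }
  return out;
}

// Audit: list fields present on a payload that are not allow-listed.
function findDisallowedFields(
  payload: Record<string, unknown>,
  allowed: readonly string[] = PUBLIC_HOTEL_FIELDS,
): string[] {
  return Object.keys(payload).filter((key) => allowed.indexOf(key) === -1);
}
```

A CI check in this style could fail the build whenever `findDisallowedFields` returns a non-empty list for a sample response payload.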
4. Performance readiness
| # | Check | Owner | Status |
|---|---|---|---|
| P-1 | Search latency p95 < 250 ms at 1 000 RPS (k6 sustained 10 min) | performance | ✅ |
| P-2 | Hotel detail latency p95 < 200 ms at 500 RPS | performance | ✅ |
| P-3 | Cache hit ratio ≥ 70 % under load profile | performance | ✅ |
| P-4 | Projection freshness p95 < 30 s under 50 events/sec ingestion | performance | ✅ |
| P-5 | Outbox publish lag p95 < 1 s steady state | performance | ✅ |
| P-6 | OpenSearch shard rejection rate < 0.1 % under peak | performance | ✅ |
| P-7 | Postgres connection pool saturation < 70 % under peak | performance | ✅ |
| P-8 | Memorystore eviction rate < 5 / min under peak | performance | ✅ |
| P-9 | Performance baseline pinned in BigQuery; nightly regression check active | performance | ✅ |
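The latency checks P-1 and P-2 compare a p95 value against a budget. A minimal sketch of that comparison is shown below, assuming the nearest-rank definition of a percentile; the load-testing tool may use a different interpolation method, and the function names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p % of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Strict "<" comparison, matching the wording "p95 < 250 ms".
function meetsLatencyBudget(latenciesMs: number[], budgetMs: number): boolean {
  return percentile(latenciesMs, 95) < budgetMs;
}
```

With 100 samples, the p95 under this definition is the 95th smallest value, so up to five slow outliers do not affect the result.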
5. Documentation & onboarding
| # | Check | Owner | Status |
|---|---|---|---|
| D-1 | This bundle present and reviewed | tech writer + service owner | ✅ |
| D-2 | OpenAPI doc rendered via melmastoon-docs-portal; consumer-visible endpoints documented | tech writer | ✅ |
| D-3 | AsyncAPI doc rendered + linked from 04 Event Architecture | tech writer | ✅ |
| D-4 | LOCAL_DEV_SETUP works for a brand-new dev (verified by 2 engineers outside the team within 30 days) | service owner | ✅ |
| D-5 | Runbooks tested by an on-call engineer not on the team | SRE | ✅ for top 5 |
| D-6 | Public-facing changelog page set up | service owner | ⏳ at GA |
6. Definition of Ready (per launch region)
Before turning on a new region (e.g. IR after AF+TJ):
- province_centers seeded for the region with verified geo + city slugs.
- OpenSearch index melmastoon-search-v<n>-<region> created and joined to the writer alias.
- At least 100 published properties projected from upstream services.
- Multi-language acceptance test corpus updated for the region's primary languages.
- Synthetic checks added for the region.
- DR game day rerun including the new region's resources.
- FX snapshot supports the region's currency pair.
- Pentest delta scope reviewed.
- Region appears in flags-service allow-list search.region_pinning.allowed_regions.
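Two of the per-region checks above lend themselves to automation: the index naming pattern and the allow-list membership. The sketch below is illustrative only; `regionIndexName`, `regionBlockers`, and the `RegionState` shape are hypothetical, and it assumes index versions start at 1. The region code is lowercased because OpenSearch index names must be lowercase.

```typescript
const MIN_PROJECTED_PROPERTIES = 100;

// Builds a name following the pattern melmastoon-search-v<n>-<region>.
function regionIndexName(version: number, region: string): string {
  if (!Number.isInteger(version) || version < 1) {
    throw new Error("index version must be a positive integer");
  }
  return "melmastoon-search-v" + version + "-" + region.toLowerCase();
}

interface RegionState {
  region: string; // e.g. "IR"
  projectedProperties: number; // published properties projected so far
  allowedRegions: string[]; // value of search.region_pinning.allowed_regions
}

// Returns the list of unmet conditions; an empty list means these two checks pass.
function regionBlockers(state: RegionState): string[] {
  const blockers: string[] = [];
  if (state.projectedProperties < MIN_PROJECTED_PROPERTIES) {
    blockers.push("fewer than 100 published properties projected");
  }
  if (state.allowedRegions.indexOf(state.region) === -1) {
    blockers.push("region missing from search.region_pinning.allowed_regions");
  }
  return blockers;
}
```

The remaining items in the list (acceptance corpus, DR game day, pentest delta) need human review and are not covered by this sketch.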
7. Definition of Done (per release)
Per DEFINITION_OF_DONE.md, every release must show:
- All CI gates green (unit, integration, contract, perf smoke, lint, typecheck, security).
- Coverage gate green (domain ≥ 95 %, application ≥ 90 %, overall ≥ 85 %).
- Migration plan approved (if schema changes) + dry-run successful.
- OpenAPI/AsyncAPI/OpenSearch template diffs reviewed and approved.
- Allow-list audit passed in CI.
- No new critical/high CVEs.
- Image signed; admission policy passes.
- Progressive rollout completed without SLO burn alarm.
- Release notes appended to changelog.
- Post-deploy smoke + synthetics green for 30 min.
- On-call notified.
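The coverage gate in the list above can be expressed as a simple threshold comparison. This is a minimal sketch with hypothetical names; how coverage is actually measured and reported per layer is not specified here.

```typescript
// Thresholds from the release checklist, in percent.
const COVERAGE_THRESHOLDS = { domain: 95, application: 90, overall: 85 };

type Layer = keyof typeof COVERAGE_THRESHOLDS;

// Returns one message per layer that falls below its threshold.
// A value exactly at the threshold passes, matching the "≥" in the checklist.
function coverageFailures(actual: Record<Layer, number>): string[] {
  return (Object.keys(COVERAGE_THRESHOLDS) as Layer[])
    .filter((layer) => actual[layer] < COVERAGE_THRESHOLDS[layer])
    .map((layer) => layer + ": " + actual[layer] + " % < " + COVERAGE_THRESHOLDS[layer] + " %");
}
```

A CI step could call `coverageFailures` with the measured percentages and fail the build when the returned list is non-empty.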
8. Sign-offs (template)
| Role | Name | Date | Notes |
|---|---|---|---|
| Service owner | … | … | |
| Platform owner | … | … | |
| Security reviewer | … | … | |
| SRE on-call lead | … | … | |
| DPO / legal (privacy) | … | … | |
| Compliance | … | … | sanctions / data residency |
| Engineering manager | … | … | |
A single ❌ above blocks region launch.