SERVICE_READINESS — reservation-service
Sibling: SERVICE_RISK_REGISTER · TESTING_STRATEGY · OBSERVABILITY
Strategic anchor: standards/SERVICE_TEMPLATE §Service readiness gate · standards/DEFINITION_OF_DONE
A service is ready for production only when every box below is green. Tech lead and SRE both sign off; the tenant-pilot launch is gated on this checklist plus a successful 30-minute, 5%-traffic canary in staging.
1. Documentation completeness
- All 17 bundle files present and substantive (no stub headings).
- SERVICE_OVERVIEW reviewed by domain product owner.
- DOMAIN_MODEL, API_CONTRACTS, EVENT_SCHEMAS reviewed by every consuming service team (`inventory`, `pricing`, `payment-gateway`, `lock-integration`, `notification`, `billing`, `housekeeping`, `analytics`, `audit`, `search-aggregation`, `sync`, both BFFs).
- SECURITY_MODEL reviewed by Security and signed off.
- SYNC_CONTRACT reviewed by sync/desktop team and tested against `sync-service`.
- AI_INTEGRATION reviewed by `ai-orchestrator-service` team; AIProvenance fields persisted on every AI-touched value.
- FAILURE_MODES entries each have a runbook in `runbooks/reservation/`.
- 03-microservices catalog summary up to date.
2. Code quality
- ESLint domain layer import-restriction passes (no NestJS, no Drizzle, no I/O imports under `src/domain/`).
- `tsc --noEmit` strict, zero errors.
- No `any`, no `// @ts-ignore`, no `// eslint-disable` outside reviewed adapter shims.
- Public API surface (controllers, ports, DTOs) has TSDoc comments.
- Conventional commit history clean.
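
The domain-layer import restriction could be expressed with ESLint's built-in `no-restricted-imports` rule. The following is a minimal sketch of a flat-config entry, assuming NestJS packages are imported as `@nestjs/*`, Drizzle as `drizzle-orm`, and that the Node built-ins listed are what this service counts as I/O (that list is illustrative, not taken from the service's actual config):

```typescript
// Sketch of a flat-config entry scoped to the domain layer.
// The module lists below are assumptions; adjust them to the service's rules.
const domainImportGuard = {
  files: ["src/domain/**/*.ts"],
  rules: {
    "no-restricted-imports": [
      "error",
      {
        // Exact module names treated as I/O (assumed list).
        paths: ["fs", "node:fs", "net", "node:net", "http", "node:http"],
        // Framework and ORM packages, matched with gitignore-style patterns.
        patterns: [
          {
            group: ["@nestjs/*", "drizzle-orm", "drizzle-orm/*"],
            message: "src/domain/ must stay free of framework, ORM, and I/O imports.",
          },
        ],
      },
    ],
  },
};

// In an ESLint config file, this array would be the default export.
const eslintConfig = [domainImportGuard];
```

Scoping the rule through `files` keeps adapters and controllers free to import NestJS and Drizzle while the domain layer stays pure.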
3. Test coverage and gates
- Overall coverage ≥ 85% statements; ≥ 80% branches.
- Domain layer coverage ≥ 95% statements; 100% on the state machine.
- All unit tests in TESTING_STRATEGY §2 and §3 green.
- Mandatory three integration tests pass on every PR: `tenant-isolation.spec.ts`, `outbox.spec.ts`, `inbox.spec.ts`.
- Booking saga happy path + 8 compensation paths (C1–C8) all green.
- Concurrency suite green (check-in race, cancel-vs-modify, group partial cancel, hold-expiry race).
- FX snapshot stability test green (including IRR magnitude).
- Cash-on-arrival flow test green.
- Pact contracts published; consumer + provider sides verified by Pact broker.
- OpenAPI snapshot diff: no breaking changes without major version bump.
- Event schema registry conformance green for every produced subject.
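
The coverage thresholds above can be checked mechanically in CI. A minimal sketch follows; the `CoverageReport` shape is hypothetical (not the output format of any particular coverage tool), and it assumes the state-machine requirement of 100% refers to statements:

```typescript
// Coverage percentages for one scope (hypothetical report shape).
interface Coverage {
  statements: number;
  branches: number;
}

interface CoverageReport {
  overall: Coverage;
  domain: Coverage;
  stateMachine: Coverage;
}

// Returns the violated thresholds; an empty list means the gate is green.
function checkCoverageGate(report: CoverageReport): string[] {
  const failures: string[] = [];
  if (report.overall.statements < 85) failures.push("overall statements < 85%");
  if (report.overall.branches < 80) failures.push("overall branches < 80%");
  if (report.domain.statements < 95) failures.push("domain statements < 95%");
  // Assumption: "100% on the state machine" is read as statement coverage.
  if (report.stateMachine.statements < 100) failures.push("state machine statements < 100%");
  return failures;
}
```

A CI step would fail the build whenever the returned list is non-empty and print each entry.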
4. API and event hygiene
- All endpoints under `/api/v1/reservations/*`; URI versioned only.
- All mutating endpoints accept `Idempotency-Key` and dedupe over 24 h.
- All endpoints support `If-Match: "v<n>"` for OCC.
- All endpoints emit Problem+JSON errors with canonical `MELMASTOON.RESERVATION.*` codes.
- All produced subjects follow `melmastoon.reservation.<aggregate>.<verb-past-tense>.v1`.
- Every consumed subject has an inbox handler with dedupe and a DLQ binding.
- Every event envelope carries `tenantId`, `traceparent`, `causationId` where applicable.
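
To illustrate the `Idempotency-Key` and `If-Match` items, here is a minimal sketch of both checks. The in-memory `IdempotencyStore` is a stand-in for whatever persistent store the service uses, and all names are hypothetical:

```typescript
// Parses an If-Match header of the form "v<n>" (quotes included) into n.
// Returns null when the header is missing or malformed.
function parseIfMatch(header: string | undefined): number | null {
  if (header === undefined) return null;
  const match = /^"v(\d+)"$/.exec(header.trim());
  return match ? Number(match[1]) : null;
}

const DEDUPE_WINDOW_MS = 24 * 60 * 60 * 1000; // 24 h, per the checklist

// In-memory stand-in for the idempotency store (a real service would persist this).
class IdempotencyStore<T> {
  private readonly entries = new Map<string, { storedAt: number; response: T }>();

  // Runs `handler` once per key within the window and replays the stored
  // response for repeats; after the window elapses the handler runs again.
  execute(key: string, now: number, handler: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && now - hit.storedAt < DEDUPE_WINDOW_MS) {
      return hit.response;
    }
    const response = handler();
    this.entries.set(key, { storedAt: now, response });
    return response;
  }
}
```

In a multi-tenant service the dedupe key would presumably combine the tenant with the header value, so that two tenants reusing the same `Idempotency-Key` never see each other's responses.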
5. Storage and migrations
- All tables under `reservation` schema have `tenant_id`, an RLS policy `<table>_tenant_isolation`, and a leading `tenant_id` index.
- No table has cross-tenant foreign keys.
- Append-only audit (`reservation_modifications`) has no UPDATE/DELETE grants to the application role.
- Outbox + inbox dedupe tables present.
- Migrations are backwards-compatible; no destructive change in the same release as a writer removal.
- Postgres connection middleware sets `app.tenant_id` per request and is covered by `tenant-isolation.spec.ts`.
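
The per-table tenant-isolation requirements map onto a small set of Postgres statements. The sketch below generates them for one table; it assumes `tenant_id` is a `uuid` column and uses a hypothetical index name, neither of which is stated in this checklist:

```typescript
// Builds the tenant-isolation statements for one table in the `reservation`
// schema. Assumes tenant_id is a uuid column and that table names are
// trusted migration constants (still validated defensively).
function tenantIsolationSql(table: string): string[] {
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`unexpected table name: ${table}`);
  }
  const qualified = `reservation.${table}`;
  const tenantMatch = "tenant_id = current_setting('app.tenant_id')::uuid";
  return [
    `ALTER TABLE ${qualified} ENABLE ROW LEVEL SECURITY;`,
    `CREATE POLICY ${table}_tenant_isolation ON ${qualified} USING (${tenantMatch}) WITH CHECK (${tenantMatch});`,
    // Index name is hypothetical; tenant_id is the leading (here only) column.
    `CREATE INDEX ${table}_tenant_id_idx ON ${qualified} (tenant_id);`,
  ];
}

// Statement a connection middleware could run at the start of each request's
// transaction; the third argument `true` scopes the setting to that transaction.
const SET_TENANT_SQL = "SELECT set_config('app.tenant_id', $1, true)";
```

Using `set_config` with the transaction-local flag means a pooled connection does not carry one tenant's setting into the next request.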
6. Security
- No payment processor or lock vendor SDKs imported (CI dependency-graph guard).
- Field-level encryption applied to `guests.email`, `guests.phone_e164` with per-tenant DEK.
- Hash-for-search (`email_hash`, `phone_e164_hash`) populated on every write.
- All secrets via Secret Manager + Workload Identity; no SA keys in deploy artifacts.
- Security review (security-reviewer) signed off.
- Threat model entries in SECURITY_MODEL §9 all have mitigations implemented.
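
As an illustration of the SDK import guard, the sketch below checks the dependencies declared in a package manifest against a denylist. It covers direct dependencies only, whereas a real dependency-graph guard would also walk transitive ones. The package names are placeholders, not the actual forbidden SDKs:

```typescript
// Hypothetical denylist: placeholder names, not real vendor SDK packages.
const FORBIDDEN_PACKAGES = ["example-payment-sdk", "example-lock-vendor-sdk"];

// The subset of package.json this check reads.
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns every denylisted package declared in the manifest, sorted so the
// CI output is stable between runs.
function findForbiddenDependencies(manifest: PackageManifest, denylist: string[]): string[] {
  const declared = new Set([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
  ]);
  return denylist.filter((name) => declared.has(name)).sort();
}
```

A CI job would fail when the returned list is non-empty, which keeps payment and lock integrations behind their dedicated services.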
7. Observability
- OpenTelemetry initialized before NestFactory in `main.ts` (verified by smoke test).
- `tenant_id`, `trace_id`, `request_id` present on every log record (verified by structured-log lint).
- All SLIs in OBSERVABILITY §3 emit metrics with the documented tags.
- Three dashboards (service health, booking funnel, operations) present and reviewed.
- All alerts in OBSERVABILITY §6 configured with paged routes and named runbooks.
- Synthetic checks live (`POST /quotes`, `POST /holds → confirm`, `GET /internal/health`).
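
The structured-log lint mentioned above could work along these lines. This is a sketch that assumes logs are emitted as newline-delimited JSON with the three fields as top-level string properties; the function names are hypothetical:

```typescript
const REQUIRED_LOG_FIELDS = ["tenant_id", "trace_id", "request_id"] as const;

// Returns the required fields that are missing or empty on one parsed record.
function missingLogFields(record: Record<string, unknown>): string[] {
  return REQUIRED_LOG_FIELDS.filter((field) => {
    const value = record[field];
    return typeof value !== "string" || value.length === 0;
  });
}

// Lints newline-delimited JSON output; returns one message per offending line.
function lintLogLines(ndjson: string): string[] {
  const problems: string[] = [];
  ndjson.split("\n").forEach((line, index) => {
    if (line.trim() === "") return; // ignore blank lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      problems.push(`line ${index + 1}: not valid JSON`);
      return;
    }
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      problems.push(`line ${index + 1}: not a JSON object`);
      return;
    }
    const missing = missingLogFields(parsed as Record<string, unknown>);
    if (missing.length > 0) {
      problems.push(`line ${index + 1}: missing ${missing.join(", ")}`);
    }
  });
  return problems;
}
```

Running this over the log output captured during the integration suite would flag any code path that logs outside a request context.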
8. Deployment
- Cloud Run service deployed to `me-central1` (primary) and `asia-south1` (secondary if region-pinned tenants exist).
- Hold-expiry worker deployed as separate Cloud Run service with single-replica pin and Cloud Scheduler trigger every 30 s.
- Min 3 replicas on the API service; HPA configured.
- VPC Service Controls perimeter membership confirmed.
- Workload Identity mappings verified for both services.
- Canary deploy: 5% / 30 min in staging completed without alert ladder firing.
- Rollback rehearsed and recorded.
9. Desktop / sync
- `sync-service` integration test exercises pull and push for `reservation`, `reservation_item`, `guest`, `additional_guest`, `special_request`, `reservation_modification`.
- Conflict-policy table in SYNC_CONTRACT §2 implemented end-to-end and verified by an offline-then-reconnect drill.
- Walk-in offline path tested (client-issued `rsv_d_` ID → server canonical `rsv_` ID mapping).
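
The walk-in ID mapping can be pictured as an idempotent lookup: the same client-issued ID must always resolve to the same canonical ID, even if the desktop client retries its push after reconnecting. The sketch below is in-memory and uses hypothetical names; how the server actually generates canonical suffixes is not specified here, so it is injected:

```typescript
const CLIENT_PREFIX = "rsv_d_";
const CANONICAL_PREFIX = "rsv_";

// Maps client-issued offline IDs (rsv_d_...) to server-canonical IDs (rsv_...).
class ReservationIdMapper {
  private readonly canonicalByClientId = new Map<string, string>();
  private readonly nextSuffix: () => string;

  // `nextSuffix` supplies the unique part of each canonical ID (assumed to be
  // provided by the server's own ID generator).
  constructor(nextSuffix: () => string) {
    this.nextSuffix = nextSuffix;
  }

  // Returns the canonical ID for a client ID, allocating one on first sight
  // and replaying the same value on every retry.
  resolve(clientId: string): string {
    if (!clientId.startsWith(CLIENT_PREFIX)) {
      throw new Error(`not a client-issued reservation ID: ${clientId}`);
    }
    const known = this.canonicalByClientId.get(clientId);
    if (known !== undefined) return known;
    const canonical = `${CANONICAL_PREFIX}${this.nextSuffix()}`;
    this.canonicalByClientId.set(clientId, canonical);
    return canonical;
  }
}
```

Because `rsv_d_` itself begins with `rsv_`, any check that distinguishes the two kinds of ID has to test for the longer client prefix first.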
10. AI
- Every AI capability documented in AI_INTEGRATION.
- AIProvenance persisted on every AI-derived value.
- HITL flow for `auto_block` anomaly verdicts implemented; audit-traceable.
- Fallbacks (cloud failure → no-op or edge) demonstrated under chaos test.
11. Operational
- On-call rotation assigned; PagerDuty escalations configured.
- Tenant-pilot success criteria recorded; first-tenant rollback plan written.
- Cost dashboards (Cloud Run, Cloud SQL share, Pub/Sub egress, KMS, AI orchestrator spend) include `service=reservation` filter.
- Cost guardrails set (budget alerts at 50/75/100/110% of monthly target).
- Backup / point-in-time restore tested for the `reservation` schema (RPO 5 min, RTO 30 min).
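
The budget-alert ladder amounts to a threshold comparison against the monthly target. A small sketch with hypothetical function and constant names, using integer-style arithmetic to avoid floating-point surprises at the exact boundaries:

```typescript
// Alert thresholds as percentages of the monthly target, per the checklist.
const BUDGET_THRESHOLDS_PERCENT = [50, 75, 100, 110];

// Returns the thresholds that month-to-date spend has reached or exceeded.
function crossedBudgetThresholds(spend: number, monthlyTarget: number): number[] {
  if (monthlyTarget <= 0) throw new Error("monthlyTarget must be positive");
  // Compare spend * 100 against threshold * target instead of dividing first,
  // so a spend exactly at a threshold is not lost to rounding.
  return BUDGET_THRESHOLDS_PERCENT.filter((threshold) => spend * 100 >= threshold * monthlyTarget);
}
```

For a monthly target of 1000, a spend of 800 crosses the 50% and 75% alerts, and a spend of 1100 crosses all four.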
12. Sign-off
| Role | Name | Date |
|---|---|---|
| Tech lead (PMS core) | _________ | _________ |
| SRE on-call lead | _________ | _________ |
| Security reviewer | _________ | _________ |
| Domain product owner | _________ | _________ |
| AI orchestrator team | _________ | _________ |
| Sync/desktop team | _________ | _________ |
A green checklist plus all six signatures unlocks prod-ready status. Without all of them, traffic stays at 0% in production.
13. Cross-references
- Definition of Done: standards/DEFINITION_OF_DONE
- Service template gate: standards/SERVICE_TEMPLATE
- Risk register: SERVICE_RISK_REGISTER