SERVICE_READINESS — bff-tenant-booking-service
Sibling: SERVICE_OVERVIEW · SERVICE_RISK_REGISTER · DEPLOYMENT_TOPOLOGY · TESTING_STRATEGY
Cross-cutting: Standards · DEFINITION_OF_DONE · Standards · SERVICE_TEMPLATE
This is the production-readiness gate for bff-tenant-booking-service. Every checkbox must be green before prod traffic. Owned by Frontend Platform tech lead + SRE on-call. A signed copy is filed in services/bff-tenant-booking-service/_readiness/<release>.md.
1. Documentation completeness
- All 17 specs in this folder complete (no
TBD). - 03-microservices/bff-tenant-booking-service.md up to date.
- OpenAPI generated and committed.
- Event schemas registered in
@ghasi/event-envelope/schemas/bff-tenant/. - All ADRs that affect this BFF linked from SERVICE_OVERVIEW.
2. Code quality
-
pnpm lintclean. -
pnpm typecheckclean. - No
anyoutside justified// allow-any. - No
as unknown asoutside test code.
3. Test coverage
- Unit ≥ 90% statements / 85% branches.
- Critical-file coverage 100% (state machine, HandoffVerifier, payment-return, hold, idempotency, single-flight).
- Integration tests pass against ephemeral stack.
- Mandatory
tenant-isolation.spec.tspasses. - Mandatory
outbox.spec.tsandinbox.spec.tspass. - Pact consumer pacts published; verification reports green for all upstreams.
- Pact provider pact verified for
bff-consumer-service's consumer. - Stryker mutation score ≥ 75% on critical files.
- Playwright E2E nightly green for: happy path, handoff arrival, abandonment, cash-on-arrival, RTL switch, currency switch, custom domain, payment-return idempotency, handoff replay rejection.
4. Performance
- k6 steady-state passes (p95 < 600 ms; error < 0.1%).
- k6 flash-sale passes (p95 < 1 s warm; cache hit > 90%).
- k6 booking burst passes (confirm p95 < 1.5 s; success > 99%).
- Long-soak passes 8 h.
-
/bootstrapp95 < 350 ms warm. -
/confirmcorrectness: zero double-confirms in chaos drill.
5. Observability
- All SLIs emitting; SLOs declared.
- Dashboards: Funnel health, Service SLO, Booking flow, Per-tenant.
- Alerts have ack'd runbooks.
- Trace-tag coverage verified (tenant.id, route, draft.id, handoff.id, payment.intent.id).
- Log fields verified.
- PII filter verified (no raw email/phone/name in logs or telemetry).
6. Security
- Threat model reviewed (
SECURITY_MODEL §14). - Secrets in Secret Manager only.
- HMAC key rotation drill done in stage in last 90 days; previous-key 7-day overlap honored.
- Cloud Armor active; bot rules enabled.
- reCAPTCHA Enterprise verified end-to-end.
- DAST report has zero high/critical.
-
pnpm auditclean. - Trivy scan clean.
- Cosign signature verified by binary authorization.
- Cookie attributes verified (
HttpOnly; Secure; SameSite=Lax; Path=/api). - CORS allow-list verified for prod consumer-tenant origins.
- CSP nonce uniqueness verified under load.
- Penetration test signed off in last 12 months.
- Tenant-isolation tests pass at all layers.
7. Reliability
- Cloud Run min instances = 3 / region.
- Multi-region: primary
asia-south1, DR-warmeurope-west4. - DR drill executed in stage in last 90 days; RTO ≤ 30 min.
- Circuit breakers configured for every upstream.
- Per-route deadline + retry policy reviewed.
- Two Memorystore tiers (cache + session) configured with HA.
- Cloud SQL HA + cross-region read replica.
- Custom-domain TLS auto-renewal verified for canary tenant.
8. Release process
- CI: lint, typecheck, unit, integration, contract, build, scan, sign, deploy-dev, smoke.
- Canary 5% / 25% / 100% with metric guardrails.
- Rollback budget ≤ 5 min.
- Feature flags documented; default off.
- Release notes drafted.
9. Operations
- On-call rotation assigned (Frontend Platform).
- PagerDuty escalation policy verified.
- Runbooks present for: F-1, F-6, F-7, F-9, F-15, F-16, F-26, F-28 (per FAILURE_MODES).
- Cost dashboard with budget alerts at 50/80/100/120%.
- On-call handoff doc in this folder.
- Backup + restore tested for Cloud SQL.
10. Compliance / data governance
- PII inventory in SECURITY_MODEL §11 reviewed by data steward.
- DPIA filed for booking-time PII collection.
- Cookie consent flow integrated with tenant booking app (no telemetry until consent in EU).
- Data retention enforced:
BookingDraftSnapshot30 d,handoff_arrival_log30 d,idempotency24 h. - Sharia-compliant tenants flagged with
complianceProfile; AI suggestions filtered accordingly.
11. Sign-off
| Role | Name | Date | Signature |
|---|---|---|---|
| Service tech lead (Frontend Platform) | |||
| SRE on-call (rotating) | |||
| Security reviewer | |||
| Data steward | |||
| Product manager (booking flow owner) | |||
| Eng manager / Director |
A snapshot of this checklist is committed to services/bff-tenant-booking-service/_readiness/<release-tag>.md.