
SERVICE_READINESS — bff-tenant-booking-service

Sibling: SERVICE_OVERVIEW · SERVICE_RISK_REGISTER · DEPLOYMENT_TOPOLOGY · TESTING_STRATEGY

Cross-cutting: Standards · DEFINITION_OF_DONE · SERVICE_TEMPLATE

This is the production-readiness gate for bff-tenant-booking-service. Every checkbox must be green before prod traffic. Owned by the Frontend Platform tech lead + SRE on-call. A signed copy is filed in services/bff-tenant-booking-service/_readiness/<release-tag>.md.

1. Documentation completeness

  • All 17 specs in this folder complete (no TBD).
  • 03-microservices/bff-tenant-booking-service.md up to date.
  • OpenAPI generated and committed.
  • Event schemas registered in @ghasi/event-envelope/schemas/bff-tenant/.
  • All ADRs that affect this BFF linked from SERVICE_OVERVIEW.

2. Code quality

  • pnpm lint clean.
  • pnpm typecheck clean.
  • No any outside lines justified with a // allow-any comment.
  • No as unknown as casts outside test code.

3. Test coverage

  • Unit ≥ 90% statements / 85% branches.
  • Critical-file coverage 100% (state machine, HandoffVerifier, payment-return, hold, idempotency, single-flight).
  • Integration tests pass against ephemeral stack.
  • Mandatory tenant-isolation.spec.ts passes.
  • Mandatory outbox.spec.ts and inbox.spec.ts pass.
  • Pact consumer pacts published; verification reports green for all upstreams.
  • Provider-side Pact verification green for the pact published by bff-consumer-service's consumer.
  • Stryker mutation score ≥ 75% on critical files.
  • Playwright E2E nightly green for: happy path, handoff arrival, abandonment, cash-on-arrival, RTL switch, currency switch, custom domain, payment-return idempotency, handoff replay rejection.

4. Performance

  • k6 steady-state passes (p95 < 600 ms; error < 0.1%).
  • k6 flash-sale passes (p95 < 1 s warm; cache hit > 90%).
  • k6 booking burst passes (confirm p95 < 1.5 s; success > 99%).
  • Long-soak passes 8 h.
  • /bootstrap p95 < 350 ms warm.
  • /confirm correctness: zero double-confirms in chaos drill.
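
The zero-double-confirm requirement above rests on the idempotency and single-flight layers named as critical files in section 3. The following is an added example, not the service's own code, of the behaviour those layers need and that the chaos drill should exercise: requests are keyed by tenant plus the client-supplied Idempotency-Key, concurrent duplicates coalesce onto one in-flight execution, a completed result is replayed for the 24 h retention window given in section 10, and a key reused with a different payload is rejected rather than replayed. IdempotentExecutor, ConfirmResult and the in-memory store are assumptions; a multi-instance deployment needs a shared store, which this page does not specify.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape; the service's real confirm handler is not on this page.
export interface ConfirmResult { bookingId: string; status: "confirmed" }

type Entry<T> =
  | { state: "inflight"; fingerprint: string; promise: Promise<T> }
  | { state: "done"; fingerprint: string; value: T; expiresAt: number };

export class IdempotencyConflictError extends Error {
  constructor(key: string) { super(`Idempotency-Key ${key} reused with a different payload`); }
}

export class IdempotentExecutor<T> {
  private entries = new Map<string, Entry<T>>();
  // 24 h default matches the idempotency retention in section 10; the clock is injectable for tests.
  constructor(private ttlMs = 24 * 60 * 60 * 1000, private now: () => number = Date.now) {}

  private fingerprint(payload: unknown): string {
    // JSON.stringify is key-order sensitive; production code should canonicalise before hashing.
    return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
  }

  async run(tenantId: string, key: string, payload: unknown, work: () => Promise<T>): Promise<T> {
    const id = `${tenantId}:${key}`; // tenant-scoped so two tenants never collide on a client-chosen key
    const fp = this.fingerprint(payload);
    let existing = this.entries.get(id);
    if (existing?.state === "done" && existing.expiresAt <= this.now()) {
      this.entries.delete(id); // retention expired: the key may be used again
      existing = undefined;
    }
    if (existing) {
      if (existing.fingerprint !== fp) throw new IdempotencyConflictError(key);
      return existing.state === "inflight" ? existing.promise : existing.value; // single-flight or replay
    }
    const promise = work().then(
      (value) => {
        this.entries.set(id, { state: "done", fingerprint: fp, value, expiresAt: this.now() + this.ttlMs });
        return value;
      },
      (err) => {
        this.entries.delete(id); // failures are not memoised, so the client may retry the same key
        throw err;
      },
    );
    this.entries.set(id, { state: "inflight", fingerprint: fp, promise });
    return promise;
  }
}

// 50 concurrent identical confirms: the work runs exactly once and every caller gets the same booking.
export async function demo(): Promise<{ executions: number; distinctBookingIds: number }> {
  const exec = new IdempotentExecutor<ConfirmResult>();
  let executions = 0;
  const work = async (): Promise<ConfirmResult> => {
    executions++;
    await new Promise((r) => setTimeout(r, 10)); // stands in for upstream latency
    return { bookingId: `bk_${executions}`, status: "confirmed" };
  };
  const payload = { draftId: "d1", amount: 100 }; // constructed sample request
  const results = await Promise.all(
    Array.from({ length: 50 }, () => exec.run("tenant-a", "key-1", payload, work)),
  );
  return { executions, distinctBookingIds: new Set(results.map((r) => r.bookingId)).size };
}

// Replay inside the window, conflict on a changed payload, fresh execution once retention has lapsed.
export async function replayAfterTtl(): Promise<{ executions: number; conflictRejected: boolean }> {
  let clock = 0;
  const exec = new IdempotentExecutor<ConfirmResult>(24 * 60 * 60 * 1000, () => clock);
  let executions = 0;
  const work = async () => ({ bookingId: `bk_${++executions}`, status: "confirmed" as const });
  await exec.run("t", "k", { draftId: "d1" }, work);
  await exec.run("t", "k", { draftId: "d1" }, work); // replayed from the store, no execution
  let conflictRejected = false;
  try { await exec.run("t", "k", { draftId: "OTHER" }, work); }
  catch (e) { conflictRejected = e instanceof IdempotencyConflictError; }
  clock = 24 * 60 * 60 * 1000 + 1; // past retention: the key is forgotten and the work runs again
  await exec.run("t", "k", { draftId: "d1" }, work);
  return { executions, conflictRejected };
}
```

Failed attempts are deliberately not memoised so a client can retry with the same key; memoising a failure would turn a transient upstream error into a permanent one for that draft.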

5. Observability

  • All SLIs emitting; SLOs declared.
  • Dashboards: Funnel health, Service SLO, Booking flow, Per-tenant.
  • Alerts have ack'd runbooks.
  • Trace-tag coverage verified (tenant.id, route, draft.id, handoff.id, payment.intent.id).
  • Log fields verified.
  • PII filter verified (no raw email/phone/name in logs or telemetry).
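
An added example of the kind of filter and check behind the PII item above; it is not the service's logging code, and the field names, the free-text fields and the pepper are assumptions. PII fields are replaced with a keyed hash so one guest still maps to one token across lines (correlation survives, the value does not), free-text fields are scrubbed for email and phone patterns, and a second function scans emitted JSON lines for anything that still looks like raw PII, which is the verification the checklist asks for.

```typescript
import { createHmac } from "node:crypto";

// Assumed field names and free-text fields; the service's real log schema is not on this page.
const PII_KEYS = new Set(["email", "phone", "name", "guestEmail", "guestPhone", "guestName"]);
const FREE_TEXT_KEYS = new Set(["msg", "message", "error"]);
const EMAIL_RE = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/gi;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g; // loose; applied only to free text, never to id fields
// Hypothetical pepper: in production it would come from Secret Manager, never from source.
const PEPPER = Buffer.from("example-pepper-not-for-production");

// Keyed hash: the same guest yields the same token on every line, so incidents can still be correlated.
export function pseudonymise(value: string): string {
  return "pii:" + createHmac("sha256", PEPPER).update(value.trim().toLowerCase()).digest("hex").slice(0, 16);
}

export function scrubText(text: string): string {
  return text.replace(EMAIL_RE, "[email]").replace(PHONE_RE, "[phone]");
}

// Walks a structured event: PII keys are pseudonymised, free text is scrubbed, everything else passes through.
export function redact(value: unknown, key = ""): unknown {
  if (Array.isArray(value)) return value.map((v) => redact(v, key));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) out[k] = redact(v, k);
    return out;
  }
  if (PII_KEYS.has(key) && value != null) return pseudonymise(String(value));
  if (FREE_TEXT_KEYS.has(key) && typeof value === "string") return scrubText(value);
  return value;
}

function hasRawPii(value: unknown, key = ""): boolean {
  if (Array.isArray(value)) return value.some((v) => hasRawPii(v, key));
  if (value !== null && typeof value === "object") {
    return Object.entries(value).some(([k, v]) => hasRawPii(v, k));
  }
  if (PII_KEYS.has(key)) return value != null && !String(value).startsWith("pii:");
  return typeof value === "string" && value.search(EMAIL_RE) !== -1;
}

// Readiness check: how many emitted JSON lines still carry raw PII.
export function findLeaks(lines: string[]): number {
  return lines.filter((line) => hasRawPii(JSON.parse(line))).length;
}

export function demo(): { before: number; after: number; sameGuestSameToken: boolean } {
  // Constructed sample events, not captured from the real service.
  const events = [
    { level: "info", msg: "booking confirmed for ravi@example.com", tenant: { id: "t-42" }, draft: { id: "d-7" },
      guest: { email: "ravi@example.com", phone: "+91 98765 43210", name: "Ravi K" } },
    { level: "warn", msg: "payment callback retry, caller +1 (415) 555-0133", guest: { email: "Ravi@Example.com" } },
  ];
  const redacted = events.map((e) => redact(e) as { guest: { email: string } });
  return {
    before: findLeaks(events.map((e) => JSON.stringify(e))),
    after: findLeaks(redacted.map((e) => JSON.stringify(e))),
    sameGuestSameToken: redacted[0].guest.email === redacted[1].guest.email,
  };
}
```

Key-name matching is coarse (a tenant display name that lands under name would also be pseudonymised); an allow-list of known-safe fields is the stricter option. The phone pattern is restricted to free text because over an id field it would also eat numeric identifiers and epoch timestamps.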

6. Security

  • Threat model reviewed (SECURITY_MODEL §14).
  • Secrets in Secret Manager only.
  • HMAC key rotation drill done in stage in last 90 days; previous-key 7-day overlap honored.
  • Cloud Armor active; bot rules enabled.
  • reCAPTCHA Enterprise verified end-to-end.
  • DAST report has zero high/critical.
  • pnpm audit clean.
  • Trivy scan clean.
  • Cosign image signature verified by Binary Authorization at deploy.
  • Cookie attributes verified (HttpOnly; Secure; SameSite=Lax; Path=/api).
  • CORS allow-list verified for prod consumer-tenant origins.
  • CSP nonce uniqueness verified under load.
  • Penetration test signed off in last 12 months.
  • Tenant-isolation tests pass at all layers.

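The cookie, CORS and CSP items above are all checkable from captured responses. Below is an added example of such a check, not the project's test code: it parses every Set-Cookie for the required attribute set, confirms an allow-listed origin is echoed and a non-allow-listed one is not, and tracks CSP nonces across a batch so the "unique under load" property is tested over N responses rather than one. The allow-listed origins, header names and sample responses are constructed for the example.

```typescript
export interface Sample {
  origin?: string;                             // request Origin header, if the call was cross-origin
  headers: Record<string, string | string[]>;  // response headers, lower-case names
}

// Assumed prod allow-list; the real one comes from tenant configuration, not from code.
const PROD_ORIGINS = new Set(["https://book.tenant-one.example", "https://book.tenant-two.example"]);

export function parseSetCookie(header: string): { name: string; attrs: Map<string, string> } {
  const [pair, ...rest] = header.split(";");
  const eq = pair.indexOf("=");
  const attrs = new Map<string, string>();
  for (const a of rest) {
    const i = a.indexOf("=");
    attrs.set((i === -1 ? a : a.slice(0, i)).trim().toLowerCase(), i === -1 ? "" : a.slice(i + 1).trim());
  }
  return { name: (eq === -1 ? pair : pair.slice(0, eq)).trim(), attrs };
}

// The attribute set the readiness item requires: HttpOnly; Secure; SameSite=Lax; Path=/api.
export function cookieFindings(header: string): string[] {
  const { name, attrs } = parseSetCookie(header);
  const out: string[] = [];
  if (!attrs.has("httponly")) out.push(`cookie ${name}: missing HttpOnly`);
  if (!attrs.has("secure")) out.push(`cookie ${name}: missing Secure`);
  if ((attrs.get("samesite") ?? "").toLowerCase() !== "lax") out.push(`cookie ${name}: SameSite must be Lax`);
  if (attrs.get("path") !== "/api") out.push(`cookie ${name}: Path must be /api`);
  return out;
}

export function cspNonces(csp: string): string[] {
  const re = /'nonce-([A-Za-z0-9+\/_=-]+)'/g;
  const out: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(csp)) !== null) out.push(m[1]);
  return out;
}

export function corsFindings(origin: string | undefined, allowOrigin: string | undefined): string[] {
  const out: string[] = [];
  if (allowOrigin === "*") out.push("CORS: wildcard Allow-Origin on a cookie-authenticated API");
  if (origin && PROD_ORIGINS.has(origin) && allowOrigin !== origin) out.push(`CORS: allow-listed origin ${origin} not echoed`);
  if (origin && !PROD_ORIGINS.has(origin) && allowOrigin === origin) out.push(`CORS: reflected non-allow-listed origin ${origin}`);
  return out;
}

const first = (v: string | string[] | undefined) => (Array.isArray(v) ? v[0] : v);

// Runs every check over a batch; nonce uniqueness is judged across the whole batch, which is what
// "under load" means here: capture N responses from the k6 run, feed them in, expect no reuse.
export function audit(samples: Sample[]): string[] {
  const findings: string[] = [];
  const seen = new Map<string, number>(); // nonce -> index of the first response that used it
  samples.forEach((s, i) => {
    const h = s.headers;
    for (const c of ([] as string[]).concat(h["set-cookie"] ?? [])) findings.push(...cookieFindings(c));
    const nonces = cspNonces(first(h["content-security-policy"]) ?? "");
    if (nonces.length === 0) findings.push(`response ${i}: CSP has no nonce`);
    for (const n of nonces) {
      if (Buffer.from(n, "base64").length < 16) findings.push(`response ${i}: CSP nonce shorter than 128 bits`);
      const prev = seen.get(n);
      if (prev !== undefined) findings.push(`response ${i}: CSP nonce reused from response ${prev}`);
      else seen.set(n, i);
    }
    findings.push(...corsFindings(s.origin, first(h["access-control-allow-origin"])));
  });
  return findings;
}

export function demo(): string[] {
  // Constructed sample responses, not captured from the real service.
  const csp = (n: string) => `default-src 'self'; script-src 'self' 'nonce-${n}'`;
  return audit([
    { origin: "https://book.tenant-one.example", headers: {
        "set-cookie": ["sid=abc; Path=/api; HttpOnly; Secure; SameSite=Lax"],
        "content-security-policy": csp("q4Lx9bD1mKp2sT7vWz0aYg=="),
        "access-control-allow-origin": "https://book.tenant-one.example" } },
    { origin: "https://evil.example", headers: {
        "set-cookie": ["sid=abc; Path=/; Secure; SameSite=None"],
        "content-security-policy": csp("q4Lx9bD1mKp2sT7vWz0aYg=="), // same nonce again: a cached template
        "access-control-allow-origin": "https://evil.example" } },
  ]);
}
```

The second constructed response yields five findings (three cookie attributes, one reused nonce, one reflected origin); the first yields none. A reused nonce is the signature of a rendered page or header being served from a cache, which is exactly the failure a single-request check never sees.
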
7. Reliability

  • Cloud Run min instances = 3 / region.
  • Multi-region: primary asia-south1; warm DR in europe-west4.
  • DR drill executed in stage in last 90 days; RTO ≤ 30 min.
  • Circuit breakers configured for every upstream.
  • Per-route deadline + retry policy reviewed.
  • Two Memorystore tiers (cache + session) configured with HA.
  • Cloud SQL HA + cross-region read replica.
  • Custom-domain TLS auto-renewal verified for canary tenant.

8. Release process

  • CI: lint, typecheck, unit, integration, contract, build, scan, sign, deploy-dev, smoke.
  • Canary 5% / 25% / 100% with metric guardrails.
  • Rollback budget ≤ 5 min.
  • Feature flags documented; default off.
  • Release notes drafted.

9. Operations

  • On-call rotation assigned (Frontend Platform).
  • PagerDuty escalation policy verified.
  • Runbooks present for: F-1, F-6, F-7, F-9, F-15, F-16, F-26, F-28 (per FAILURE_MODES).
  • Cost dashboard with budget alerts at 50/80/100/120%.
  • On-call handoff doc in this folder.
  • Backup + restore tested for Cloud SQL.

10. Compliance / data governance

  • PII inventory in SECURITY_MODEL §11 reviewed by data steward.
  • DPIA filed for booking-time PII collection.
  • Cookie consent flow integrated with tenant booking app (no telemetry until consent in EU).
  • Data retention enforced: BookingDraftSnapshot 30 d, handoff_arrival_log 30 d, idempotency 24 h.
  • Sharia-compliant tenants flagged with complianceProfile; AI suggestions filtered accordingly.

11. Sign-off

| Role                                  | Name | Date | Signature |
|---------------------------------------|------|------|-----------|
| Service tech lead (Frontend Platform) |      |      |           |
| SRE on-call (rotating)                |      |      |           |
| Security reviewer                     |      |      |           |
| Data steward                          |      |      |           |
| Product manager (booking flow owner)  |      |      |           |
| Eng manager / Director                |      |      |           |

A snapshot of this checklist is committed to services/bff-tenant-booking-service/_readiness/<release-tag>.md.