Skip to main content

housekeeping-service — SERVICE_READINESS

Production-readiness checklist. Service ships when every mandatory item is checked. Conditional items must be checked or have an ADR-tracked exception.

Legend: [x] ready · [ ] open · (M) mandatory · (C) conditional.


1. Documentation

  • (M) [ ] All 17 service docs present in services/housekeeping-service/.
  • (M) [ ] Public summary present at docs/03-microservices/housekeeping-service.md.
  • (M) [ ] ADRs filed for non-default decisions (HITL gate default, partition strategy, sync conflict policy table).
  • (M) [ ] Runbooks in runbooks/ for every alert in OBSERVABILITY.md §7.
  • (C) [ ] Architecture diagram (docs/03-microservices/housekeeping-service.md) reviewed by platform architect.

2. Code quality

  • (M) [ ] TypeScript strict; noUncheckedIndexedAccess on; 0 type errors.
  • (M) [ ] ESLint 0 errors; security plugin clean.
  • (M) [ ] madge --circular reports no cycles.
  • (M) [ ] Domain layer has zero imports from infrastructure/presentation.

3. Tests

  • (M) [ ] Domain unit tests ≥ 95% lines / ≥ 90% branches.
  • (M) [ ] Application unit tests ≥ 85% lines / ≥ 80% branches.
  • (M) [ ] Contract tests cover 100% of OpenAPI operations.
  • (M) [ ] Integration core suite green (tenant-isolation, outbox-relay, inbox-idempotency, turnover-saga, room-status-state-machine).
  • (M) [ ] Sync conflict-policy specs green for every aggregate.
  • (M) [ ] k6 perf smoke meets SLO targets in staging.
  • (C) [ ] Stryker mutation score ≥ 75% on domain.

4. APIs

  • (M) [ ] openapi.yaml published to central registry; matches controllers.
  • (M) [ ] Every error path returns a valid MELMASTOON.HOUSEKEEPING.* code.
  • (M) [ ] Idempotency-Key enforced on mutating endpoints.
  • (M) [ ] ETag emitted on aggregate reads.
  • (M) [ ] Rate limits configured at gateway and in-process.

5. Events

  • (M) [ ] All 20 published subjects validated against JSON schemas.
  • (M) [ ] All 9 consumed subscriptions configured with OIDC, ACK 60 s, 10 max deliveries, DLQ.
  • (M) [ ] Outbox relay deployed and observed.
  • (M) [ ] Inbox dedup verified via inbox-idempotency.spec.ts.
  • (C) [ ] Schema-evolution rule documented and CI-enforced.

6. Storage

  • (M) [ ] Migrations apply cleanly forward and backward on a fresh DB.
  • (M) [ ] RLS policies present on every table; tenant-isolation.spec.ts green.
  • (M) [ ] pg_partman rotation job scheduled; partitions exist for current + next 3 months.
  • (M) [ ] EXPLAIN shows partition pruning on hot queries.
  • (M) [ ] Backups + PITR configured; restore drill performed in staging.

7. Security

  • (M) [ ] JWT verification (signature + claims) at controller boundary.
  • (M) [ ] OIDC verification on /internal/events/*.
  • (M) [ ] Authorization matrix from SECURITY_MODEL.md §3 wired in @Roles() decorators and tested.
  • (M) [ ] Secret Manager wiring verified; no secrets in env or logs.
  • (M) [ ] gitleaks clean.
  • (C) [ ] Threat model reviewed by Security in current quarter.

8. Observability

  • (M) [ ] SLO file slo.yaml deployed; burn-rate alerts active.
  • (M) [ ] Grafana dashboards in housekeeping folder created from JSON in repo.
  • (M) [ ] Sentry project configured; PagerDuty escalation path set.
  • (M) [ ] Trace propagation verified end-to-end (event → handler → outbox).

9. Deployment

  • (M) [ ] Cloud Run service deployed in staging and prod regions.
  • (M) [ ] Cloud Run Jobs scheduled (4 schedulers + partition rotate + snapshot refresh).
  • (M) [ ] Cloud Deploy pipeline gating on smoke + integration core.
  • (M) [ ] Canary rollout with auto-rollback configured.
  • (C) [ ] DR drill in asia-southeast1 performed in current quarter.

10. Desktop / sync

  • (M) [ ] Sync endpoints deployed; conflict policies match SYNC_CONTRACT.md.
  • (M) [ ] Desktop renderer integration tested against staging.
  • (M) [ ] Cursor expiration → full re-sync verified.
  • (M) [ ] Local SQLite encryption verified on Windows / macOS / Linux builds.

11. AI

  • (M) [ ] Routing port wired; HITL gate default supervisor_approval.
  • (M) [ ] Fallback to manual mode on routing port unavailability verified.
  • (M) [ ] Audit row written on every applied suggestion.

12. Operational

  • (M) [ ] On-call rotation set in PagerDuty housekeeping.
  • (M) [ ] Slack channel #hk-ops with bot integrations.
  • (M) [ ] Runbooks linked from each alert.
  • (M) [ ] Cost guardrails alert configured.
  • (M) [ ] Game day in current quarter that exercised at least one DLQ replay.

13. Compliance

  • (M) [ ] DPIA reviewed for lost-and-found PII handling.
  • (M) [ ] Audit fan-out to audit-service verified.
  • (C) [ ] Tenant-data export path tested (lost-and-found + tasks for a given tenant).

14. Sign-off

RoleNameDate
Service owner
Engineering lead
Security
Platform
Operations on-call lead