SERVICE_READINESS — staff-service
Sibling: SERVICE_OVERVIEW · TESTING_STRATEGY · OBSERVABILITY · SECURITY_MODEL
Standard: docs/standards/DEFINITION_OF_DONE
A go/no-go checklist for promoting staff-service to production. Each row has a status (✅ ready / 🟡 in-progress / ❌ blocked / N/A), an owner, and a verification artifact. The service is not considered production-ready until every required (R) row is ✅.
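The gate described above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `ChecklistRow` shape; the type and function names are illustrative and are not taken from any staff-service tooling. Treating N/A rows as non-blocking is also an assumption, since the checklist does not say how they count.

```typescript
// Hypothetical row shape; field names are illustrative assumptions.
type Status = "ready" | "in-progress" | "blocked" | "n/a";

interface ChecklistRow {
  section: string;
  item: string;
  status: Status;
  required: boolean; // true for rows in sections marked (R)
}

// Returns the required rows that still block promotion.
// Assumption: N/A rows do not block.
function blockingRows(rows: ChecklistRow[]): ChecklistRow[] {
  return rows.filter(
    (r) => r.required && r.status !== "ready" && r.status !== "n/a",
  );
}

function isProductionReady(rows: ChecklistRow[]): boolean {
  return blockingRows(rows).length === 0;
}

// Three rows copied from the tables in this checklist.
const sample: ChecklistRow[] = [
  { section: "Data & Storage", item: "Backup restore drill executed", status: "in-progress", required: true },
  { section: "AI Integration", item: "Bias audit completed quarterly", status: "in-progress", required: false },
  { section: "Security", item: "RBAC matrix implemented", status: "ready", required: true },
];

console.log(isProductionReady(sample)); // false
console.log(blockingRows(sample).map((r) => r.item)); // ["Backup restore drill executed"]
```

Only the required in-progress row blocks; the recommended AI row does not.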
1. Functional Readiness (R)
| Item | Status | Verification |
|---|---|---|
| All 18 documents present and reviewed | ✅ | This bundle |
| All Staff, Shift, ClockEntry, LeaveRequest use cases shipped | ✅ | services/staff-service/src/application/use-cases/* |
| All published events emitted via outbox | ✅ | EVENT_SCHEMAS.md; integration test outbox.spec.ts |
| All consumed events handled with inbox dedupe + DLQ | ✅ | inbox.spec.ts, dlq.spec.ts |
| OpenAPI complete and matches handlers | ✅ | openapi-schema-vs-router.spec.ts |
| BFF contract tests green | ✅ | Pact broker staff-service ↔ bff-backoffice |
| Multi-language (ps, fa, en, ar) labels validated | ✅ | localized-label.spec.ts |
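As an illustration of what "inbox dedupe + DLQ" in the table above usually means, here is a minimal in-memory sketch. It is not the staff-service implementation: the `Inbox` class, the retry limit of 3, and the `example.*` event types are all assumptions made for the example, and a real inbox would persist processed IDs in the database.

```typescript
interface InboundEvent {
  eventId: string;
  type: string;
  payload: unknown;
}

type Handler = (event: InboundEvent) => void;
type Outcome = "handled" | "duplicate" | "retry" | "dead-lettered";

// In-memory stand-in for an inbox table plus a dead-letter queue.
class Inbox {
  private readonly processed = new Set<string>();
  private readonly attempts = new Map<string, number>();
  readonly deadLetters: InboundEvent[] = [];
  private readonly handler: Handler;
  private readonly maxAttempts: number;

  constructor(handler: Handler, maxAttempts = 3) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  receive(event: InboundEvent): Outcome {
    // Dedupe: an event ID that was already settled is never handled again.
    if (this.processed.has(event.eventId)) return "duplicate";
    try {
      this.handler(event);
      this.processed.add(event.eventId);
      this.attempts.delete(event.eventId);
      return "handled";
    } catch {
      const n = (this.attempts.get(event.eventId) ?? 0) + 1;
      this.attempts.set(event.eventId, n);
      if (n < this.maxAttempts) return "retry";
      // Retries exhausted: park the event and mark it settled.
      this.deadLetters.push(event);
      this.processed.add(event.eventId);
      return "dead-lettered";
    }
  }
}

let handled = 0;
const inbox = new Inbox((event) => {
  if (event.type === "example.poison") throw new Error("unprocessable");
  handled++;
});
const ok: InboundEvent = { eventId: "evt-1", type: "example.ok", payload: {} };
const poison: InboundEvent = { eventId: "evt-2", type: "example.poison", payload: {} };
const outcomes: Outcome[] = [
  inbox.receive(ok),
  inbox.receive(ok),
  inbox.receive(poison),
  inbox.receive(poison),
  inbox.receive(poison),
];
console.log(outcomes.join(", ")); // handled, duplicate, retry, retry, dead-lettered
```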
2. Data & Storage (R)
| Item | Status | Verification |
|---|---|---|
| Schema staff deployed with all tables | ✅ | Flyway V001..V032 |
| Row-Level Security enforced on every multi-tenant table | ✅ | rls-isolation.spec.ts (cross-tenant sweep) |
| Append-only triggers on clock_entries, handoff_notes, audit_events | ✅ | append-only-trigger.spec.ts |
| Indexes for hot paths created and analyzed | ✅ | pg_stat_user_indexes review |
| Cloud SQL HA + PITR enabled (prod) | ✅ | Terraform module.staff_db.ha_enabled = true |
| CMEK on all storage layers | ✅ | KMS keyring staff-prod-me |
| BigQuery cold copy + retention 7 y for audit | ✅ | Datastream config |
| Backup restore drill executed | 🟡 | Quarterly drill — last run pending |
3. Security (R)
| Item | Status | Verification |
|---|---|---|
| RBAC matrix implemented | ✅ | rbac.spec.ts |
| ABAC overlays (property + department) enforced | ✅ | abac-property.spec.ts |
| PIN HMAC + KMS pepper rotation tested | ✅ | pin-pepper-rotation.spec.ts |
| Field-level encryption (emergency_contact) verified | ✅ | field-level-crypto.spec.ts |
| PIN brute-force lockout active | ✅ | Manual + pin-brute-force.spec.ts |
| DSAR export job operational; benchmarked < 24 h SLA | ✅ | dsar-export-benchmark.spec.ts |
| Threat model reviewed | ✅ | SECURITY_MODEL §10; security review sign-off |
| Penetration test passed (no Critical / High open) | ✅ | Q3-26 pentest report pen-2026-q3-staff.pdf |
| Secrets only in Secret Manager (no env-baked secrets) | ✅ | CI no-baked-secrets.spec.ts |
4. Observability (R)
| Item | Status | Verification |
|---|---|---|
| All SLIs reported to Cloud Monitoring | ✅ | OBSERVABILITY §3 |
| Dashboards published (staff/overview, clock-in, outbox, inbox, leave, ai) | ✅ | Cloud Monitoring console |
| Pager-grade alerts wired with PagerDuty | ✅ | OBSERVABILITY §6.1 |
| Runbooks for every pager-grade alert | ✅ | runbooks/staff/*.md listed in FAILURE_MODES.md |
| Tracing with OpenTelemetry; traces visible in Cloud Trace | ✅ | Sampled trace from staging |
| JSON-structured PII-clean logs | ✅ | log-pii-leak.spec.ts |
| Audit log includes all PII reads + AI views + manager overrides | ✅ | audit-coverage.spec.ts |
5. Performance & Resilience
| Item | Status | Verification |
|---|---|---|
| SLOs defined and approved | ✅ | OBSERVABILITY §1 |
| Load test at 2× peak (1000 punches/min, 50 concurrent shift gens) | ✅ | loadtest/staff-baseline.k6.js results |
| Cold-start p95 < 2 s after min_instances=2 set | ✅ | Cloud Run metrics |
| Graceful shutdown verified (10 s drain) | ✅ | graceful-shutdown.spec.ts |
| Circuit breakers on outbound calls (IAM, property, AI) | ✅ | circuit-breaker.spec.ts |
| Idempotency-Key store implemented + 24 h TTL | ✅ | idempotency.spec.ts |
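To make the "Idempotency-Key store + 24 h TTL" row concrete, the sketch below shows the general technique with an in-memory map and an injectable clock. The `IdempotencyStore` class and its method names are hypothetical and say nothing about how staff-service stores keys; only the 24 h TTL comes from the table above.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // 24 h, per the checklist row

interface Response {
  status: number;
  body: string;
}

interface StoredResponse extends Response {
  storedAt: number;
}

// Hypothetical in-memory store; a real one would be shared and persistent.
class IdempotencyStore {
  private readonly entries = new Map<string, StoredResponse>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Runs `operation` at most once per key within the TTL and replays the
  // stored response for repeats.
  execute(key: string, operation: () => Response): Response & { replayed: boolean } {
    const hit = this.entries.get(key);
    if (hit && this.now() - hit.storedAt < TTL_MS) {
      return { status: hit.status, body: hit.body, replayed: true };
    }
    const result = operation();
    this.entries.set(key, { ...result, storedAt: this.now() });
    return { ...result, replayed: false };
  }
}

// Usage with a fake clock so expiry can be shown without waiting.
let clock = 0;
let executions = 0;
const store = new IdempotencyStore(() => clock);
const punch = (): Response => {
  executions++;
  return { status: 201, body: "clock-entry-created" };
};

const first = store.execute("key-1", punch);  // executes
const retry = store.execute("key-1", punch);  // replayed from the store
clock += TTL_MS;                              // 24 h later the key has expired
const later = store.execute("key-1", punch);  // executes again
console.log(first.replayed, retry.replayed, later.replayed, executions); // false true false 2
```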
6. Sync & Offline (R for Electron release)
| Item | Status | Verification |
|---|---|---|
| Sync pull/push contracts implemented | ✅ | SYNC_CONTRACT §4 |
| Conflict policies validated for every replicated aggregate | ✅ | sync-conflict.spec.ts |
| Offline punch with PIN works at front desk | ✅ | E2E electron-offline-punch.spec.ts |
| Multi-device punch collision auto-out | ✅ | multi-device-punch.spec.ts |
| Late-replay punch flagging | ✅ | late-replay.spec.ts |
| Compliance gates (terminated staff blocked from sync) verified | ✅ | sync-compliance.spec.ts |
7. AI Integration (Recommended, not blocking)
| Item | Status | Verification |
|---|---|---|
| All AI surfaces are advisory (no auto-apply) | ✅ | AI_INTEGRATION §2 |
| Audit row written on every AI suggestion view + action | ✅ | audit-ai-coverage.spec.ts |
| Tenant opt-out toggle respected | ✅ | ai-opt-out.spec.ts |
| Edge anomaly model FPR baselined < 8 % | 🟡 | First model trained on 90 d data; baseline pending pilot |
| Bias audit completed quarterly | 🟡 | Scheduled Q1-27 |
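For the FPR baseline row above, the false-positive rate is FP / (FP + TN): the share of genuinely normal punches that the model flags. The sketch below assumes a hypothetical `ConfusionCounts` shape, and the counts in the usage example are made-up illustrative numbers, not pilot results.

```typescript
interface ConfusionCounts {
  truePositives: number;
  falsePositives: number;
  trueNegatives: number;
  falseNegatives: number;
}

const FPR_TARGET = 0.08; // "< 8 %" from the checklist row

// FPR = FP / (FP + TN)
function falsePositiveRate(c: ConfusionCounts): number {
  const negatives = c.falsePositives + c.trueNegatives;
  if (negatives === 0) {
    throw new RangeError("FPR is undefined without any actual negatives");
  }
  return c.falsePositives / negatives;
}

function meetsFprTarget(c: ConfusionCounts): boolean {
  return falsePositiveRate(c) < FPR_TARGET;
}

// Illustrative counts only.
const illustrative: ConfusionCounts = {
  truePositives: 18,
  falsePositives: 30,
  trueNegatives: 470,
  falseNegatives: 2,
};
console.log(falsePositiveRate(illustrative)); // 0.06
console.log(meetsFprTarget(illustrative));    // true
```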
8. Compliance (R for tenants under data-residency mandates)
| Item | Status | Verification |
|---|---|---|
| GDPR DSAR (export + erase) under 30 days | ✅ | dsar-export-benchmark.spec.ts |
| Data residency: me tenants stay in me-central1 | ✅ | Per-tenant Cloud SQL + KMS keyring |
| Right-to-erasure with audit retention exception documented | ✅ | SECURITY_MODEL §11 |
| Audit retention 7 y in BigQuery | ✅ | Datastream + table TTL |
| Manager-override punch policy reviewed by Legal | ✅ | Approval ticket LEG-441 |
| Multilingual UI for staff (4 languages) | ✅ | i18n.spec.ts |
9. Operational
| Item | Status | Verification |
|---|---|---|
| On-call rotation defined (PagerDuty schedule staff-svc-oncall) | ✅ | PagerDuty |
| Runbook drills (3 of 16 failure modes drilled in staging) | 🟡 | Schedule completion of remaining drills |
| Postmortem template and storage path agreed | ✅ | runbooks/staff/postmortems/ |
| Deploy pipeline 10 % → 50 % → 100 % traffic shift validated | ✅ | Cloud Deploy delivery pipeline |
| Rollback drill executed | ✅ | Staging drill 2026-04-08 |
| Migration forward-only documented | ✅ | DEPLOYMENT_TOPOLOGY §8 |
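The 10 % → 50 % → 100 % traffic shift in the table above follows a simple control flow: advance one stage at a time and roll back as soon as a stage is unhealthy. The sketch below models only that logic. It does not call Cloud Deploy, and the `healthy` probe is a hypothetical stand-in for whatever checks the real pipeline runs between stages.

```typescript
const STAGES = [10, 50, 100] as const;

interface RolloutResult {
  finalPercent: number; // 0 after a rollback
  rolledBack: boolean;
  visited: number[];    // stages that received traffic, in order
}

// `healthy` is a hypothetical probe evaluated after each traffic shift.
function runRollout(healthy: (percent: number) => boolean): RolloutResult {
  const visited: number[] = [];
  for (const percent of STAGES) {
    visited.push(percent);
    if (!healthy(percent)) {
      // Stop advancing and send all traffic back to the previous revision.
      return { finalPercent: 0, rolledBack: true, visited };
    }
  }
  return { finalPercent: 100, rolledBack: false, visited };
}

const clean = runRollout(() => true);
const failsAtHalf = runRollout((percent) => percent < 50);
console.log(clean.finalPercent, clean.visited);                // 100 [10, 50, 100]
console.log(failsAtHalf.rolledBack, failsAtHalf.visited);      // true [10, 50]
```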
10. Documentation
| Item | Status | Verification |
|---|---|---|
| Catalog entry docs/03-microservices/staff-service.md | ✅ | Indexed in SERVICE_INDEX.md |
| All 17 service-level docs cross-linked | ✅ | This bundle |
| Onboarding doc covers < 5 min "hello world" | ✅ | LOCAL_DEV_SETUP.md |
| Risk register reviewed quarterly | ✅ | SERVICE_RISK_REGISTER.md |
| Migration plan from MVP → M2 documented | ✅ | MIGRATION_PLAN.md |
11. Sign-off
| Role | Name placeholder | Date | Decision |
|---|---|---|---|
| Service tech lead | TBD | | TBD |
| Platform architect | TBD | | TBD |
| Security architect | TBD | | TBD |
| SRE on-call lead | TBD | | TBD |
| Product owner | TBD | | TBD |
The service is GA when all required rows above are ✅ and all five sign-offs are recorded with a date stamp. Open 🟡 / ❌ items must move to ✅, be re-scoped, or be explicitly waived in the risk register.
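The sign-off half of the GA rule can be sketched as follows. The five roles come from the table above; the `SignOff` shape, the "go" / "no-go" decision values, and the function names are assumptions for illustration. Whether the required rows are ready is passed in as a plain boolean.

```typescript
const REQUIRED_ROLES = [
  "Service tech lead",
  "Platform architect",
  "Security architect",
  "SRE on-call lead",
  "Product owner",
] as const;

type Role = (typeof REQUIRED_ROLES)[number];

// Hypothetical record shape; null means "not yet recorded".
interface SignOff {
  role: Role;
  name: string;
  date: string | null; // ISO date stamp
  decision: "go" | "no-go" | null;
}

// Roles that have not yet recorded a dated "go".
function missingSignOffs(signOffs: SignOff[]): Role[] {
  return REQUIRED_ROLES.filter(
    (role) =>
      !signOffs.some(
        (s) => s.role === role && s.date !== null && s.decision === "go",
      ),
  );
}

function isGa(requiredRowsReady: boolean, signOffs: SignOff[]): boolean {
  return requiredRowsReady && missingSignOffs(signOffs).length === 0;
}

// Current state of the table above: every role is still TBD.
const pending: SignOff[] = REQUIRED_ROLES.map((role) => ({
  role,
  name: "TBD",
  date: null,
  decision: null,
}));
console.log(isGa(true, pending));            // false
console.log(missingSignOffs(pending).length); // 5
```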
12. Quarterly Re-Certification
SERVICE_READINESS.md is re-walked every quarter. Any item flipping from ✅ → 🟡 / ❌ triggers an incident-style review and a remediation milestone.
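The regression check in the quarterly re-walk amounts to diffing two snapshots of the checklist. The sketch below assumes each snapshot is a hypothetical map from item name to status; the snapshot contents in the usage example are illustrative, not a record of any real quarter.

```typescript
type Status = "ready" | "in-progress" | "blocked" | "n/a";
type Snapshot = Record<string, Status>;

// Items that were ✅ last quarter and are 🟡 or ❌ now.
// Items missing from `current` are not flagged; handling removed rows is
// left out of this sketch.
function regressions(previous: Snapshot, current: Snapshot): string[] {
  return Object.keys(previous).filter(
    (item) =>
      previous[item] === "ready" &&
      (current[item] === "in-progress" || current[item] === "blocked"),
  );
}

// Illustrative snapshots only.
const lastQuarter: Snapshot = {
  "Rollback drill executed": "ready",
  "RBAC matrix implemented": "ready",
  "Backup restore drill executed": "in-progress",
};
const thisQuarter: Snapshot = {
  "Rollback drill executed": "in-progress",
  "RBAC matrix implemented": "ready",
  "Backup restore drill executed": "ready",
};

console.log(regressions(lastQuarter, thisQuarter)); // ["Rollback drill executed"]
```

Each returned item would then open the incident-style review and remediation milestone described above.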