DLR Processor — Service Readiness
Status: populated Owner: Platform Engineering Last updated: 2026-04-18
1. Definition of Ready (Before Sprint Start)
- Domain model documented and reviewed
- Event schemas published to schema registry
- Data model migrations written and reviewed
- Downstream contracts agreed with billing-service and webhook-dispatcher teams
- NATS stream
SMS_DLRprovisioned in all environments - Test data seeded in dev environment
2. Definition of Done (Before Merge)
- All unit and integration tests pass
- Coverage ≥ 80%
- Pact contracts verified
-
npm auditpasses (no CRITICAL/HIGH) - Trivy image scan clean
- PR reviewed by ≥ 1 engineer
- Flyway migrations tested on clean DB
- Feature flag added if partially complete
3. Production Readiness Checklist
Code Quality
- No
console.logor debug statements - No hardcoded secrets or configuration
- Error handling comprehensive at all layers
- Graceful shutdown implemented (SIGTERM → drain NATS consumer → close PG pool)
Observability
- All key metrics instrumented
- Structured log events for all processing paths
- OTLP trace spans cover full pipeline
- Grafana dashboard reviewed and approved
Operations
- Runbooks written for all FM-DLR-* failure modes
- Alert rules reviewed with SRE team
- On-call rotation informed of new service
- Deployment documented and tested in staging
Security
- Security review completed
- All secrets in Vault; none in source code or environment at build time
- NetworkPolicy applied and verified
-
restrictedPod Security Admission profile applied
4. Launch Phases
| Phase | Criteria | Rollout |
|---|---|---|
| Alpha | Internal test traffic only | 1 replica, staging |
| Beta | Canary 5% of production DLR traffic | 2 replicas, production |
| GA | Full production traffic | 3 replicas + HPA |