SERVICE_READINESS — payment-gateway-service
Sibling: DEPLOYMENT_TOPOLOGY · SECURITY_MODEL · TESTING_STRATEGY · SERVICE_RISK_REGISTER
Standard: DEFINITION_OF_DONE
This document is the production-readiness gate for payment-gateway-service. Every box must be checked and dated before the service is allowed into a production GCP project. Updates are tracked via PR.
1. Architecture & domain
- DOMAIN_MODEL ratified by domain group (sign-off: TPM + SecOps).
- APPLICATION_LOGIC use-case list fully implemented; each has unit + integration tests at thresholds.
- All aggregates have explicit state machines and transition tests.
- No use case crosses tenant boundaries; per-tenant role separation enforced in DB.
2. PCI-DSS SAQ A
- Annual SAQ A signed by SecOps lead and on file.
- Quarterly ASV scan green within last 90 days.
- Annual penetration test executed; all
high/criticalfindings remediated; medium findings tracked with deadlines. - PCI scanner CI gate green on
main; zero PAN-shaped or forbidden field findings. - Redaction proxy verified in production by replay of synthetic PAN-shaped log line (must be stripped).
- Vendor AOCs (Stripe, PayPal, HesabPay, FX provider) collected within last 12 months.
3. Security
- Secrets in Secret Manager only; no secrets in env files in production manifests.
- Workload Identity bound; no JSON service-account keys mounted.
- CMEK in place for
payment_methodstoken storage and webhook GCS bucket. - Cloud Armor + WAF active on the webhook subdomain with vendor IP allowlists.
- mTLS enforced on inter-service calls.
- All admin endpoints require dual sign-off and are logged to
audit-log-service.
4. Reliability
- SLOs defined (see OBSERVABILITY); error budget policy approved.
- Circuit breakers and per-tenant adapter precedence verified via chaos test.
- DR drill completed within last 90 days (region failover, DB restore).
- Backup PITR verified by a test restore in
staging. - Outbox + inbox patterns covered by integration tests (lost-event, duplicate, replay).
- HPA min/max settings tuned and documented; load test sustains 100 RPS authorize at p99 < 1500 ms.
5. Vendor canaries
- Stripe sandbox canary runs hourly; last 7 days green.
- PayPal sandbox canary runs hourly; last 7 days green.
- HesabPay sandbox canary runs hourly; last 7 days green.
- Each canary tests authorize + capture + refund + webhook arrival.
- On three consecutive canary failures, deploy promotion is blocked automatically.
6. Observability
- All SLIs in OBSERVABILITY emit in production.
- Cloud Monitoring workspace
payment-gateway-servicelinked from runbook index. - Alert routes tested via synthetic firing (paged on-call, ack within 5 m).
- Trace coverage > 95% on mutating paths; verified in
stagingfor 7 days pre-promotion. - Audit-log-service receives every domain mutation; reconciled vs DB row count daily.
7. Documentation
- All 18 service documents up to date and indexed in
docs/03-microservices/payment-gateway-service.md. - Runbooks exist for every alert in OBSERVABILITY §5.
- OpenAPI spec generated, committed, and linked from API_CONTRACTS §11.
- Event schemas registered and validated in CI.
8. Operational
- On-call rotation set up; primary, secondary, escalation path documented.
- Tenant onboarding runbook tested end-to-end with a synthetic tenant.
- Tenant offboarding runbook tested with a synthetic tenant; secrets destruction verified.
-
payments-admin-clipackaged and distributed to ops.
9. Sign-offs
| Role | Name | Date |
|---|---|---|
| Service owner / TPM | ||
| SecOps lead (PCI) | ||
| SRE on-call rep | ||
| Domain lead — Payments | ||
| Tenant ops lead |
The service is not in production scope until all of the above are complete. A weekly readiness review continues until launch; thereafter quarterly.