Skip to main content

SERVICE_READINESS — payment-gateway-service

Sibling: DEPLOYMENT_TOPOLOGY · SECURITY_MODEL · TESTING_STRATEGY · SERVICE_RISK_REGISTER

Standard: DEFINITION_OF_DONE

This document is the production-readiness gate for payment-gateway-service. Every box must be checked and dated before the service is allowed into a production GCP project. Updates are tracked via PR.

1. Architecture & domain

  • DOMAIN_MODEL ratified by domain group (sign-off: TPM + SecOps).
  • APPLICATION_LOGIC use-case list fully implemented; each has unit + integration tests at thresholds.
  • All aggregates have explicit state machines and transition tests.
  • No use case crosses tenant boundaries; per-tenant role separation enforced in DB.

2. PCI-DSS SAQ A

  • Annual SAQ A signed by SecOps lead and on file.
  • Quarterly ASV scan green within last 90 days.
  • Annual penetration test executed; all high/critical findings remediated; medium findings tracked with deadlines.
  • PCI scanner CI gate green on main; zero PAN-shaped or forbidden field findings.
  • Redaction proxy verified in production by replay of synthetic PAN-shaped log line (must be stripped).
  • Vendor AOCs (Stripe, PayPal, HesabPay, FX provider) collected within last 12 months.

3. Security

  • Secrets in Secret Manager only; no secrets in env files in production manifests.
  • Workload Identity bound; no JSON service-account keys mounted.
  • CMEK in place for payment_methods token storage and webhook GCS bucket.
  • Cloud Armor + WAF active on the webhook subdomain with vendor IP allowlists.
  • mTLS enforced on inter-service calls.
  • All admin endpoints require dual sign-off and are logged to audit-log-service.

4. Reliability

  • SLOs defined (see OBSERVABILITY); error budget policy approved.
  • Circuit breakers and per-tenant adapter precedence verified via chaos test.
  • DR drill completed within last 90 days (region failover, DB restore).
  • Backup PITR verified by a test restore in staging.
  • Outbox + inbox patterns covered by integration tests (lost-event, duplicate, replay).
  • HPA min/max settings tuned and documented; load test sustains 100 RPS authorize at p99 < 1500 ms.

5. Vendor canaries

  • Stripe sandbox canary runs hourly; last 7 days green.
  • PayPal sandbox canary runs hourly; last 7 days green.
  • HesabPay sandbox canary runs hourly; last 7 days green.
  • Each canary tests authorize + capture + refund + webhook arrival.
  • On three consecutive canary failures, deploy promotion is blocked automatically.

6. Observability

  • All SLIs in OBSERVABILITY emit in production.
  • Cloud Monitoring workspace payment-gateway-service linked from runbook index.
  • Alert routes tested via synthetic firing (paged on-call, ack within 5 m).
  • Trace coverage > 95% on mutating paths; verified in staging for 7 days pre-promotion.
  • Audit-log-service receives every domain mutation; reconciled vs DB row count daily.

7. Documentation

  • All 18 service documents up to date and indexed in docs/03-microservices/payment-gateway-service.md.
  • Runbooks exist for every alert in OBSERVABILITY §5.
  • OpenAPI spec generated, committed, and linked from API_CONTRACTS §11.
  • Event schemas registered and validated in CI.

8. Operational

  • On-call rotation set up; primary, secondary, escalation path documented.
  • Tenant onboarding runbook tested end-to-end with a synthetic tenant.
  • Tenant offboarding runbook tested with a synthetic tenant; secrets destruction verified.
  • payments-admin-cli packaged and distributed to ops.

9. Sign-offs

RoleNameDate
Service owner / TPM
SecOps lead (PCI)
SRE on-call rep
Domain lead — Payments
Tenant ops lead

The service is not in production scope until all of the above are complete. A weekly readiness review continues until launch; thereafter quarterly.