DEPLOYMENT_TOPOLOGY — billing-service
All workloads run on Google Cloud Platform in the same region as the tenant data plane. The service ships as five runtime artifacts built from one repository: two Cloud Run services and three Cloud Run Jobs.
1. Workloads
| Workload | Runtime | Purpose | Min / Max replicas | Concurrency |
|---|---|---|---|---|
| `billing-api` | Cloud Run service | REST API + event consumers (in-process subscribers) | 3 / 50 | 80 |
| `billing-outbox-drainer` | Cloud Run service | Drains `_outbox` → Pub/Sub for both per-tenant and central schemas | 2 / 10 | 1 (single-flight per shard) |
| `billing-tenant-migrator` | Cloud Run Job | Per-tenant schema provisioning + forward DDL on `tenant.created.v1` and on release | invoked per tenant | n/a |
| `billing-subscription-cycle` | Cloud Run Job (Cloud Scheduler) | Monthly subscription billing cycle per tenant fanout | invoked monthly + retry | n/a |
| `billing-cash-analytics-job` | Cloud Run Job (Cloud Scheduler) | Nightly daily reconciliation + AI cash pattern detection | nightly | n/a |
2. Region & data residency
- Primary region: `asia-south1` (Mumbai), same as the Cloud SQL primary.
- DR region: `europe-west3` (Frankfurt), hosting the Cloud SQL cross-region replica + Cloud Run secondary deployments held in standby.
- Cross-region failover is operator-driven; RTO ≤ 15 min, RPO ≤ 1 min.
- Per-tenant data residency overrides (e.g., a Saudi tenant requesting `me-central1`) are reserved for v2; see SERVICE_RISK_REGISTER.
3. Networking
- All Cloud Run services share a single Serverless VPC Access connector.
- Cloud SQL reachable via Private IP only (no public IP).
- Pub/Sub is reached via Private Google Access.
- mTLS within the VPC for service-to-service calls (`payment-gateway-service`, `iam-service`, `notification-service`, `file-storage-service`, `ai-orchestrator-service`).
4. Containers
- Base image: distroless `gcr.io/distroless/nodejs22-debian12`.
- Multi-stage Dockerfile: install deps with `pnpm install --frozen-lockfile`, build with `tsc`, prune dev deps, ship `dist/`.
- Image size target: ≤ 200 MB.
- SBOM produced at build (`syft`); `grype` scan blocks on critical CVEs.
- Image signed with `cosign`; Cloud Run admission policy verifies signature.
5. Configuration
- Configuration is environment-driven via Workload Identity-mounted secrets:
  - `DB_HOST`, `DB_PORT`, `DB_USER`, `DB_PASSWORD` (via Secret Manager)
  - `PUBSUB_PROJECT`, `PUBSUB_OUTBOX_TOPICS` (CSV)
  - `IAM_JWKS_URL`, `IAM_STEPUP_AUDIENCE=billing-service`
  - `AI_ORCHESTRATOR_URL`, `AI_BUDGET_DEFAULT_QPS`
  - `FILE_STORAGE_BUCKET_PATTERN=billing-invoices-{tenantId}`
  - `OBSERVABILITY_OTLP_ENDPOINT`
- Tenant-specific config lives in `tenant-service` (`tenant.settings.billing.*`) and is fetched per request through the `TenantSettingsClient` with a 5-minute Memorystore cache.
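To make the environment-driven configuration above concrete, here is a minimal sketch of a typed loader that fails fast on missing values. The variable names come from the list above; the `BillingConfig` shape, the `loadConfig` and `invoiceBucketFor` helpers, and the choice to treat every variable as required are assumptions for illustration, not the service's actual code.

```typescript
// Sketch only: BillingConfig, loadConfig and invoiceBucketFor are hypothetical names.
// The environment is passed in explicitly so the loader stays easy to test.
type Env = Record<string, string | undefined>;

interface BillingConfig {
  db: { host: string; port: number; user: string; password: string };
  pubsub: { project: string; outboxTopics: string[] };
  iam: { jwksUrl: string; stepUpAudience: string };
  ai: { orchestratorUrl: string; budgetDefaultQps: number };
  fileStorageBucketPattern: string;
  otlpEndpoint: string;
}

// Returns the value or throws, so a misconfigured revision fails at startup.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

function requiredNumber(env: Env, key: string): number {
  const parsed = Number(required(env, key));
  if (!Number.isFinite(parsed)) {
    throw new Error(`Environment variable ${key} must be numeric`);
  }
  return parsed;
}

function loadConfig(env: Env): BillingConfig {
  return {
    db: {
      host: required(env, "DB_HOST"),
      port: requiredNumber(env, "DB_PORT"),
      user: required(env, "DB_USER"),
      password: required(env, "DB_PASSWORD"),
    },
    pubsub: {
      project: required(env, "PUBSUB_PROJECT"),
      // PUBSUB_OUTBOX_TOPICS is a CSV list; blanks are dropped.
      outboxTopics: required(env, "PUBSUB_OUTBOX_TOPICS")
        .split(",")
        .map((topic) => topic.trim())
        .filter((topic) => topic.length > 0),
    },
    iam: {
      jwksUrl: required(env, "IAM_JWKS_URL"),
      stepUpAudience: required(env, "IAM_STEPUP_AUDIENCE"),
    },
    ai: {
      orchestratorUrl: required(env, "AI_ORCHESTRATOR_URL"),
      budgetDefaultQps: requiredNumber(env, "AI_BUDGET_DEFAULT_QPS"),
    },
    fileStorageBucketPattern: required(env, "FILE_STORAGE_BUCKET_PATTERN"),
    otlpEndpoint: required(env, "OBSERVABILITY_OTLP_ENDPOINT"),
  };
}

// Expands the {tenantId} placeholder used by FILE_STORAGE_BUCKET_PATTERN.
function invoiceBucketFor(pattern: string, tenantId: string): string {
  return pattern.replace("{tenantId}", tenantId);
}
```

In a Node process the loader would typically be called once at startup as `loadConfig(process.env)`, with the result injected into the rest of the application.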
6. Resource sizing
| Workload | CPU | Memory | Notes |
|---|---|---|---|
| `billing-api` | 2 vCPU | 2 GiB | PDF render shares the request worker; cap at 2 concurrent renders per instance |
| `billing-outbox-drainer` | 1 vCPU | 512 MiB | one shard per per-tenant schema chunk + 1 shard for central |
| `billing-tenant-migrator` | 1 vCPU | 512 MiB | runs to completion |
| `billing-subscription-cycle` | 2 vCPU | 1 GiB | parallel fanout per region with bounded worker pool (16) |
| `billing-cash-analytics-job` | 2 vCPU | 1 GiB | nightly window |
| Cloud SQL Postgres 16 | 8 vCPU (starter) | 32 GiB (starter) | scale on IO; dedicated SSD; HA + cross-region replica + PITR 7d |
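The sizing notes mention a bounded worker pool (16) for the subscription-cycle fanout. One common way to bound concurrency in TypeScript is sketched below; `runBounded` is a hypothetical helper, and how the real job partitions tenants or handles retries is not specified in this document.

```typescript
// Sketch only: runBounded is a hypothetical helper, not the job's actual code.
// It processes `items` with at most `limit` workers in flight at any time and
// returns results in input order.
async function runBounded<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item (safe: JS is single-threaded)

  // Each lane repeatedly claims the next item until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.min(limit, items.length);
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) {
    lanes.push(lane());
  }
  // If any worker rejects, this rejects too; other lanes keep draining in the background.
  await Promise.all(lanes);
  return results;
}
```

With the pool size from the table, a call might look like `await runBounded(tenantIds, 16, billTenant)`, where `tenantIds` and `billTenant` are placeholders for whatever the job actually iterates over.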
7. Release & deployment
- Trunk-based on `main`; feature branches → PR → merge queue.
- Cloud Build pipeline: lint → typecheck → unit → application → integration → contract → build image → cosign sign → push → deploy to staging → smoke E2E → manual gate to prod.
- Prod deployment uses Cloud Run revision traffic split for canary: 5% → 25% → 100% with 10-min soak between steps; an SLO burn alert auto-rolls-back.
- Per-tenant migrator runs before rolling the API revision when the migration adds columns or constraints; backward-compatible migrations only (the standard "expand → cleanup" pattern across 2 releases).
- Cloud Scheduler jobs are paused during release and resumed once the `billing-api` revision reaches 100%.
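The canary progression described above (5% → 25% → 100%, 10-minute soak between steps, rollback on an SLO burn alert) can be modelled as a small state machine. The sketch below is illustrative only: `CanaryHooks` and `runCanary` are hypothetical names, the hooks stand in for whatever the pipeline uses to shift traffic and read alerts, and the alert is polled only after each soak, whereas a real alert could fire at any time.

```typescript
// Sketch only: all names here are hypothetical; no real deployment API is called.
interface CanaryHooks {
  setTrafficPercent(percent: number): Promise<void>; // share of traffic on the new revision
  soak(minutes: number): Promise<void>;              // wait while metrics accumulate
  isSloBurnAlertFiring(): Promise<boolean>;
}

type CanaryOutcome =
  | { status: "promoted" }
  | { status: "rolled_back"; atPercent: number };

const CANARY_STEPS: number[] = [5, 25, 100];
const SOAK_MINUTES = 10;

async function runCanary(hooks: CanaryHooks): Promise<CanaryOutcome> {
  for (let i = 0; i < CANARY_STEPS.length; i++) {
    const percent = CANARY_STEPS[i];
    await hooks.setTrafficPercent(percent);

    const isFinalStep = i === CANARY_STEPS.length - 1;
    if (isFinalStep) {
      break; // no soak after the final 100% step in this simplified model
    }

    // Soak between steps, then decide whether to continue or roll back.
    await hooks.soak(SOAK_MINUTES);
    if (await hooks.isSloBurnAlertFiring()) {
      await hooks.setTrafficPercent(0); // all traffic back to the previous revision
      return { status: "rolled_back", atPercent: percent };
    }
  }
  return { status: "promoted" };
}
```

Keeping the side effects behind an interface makes the rollout logic testable without touching a real environment.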
8. Pub/Sub topology
| Topic | Partitions | Subscribers | Retention | DLQ |
|---|---|---|---|---|
| `melmastoon.billing.folio` | tenant-attribute partitioning | analytics, reporting, audit, sync, bff-backoffice | 7 d | yes |
| `melmastoon.billing.invoice` | same | notification, analytics, reporting, audit | 7 d | yes |
| `melmastoon.billing.cash_drawer` | same | audit, reporting, bff-backoffice, notification | 7 d | yes |
| `melmastoon.billing.subscription` | tenant + global | tenant-service, notification, analytics, reporting, audit | 7 d | yes |
| `melmastoon.billing.usage` | tenant | analytics, reporting, billing | 3 d | yes |
| `<topic>.dlq` | n/a | on-call manual triage | 14 d | n/a |
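As a rough illustration of the table above, the sketch below builds a tenant-scoped outbound message and derives the matching DLQ topic name. It assumes that "tenant-attribute partitioning" means the tenant ID travels as a message attribute and doubles as an ordering key; that interpretation, the `eventType` attribute, and the `OutboundMessage`, `buildTenantMessage`, and `dlqTopicFor` names are all hypothetical. No Pub/Sub SDK is called, and the global variant of the subscription topic is not covered.

```typescript
// Sketch only: a local envelope type, not a real client library type.
interface OutboundMessage {
  topic: string;
  data: string;                        // JSON-serialised payload
  attributes: Record<string, string>;  // assumed to carry the tenant ID
  orderingKey: string;                 // assumed: per-tenant ordering
}

// Topic names are taken from the table above.
const BILLING_TOPICS = [
  "melmastoon.billing.folio",
  "melmastoon.billing.invoice",
  "melmastoon.billing.cash_drawer",
  "melmastoon.billing.subscription",
  "melmastoon.billing.usage",
] as const;

type BillingTopic = (typeof BILLING_TOPICS)[number];

// Follows the `<topic>.dlq` naming convention from the table.
function dlqTopicFor(topic: BillingTopic): string {
  return `${topic}.dlq`;
}

function buildTenantMessage(
  topic: BillingTopic,
  tenantId: string,
  eventType: string, // hypothetical attribute, added for illustration
  payload: unknown,
): OutboundMessage {
  if (tenantId.trim() === "") {
    throw new Error("tenantId is required for tenant-scoped messages");
  }
  return {
    topic,
    data: JSON.stringify(payload),
    attributes: { tenantId, eventType },
    orderingKey: tenantId,
  };
}
```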
9. Health checks
- `/healthz`: liveness; returns 200 if the process is up.
- `/readyz`: readiness; checks Cloud SQL connectivity, Pub/Sub publisher init, and Memorystore reachability; returns 503 on a degraded dependency to drain traffic.
- `/version`: returns git SHA + build time.
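The readiness contract above (200 when every dependency is reachable, 503 otherwise) might be implemented roughly as follows. This is a sketch under assumptions: `evaluateReadiness` and the shape of the checks are hypothetical, the per-check timeout is an added safeguard not mentioned in this document, and the real probes for Cloud SQL, Pub/Sub, and Memorystore are left as injected functions.

```typescript
// Sketch only: names and the timeout default are hypothetical.
type DependencyCheck = () => Promise<void>; // resolves when healthy, rejects otherwise

interface ReadinessResult {
  status: 200 | 503;
  body: { ready: boolean; failed: string[] };
}

// Runs one check with a timeout so a hung dependency cannot stall the probe.
async function passes(check: DependencyCheck, timeoutMs: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error("check timed out")), timeoutMs);
  });
  try {
    await Promise.race([check(), timeout]);
    return true;
  } catch {
    return false;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Runs all checks in parallel and reports which ones failed.
async function evaluateReadiness(
  checks: Record<string, DependencyCheck>,
  timeoutMs = 1000,
): Promise<ReadinessResult> {
  const names = Object.keys(checks);
  const outcomes = await Promise.all(names.map((name) => passes(checks[name], timeoutMs)));
  const failed = names.filter((_name, index) => !outcomes[index]);
  return failed.length === 0
    ? { status: 200, body: { ready: true, failed: [] } }
    : { status: 503, body: { ready: false, failed } };
}
```

A `/readyz` handler would call `evaluateReadiness` with one entry per dependency and write `status` and `body` to the HTTP response.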
10. Auto-scaling
- `billing-api` scales on concurrency (target 60) and CPU > 60%.
- `billing-outbox-drainer` scales on an outbox-lag custom metric (target ≤ 10 s).
- Cloud Scheduler jobs scale internally via worker pool.
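For capacity planning, the concurrency target above can be turned into a back-of-the-envelope instance estimate. The function below is a deliberately simplified model (in-flight requests divided by target concurrency, clamped to the replica bounds from the workloads table). It is not Cloud Run's actual autoscaling algorithm, and it ignores the CPU signal; `estimateInstances` is a hypothetical helper.

```typescript
// Sketch only: a simplified planning model, not the platform's autoscaler.
function estimateInstances(
  inFlightRequests: number,
  targetConcurrency: number,
  minInstances: number,
  maxInstances: number,
): number {
  if (targetConcurrency <= 0) {
    throw new Error("targetConcurrency must be positive");
  }
  const needed = Math.ceil(inFlightRequests / targetConcurrency);
  // Clamp to the configured replica bounds.
  return Math.min(maxInstances, Math.max(minInstances, needed));
}

// billing-api figures from this document: target 60, min 3, max 50.
const idle = estimateInstances(0, 60, 3, 50);      // floor applies
const busy = estimateInstances(601, 60, 3, 50);    // ceil(601 / 60)
const surge = estimateInstances(6000, 60, 3, 50);  // ceiling applies
```

Under this model, sustained demand above 3,000 in-flight requests (50 × 60) would exceed what the configured maximum can absorb at the target concurrency.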
11. Cost guardrails
- Per-tenant Cloud SQL storage budget alert at 80% of plan-included quota.
- Pub/Sub publish cost dashboard split per topic.
- Cloud Run min-instance bill capped via FinOps review; emergency override via runbook.
12. Cross-references
- Networking & infra: platform infra repo (out of scope here).
- Cron / job catalog: docs/02 §9.
- Migrator detail: MIGRATION_PLAN.
- Security & secret rotation: SECURITY_MODEL §9.