Deployment Topology
:::info Source
Sourced from services/assignment-service/DEPLOYMENT_TOPOLOGY.md in the documentation repo.
:::
Companion: 01 Enterprise Architecture · 15 Observability
1. Deployment Unit
The service ships as a single OCI image with four roles selectable via APP_ROLE:
| Role | Process | Replicas (prod) | HPA trigger |
|---|---|---|---|
| api | Fastify HTTP API | 4–24 | CPU 60% / RPS |
| worker | Consumers + outbox publisher | 3–12 | Queue lag |
| scheduler | Materializer + sweepers + reminder planner | 2 (leader-elected) | — |
| ai-suggest-worker | AI call handler (bulkheaded) | 1–4 | AI queue length |
All roles share the same image and codebase; they differ only in the entry subcommand.
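As a minimal sketch of how role selection could be validated at startup: the role names come from the table above, while the `resolveRole` helper and its error message are hypothetical and not taken from the service's code.

```typescript
// The four roles from the table above.
const ROLES = ["api", "worker", "scheduler", "ai-suggest-worker"] as const;
type Role = (typeof ROLES)[number];

// Hypothetical helper: validate the APP_ROLE value and fail fast on a typo,
// so a misconfigured pod crashes at startup instead of running the wrong role.
function resolveRole(value: string | undefined): Role {
  for (const role of ROLES) {
    if (role === value) return role;
  }
  throw new Error(
    `APP_ROLE must be one of ${ROLES.join(", ")}; got "${String(value)}"`
  );
}
```

In a Node process this would typically be called once as `resolveRole(process.env.APP_ROLE)` before dispatching to the matching entry subcommand.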
2. Kubernetes Layout
namespace: assignment
│
├─ Deployment/assignment-api (4 replicas)
├─ Deployment/assignment-worker (3 replicas)
├─ StatefulSet/assignment-scheduler (2 replicas, leader-elected via lease)
├─ Deployment/assignment-ai-suggest (1 replica)
├─ Service/assignment-api (ClusterIP, exposed via ingress gateway)
├─ HPA/assignment-api
├─ HPA/assignment-worker
├─ PodDisruptionBudget/assignment-api (minAvailable=2)
├─ PodDisruptionBudget/assignment-worker (minAvailable=1)
├─ NetworkPolicy (see §6)
├─ ServiceMonitor (prometheus-operator → SigNoz)
├─ ConfigMap/assignment-config
└─ ExternalSecret/assignment-secrets (ESO → AWS Secrets Manager)
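To make the PodDisruptionBudget values concrete, the sketch below computes how many pods a voluntary disruption (for example a node drain) may take down at once. It assumes every replica is healthy and uses the baseline replica counts from the layout above; the function name is illustrative.

```typescript
// Voluntary disruptions a PodDisruptionBudget with minAvailable permits:
// healthy pods minus the required floor, never below zero.
function allowedDisruptions(healthyPods: number, minAvailable: number): number {
  return Math.max(0, healthyPods - minAvailable);
}

// Values from the layout above: api runs 4 replicas with minAvailable=2,
// worker runs 3 replicas with minAvailable=1.
const apiDrainBudget = allowedDisruptions(4, 2);    // 2
const workerDrainBudget = allowedDisruptions(3, 1); // 2
```

If the HPA has scaled a Deployment up, the budget grows with the healthy pod count; if pods are already unhealthy, it shrinks and can reach zero.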
3. Topology
                    ┌───────────────────────────┐
                    │     API Gateway / WAF     │
                    └─────────────┬─────────────┘
                                  │
                    ┌─────────────▼─────────────┐
                    │   assignment-api (Pods)   │
                    └─┬────────────┬────────────┘
                      │            │
            ┌─────────▼──┐    ┌────▼───────┐
            │ Postgres   │    │ Redis      │
            │ (Aurora)   │    │ (cluster)  │
            └─────────┬──┘    └────────────┘
                      │
                      ▼
               ┌──────────────┐
               │     NATS     │
               │  JetStream   │
               └──────┬───────┘
                      │
  ┌───────────────────┼───────────────────┐
  │                   │                   │
┌─▼──────────┐  ┌─────▼──────┐   ┌────────▼─────────┐
│ worker     │  │ scheduler  │   │ ai-suggest       │
└────────────┘  └────────────┘   └───────┬──────────┘
                                         │
                               ┌─────────▼─────────┐
                               │  ai-gateway-svc   │
                               └───────────────────┘
4. Environments
| Env | Region | Cluster | Tenant mode | Purpose |
|---|---|---|---|---|
| dev | local / us-west | 1x | any | loops & debug |
| ci | ephemeral | 1x | synthetic | CI verification |
| staging | us-east-1 | 1x primary | all tenants shadow + synthetic | soak, perf, pre-GA |
| prod-us | us-east-1 + us-west-2 | active/active | production | US tenants |
| prod-eu | eu-west-1 | active/passive | production | EU tenants (data residency) |
| prod-me | me-central-1 | active/passive | production | MENA tenants |
Traffic is pinned per tenant to its home region via tenant-service metadata.
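The pinning rule can be sketched as a lookup from a tenant's home region to its production environment. The region lists come from the table above; the `TenantMetadata` shape and the `resolveEnv` helper are assumptions, since the actual tenant-service contract is not described here.

```typescript
type ProdEnv = "prod-us" | "prod-eu" | "prod-me";

// Regions per production environment, from the table above.
const ENV_REGIONS: Record<ProdEnv, string[]> = {
  "prod-us": ["us-east-1", "us-west-2"],
  "prod-eu": ["eu-west-1"],
  "prod-me": ["me-central-1"],
};

// Hypothetical shape of the record returned by tenant-service.
interface TenantMetadata {
  tenantId: string;
  homeRegion: string;
}

// Map a tenant to the environment that owns its home region.
// Unknown regions are rejected rather than routed to a default,
// so a data-residency mistake surfaces as an error.
function resolveEnv(tenant: TenantMetadata): ProdEnv {
  const envs = Object.keys(ENV_REGIONS) as ProdEnv[];
  for (const env of envs) {
    if (ENV_REGIONS[env].indexOf(tenant.homeRegion) !== -1) return env;
  }
  throw new Error(
    `tenant ${tenant.tenantId} has unknown home region ${tenant.homeRegion}`
  );
}
```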
5. Scaling Profile
Baseline (p50 tenant):
| Role | CPU req | Mem req | Limits |
|---|---|---|---|
| api | 500 m | 512 Mi | 2 / 1 Gi |
| worker | 500 m | 512 Mi | 2 / 1 Gi |
| scheduler | 250 m | 256 Mi | 1 / 512 Mi |
| ai-suggest | 250 m | 256 Mi | 1 / 512 Mi |
HPA metrics:
- api: CPU 60%, custom `http_requests_in_flight`
- worker: custom `nats_consumer_pending` > 1000
- ai-suggest: custom `ai_suggest_queue` > 10
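For intuition on how these targets translate into replica counts, here is a sketch of the core Kubernetes HPA rule, `ceil(currentReplicas × currentMetric / targetMetric)`, clamped to each role's production range. It assumes the thresholds above are per-pod average targets, and it leaves out the controller's tolerance band and stabilization window.

```typescript
interface ReplicaBounds {
  min: number;
  max: number;
}

// Production replica ranges from the role table in section 1.
const BOUNDS: Record<string, ReplicaBounds> = {
  api: { min: 4, max: 24 },
  worker: { min: 3, max: 12 },
  "ai-suggest": { min: 1, max: 4 },
};

// Core HPA rule, clamped to the role's bounds.
// `metric` and `target` must be in the same unit (CPU %, pending messages, ...).
function desiredReplicas(
  role: string,
  current: number,
  metric: number,
  target: number
): number {
  const bounds = BOUNDS[role];
  if (bounds === undefined) throw new Error(`no HPA bounds for role ${role}`);
  const raw = Math.ceil((current * metric) / target);
  return Math.min(bounds.max, Math.max(bounds.min, raw));
}
```

For example, `desiredReplicas("worker", 3, 2500, 1000)` scales three workers to eight, while `desiredReplicas("ai-suggest", 2, 100, 10)` is capped at the role's maximum of four.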
6. Network Policy
- Inbound: only from the `gateway` namespace on 8080; from `prometheus` on 9464; from peer services via mesh.
- Outbound: Postgres, Redis, NATS, ai-gateway-service, notification-service, tenant-service, catalog-service only.
- No internet egress.
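The rules above amount to two allow-lists with a default deny. The sketch below models them as data for illustration, for instance in a policy unit test; the lowercase destination labels are hypothetical, and peer-service traffic admitted through the mesh is not modeled.

```typescript
interface IngressRule {
  from: string; // source namespace or component
  port: number;
}

// Inbound rules from the list above (mesh peers omitted).
const INGRESS_RULES: IngressRule[] = [
  { from: "gateway", port: 8080 },
  { from: "prometheus", port: 9464 },
];

// Outbound destinations from the list above; everything else is denied,
// which also covers the "no internet egress" rule.
const EGRESS_ALLOWLIST: string[] = [
  "postgres",
  "redis",
  "nats",
  "ai-gateway-service",
  "notification-service",
  "tenant-service",
  "catalog-service",
];

function isIngressAllowed(from: string, port: number): boolean {
  return INGRESS_RULES.some((rule) => rule.from === from && rule.port === port);
}

function isEgressAllowed(destination: string): boolean {
  return EGRESS_ALLOWLIST.indexOf(destination) !== -1;
}
```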
7. Secrets & Config
External Secrets Operator syncs from AWS Secrets Manager:
- `POSTGRES_URL`
- `REDIS_URL`
- `NATS_CREDS`
- `INTERNAL_SVC_JWT_SIGNING_KEY`
- `OTEL_EXPORTER_OTLP_ENDPOINT`
- `AI_GATEWAY_URL`, `AI_GATEWAY_TOKEN`
Config (ConfigMap):
- `RRULE_HORIZON_DAYS=90`
- `MATERIALIZER_BATCH_SIZE=1000`
- `OVERDUE_SWEEP_INTERVAL=5m`
- `CLOSED_MISSED_SWEEP_INTERVAL=15m`
- `REMINDER_BATCH_SIZE=500`
- `FEATURE_AI_SUGGEST=on|off`
Feature flags are served via a LaunchDarkly-compatible SDK (LaunchDarkly or OpenFeature + flagd).
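ConfigMap values arrive as strings, so interval and flag settings need parsing before use. The following sketch shows one way to do that; the helper names are hypothetical, and the `s` and `h` units are assumptions, since only minute values such as `5m` and `15m` appear in the config.

```typescript
// Milliseconds per unit. Only "m" appears in the config above;
// "s" and "h" are assumed for completeness.
const UNIT_MS: Record<string, number> = { s: 1000, m: 60000, h: 3600000 };

// Parse values such as OVERDUE_SWEEP_INTERVAL=5m into milliseconds.
function parseInterval(text: string): number {
  const match = /^(\d+)([smh])$/.exec(text);
  if (match === null) throw new Error(`invalid interval: ${text}`);
  return parseInt(match[1], 10) * UNIT_MS[match[2]];
}

// Parse values such as FEATURE_AI_SUGGEST=on into a boolean.
function parseFlag(text: string): boolean {
  if (text === "on") return true;
  if (text === "off") return false;
  throw new Error(`expected on|off, got ${text}`);
}
```

Rejecting malformed values at startup keeps a typo such as `5min` from silently disabling a sweeper.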
8. Release Pipeline
main branch push
↓
Build → Test → Image → Sign (cosign) → SBOM (cyclonedx)
↓
Deploy to staging (canary 10% → 100%)
↓
Soak 48h + synthetic checks
↓
Gated approval (compliance_admin of ops)
↓
Deploy to prod-us (canary 1% → 10% → 50% → 100%)
↓
(t+24h) prod-eu, prod-me same staged rollout
Rollback: kubectl rollout undo. This is safe because the DB schema is backward-compatible by policy, and the outbox and saga flows survive a rollback because consumers are version-tolerant.
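The staged percentages in the pipeline can be expressed as data, which makes the progression easy to test. This is a sketch only: the `CANARY_STEPS` table copies the percentages from the pipeline above, while the `nextCanaryStep` helper and the `staging`/`prod` keys are illustrative.

```typescript
// Traffic percentages per stage, from the pipeline above.
// prod-eu and prod-me follow the same steps as prod-us.
const CANARY_STEPS: Record<string, number[]> = {
  staging: [10, 100],
  prod: [1, 10, 50, 100],
};

// Returns the next traffic percentage, or null once the rollout is complete.
// Pass 0 as `current` before the first canary step.
function nextCanaryStep(env: string, current: number): number | null {
  const steps = CANARY_STEPS[env];
  if (steps === undefined) throw new Error(`unknown environment: ${env}`);
  const remaining = steps.filter((percent) => percent > current);
  return remaining.length > 0 ? remaining[0] : null;
}
```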
9. DR / BCP
- RPO: 5 min (PITR + NATS JetStream replication).
- RTO: 30 min (cross-region failover runbook).
- Quarterly DR drill: restore staging in alternate region from last backup; validate via synthetic tenant.
10. Tenant Sharding Strategy
Each region runs a single logical Postgres cluster; LIST partitioning of compliance_window by tenant gives isolation without multiple DBs. If a single tenant exceeds 100M windows, we promote it to its own dedicated cluster via the standby-promotion runbook.
11. Deployment Pre-checks
Automated gate runs before every prod deploy:
- All DB migrations reversible or forward-compat.
- Event schema compat check against registered consumers.
- No freeze-point violation (F25/F26 require RFC before change).
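The three checks above could be combined into a single gate that returns every failure rather than stopping at the first. The input shape below is hypothetical; how the real gate discovers migrations, schema compatibility, and freeze-point changes is not described in this document.

```typescript
interface Migration {
  id: string;
  reversible: boolean;
  forwardCompatible: boolean;
}

// Hypothetical summary of a release candidate, gathered by earlier CI steps.
interface PrecheckInput {
  migrations: Migration[];
  eventSchemaCompatible: boolean;
  changedFreezePoints: string[]; // e.g. ["F25"]
  rfcApproved: string[];         // freeze points covered by an approved RFC
}

// Returns a list of failures; an empty list means the deploy may proceed.
function runPrechecks(input: PrecheckInput): string[] {
  const failures: string[] = [];
  for (const migration of input.migrations) {
    if (!migration.reversible && !migration.forwardCompatible) {
      failures.push(
        `migration ${migration.id} is neither reversible nor forward-compatible`
      );
    }
  }
  if (!input.eventSchemaCompatible) {
    failures.push("event schema is incompatible with a registered consumer");
  }
  for (const point of input.changedFreezePoints) {
    if (input.rfcApproved.indexOf(point) === -1) {
      failures.push(`freeze point ${point} changed without an RFC`);
    }
  }
  return failures;
}
```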
12. Resource Budget
Projected p95 load (M5):
- 100 tenants × 5k windows/month active
- 500 rps API peak (combined)
- 5k events/s peak during materializer bursts
Measured fit: 8 api pods × 2 CPU, 6 worker pods × 2 CPU — well under cluster budget.
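The arithmetic behind these figures is worked through below. The inputs are the numbers stated above; the per-pod results additionally assume that load is spread evenly and that workers absorb all event traffic, which this document does not confirm.

```typescript
// Projected load, from the list above.
const load = {
  tenants: 100,
  windowsPerTenantPerMonth: 5000,
  apiPeakRps: 500,
  peakEventsPerSec: 5000,
};

// Measured fit, from the sentence above (2 CPU matches the api/worker limit).
const fit = { apiPods: 8, workerPods: 6, cpuPerPod: 2 };

// 100 tenants x 5k windows = 500,000 active windows per month.
const activeWindowsPerMonth = load.tenants * load.windowsPerTenantPerMonth;

// (8 + 6) pods x 2 CPU = 28 CPU at the limit.
const totalCpu = (fit.apiPods + fit.workerPods) * fit.cpuPerPod;

// 500 rps / 8 pods = 62.5 rps per api pod at peak.
const rpsPerApiPod = load.apiPeakRps / fit.apiPods;

// 5000 events/s / 6 pods, roughly 833 events/s per worker pod.
const eventsPerWorkerPod = load.peakEventsPerSec / fit.workerPods;
```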