Deployment Topology
:::info Source
Sourced from services/authoring-service/12-DEPLOYMENT_TOPOLOGY.md in the documentation repo.
:::
1. Containers
| Container | Purpose |
|---|---|
| authoring-api | REST API for drafts, modules, lessons, blocks |
| authoring-collab | WebSocket Yjs awareness server (M5) |
| authoring-worker | Outbox relay, publish-saga coordinator, SCORM import worker |
| authoring-ai-worker | AI co-author request fan-out (via AIClient) |
2. Scaling Rules
| Container | Min | Max | HPA Trigger |
|---|---|---|---|
| authoring-api | 3 | 30 | CPU > 60% or active-editors > 100/pod |
| authoring-collab | 2 | 20 | active WS connections > 500/pod |
| authoring-worker | 2 | 10 | outbox backlog > 5000 |
| authoring-ai-worker | 2 | 15 | AI request queue > 200 |
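
One way to read the scaling table is through the standard Kubernetes HPA calculation, `ceil(currentReplicas * observed / target)`, taking the largest proposal across metrics and clamping it to the min/max bounds. The sketch below is illustrative only: `desired_replicas` and `BOUNDS` are hypothetical names, and treating each trigger threshold as an HPA target value is an assumption rather than something the source states.

```python
import math

# Min/max replica bounds copied from the scaling table.
BOUNDS = {
    "authoring-api": (3, 30),
    "authoring-collab": (2, 20),
    "authoring-worker": (2, 10),
    "authoring-ai-worker": (2, 15),
}

def desired_replicas(container, current, metrics):
    """metrics: list of (observed, target) pairs, one per HPA metric."""
    low, high = BOUNDS[container]
    # Each metric proposes a replica count; the largest proposal wins.
    proposal = max(math.ceil(current * observed / target)
                   for observed, target in metrics)
    # Clamp to the configured bounds.
    return max(low, min(high, proposal))

# 4 API pods at 90% CPU (target 60%) and 80 active editors/pod (target 100).
api_replicas = desired_replicas("authoring-api", 4, [(90, 60), (80, 100)])
```

Here the CPU metric proposes 6 pods and the active-editors metric proposes 4, so the CPU proposal drives the scale-up.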
- Collab server: sticky sessions by `draftId` hash (consistent hashing).
- API is stateless; collab is stateful via Redis-backed Yjs awareness coordination.
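
The sticky-session rule can be illustrated with a small consistent-hash ring. This is a minimal sketch, not the service's routing code: the `HashRing` class, the pod names, the SHA-256 hash, and the virtual-node count are all assumptions made for the example.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # First 8 bytes of SHA-256 as an integer position on the ring.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, pods, vnodes=64):
        # Each pod owns several virtual points to even out the distribution.
        self._ring = sorted((_hash(f"{pod}#{i}"), pod)
                            for pod in pods for i in range(vnodes))
        self._points = [point for point, _ in self._ring]

    def pod_for(self, draft_id: str) -> str:
        # Walk clockwise to the first point at or after the key, wrapping around.
        idx = bisect.bisect(self._points, _hash(draft_id)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["collab-0", "collab-1"])
scaled = HashRing(["collab-0", "collab-1", "collab-2"])
drafts = [f"draft-{n}" for n in range(1000)]
moved = [d for d in drafts if ring.pod_for(d) != scaled.pod_for(d)]
```

Every editor of the same draft lands on the same pod, and when a pod is added, only the drafts claimed by the new pod move; the rest keep their sessions.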
3. Resource Requirements
| Workload | CPU req/limit | Mem req/limit |
|---|---|---|
| authoring-api | 500m / 2000m | 512Mi / 1.5Gi |
| authoring-collab | 500m / 2000m | 1Gi / 4Gi (Yjs docs in memory) |
| authoring-worker | 200m / 1000m | 256Mi / 1Gi |
| authoring-ai-worker | 300m / 1500m | 512Mi / 2Gi |
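
For capacity planning, the resource requests can be combined with the replica bounds from the scaling rules to estimate the aggregate request footprint at minimum and maximum scale. The sketch below is simple arithmetic over the documented values; the names `REQUESTS`, `REPLICAS`, and `total_requests` are hypothetical.

```python
# (CPU request in millicores, memory request in MiB) per pod.
REQUESTS = {
    "authoring-api": (500, 512),
    "authoring-collab": (500, 1024),
    "authoring-worker": (200, 256),
    "authoring-ai-worker": (300, 512),
}

# (min replicas, max replicas) per workload.
REPLICAS = {
    "authoring-api": (3, 30),
    "authoring-collab": (2, 20),
    "authoring-worker": (2, 10),
    "authoring-ai-worker": (2, 15),
}

def total_requests(bound):
    """bound: 0 for minimum replicas, 1 for maximum replicas."""
    cpu = sum(REQUESTS[w][0] * REPLICAS[w][bound] for w in REQUESTS)
    mem = sum(REQUESTS[w][1] * REPLICAS[w][bound] for w in REQUESTS)
    return cpu, mem

min_cpu_m, min_mem_mi = total_requests(0)  # 3500m CPU, 5120 MiB (5 GiB)
max_cpu_m, max_mem_mi = total_requests(1)  # 31500m CPU, 46080 MiB (45 GiB)
```

These figures cover requests only; limits, sidecars, and node overhead are outside the sketch.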
4. Caching & Storage
| Layer | Contents | TTL |
|---|---|---|
| Redis (per region) | Draft locks, Yjs awareness state, AI result cache | sliding |
| Postgres pgbouncer | Transaction-mode pool | — |
| S3/R2 | Draft snapshots (for forensics), SCORM import staging | 90 days |
| CDN | Block previews, thumbnails | 5 min |
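
A "sliding" TTL usually means each access pushes the expiry forward, so entries in active use never expire. The sketch below models that behavior in memory with an injectable clock. It stands in for Redis only conceptually: `SlidingCache` and the 30-second TTL are hypothetical, since the source gives no concrete TTL value.

```python
class SlidingCache:
    def __init__(self, ttl_seconds, clock):
        self._ttl = ttl_seconds
        self._clock = clock          # callable returning the current time in seconds
        self._items = {}             # key -> (value, expires_at)

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        # Sliding behavior: a hit renews the expiry window.
        self._items[key] = (value, self._clock() + self._ttl)
        return value

now = [0]
cache = SlidingCache(ttl_seconds=30, clock=lambda: now[0])
cache.set("lock:draft-1", "editor-a")
now[0] = 20
first = cache.get("lock:draft-1")    # hit; expiry slides to t=50
now[0] = 45
second = cache.get("lock:draft-1")   # still a hit; expiry slides to t=75
now[0] = 80
third = cache.get("lock:draft-1")    # idle past the window, so a miss
```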
5. CDN Usage
- Preview URLs for draft blocks (read-only, signed, TTL 10 min).
- SCORM import package staging (owner-only signed URL).
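
A signed preview URL with a 10-minute lifetime could look roughly like the following. This is a sketch under assumptions: the HMAC-SHA256 scheme, the `expires` and `sig` query parameters, the host name, and both function names are hypothetical, and a real CDN (or S3/R2 presigning) defines its own signature format.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

PREVIEW_TTL_SECONDS = 600  # 10 min, per the preview URL rule

def _signature(path: str, expires: int, secret: bytes) -> str:
    return hmac.new(secret, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()

def sign_preview_url(base: str, path: str, secret: bytes, now: int) -> str:
    expires = now + PREVIEW_TTL_SECONDS
    query = urlencode({"expires": expires, "sig": _signature(path, expires, secret)})
    return f"{base}{path}?{query}"

def verify_preview_url(url: str, secret: bytes, now: int) -> bool:
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires, secret)
    # Constant-time comparison avoids leaking signature prefixes.
    return now < expires and hmac.compare_digest(sig, expected)

secret = b"example-secret"
url = sign_preview_url("https://cdn.example.test",
                       "/previews/draft-1/block-7.png", secret, now=1000)
```

Because the path is part of the signed message, a URL issued for one block cannot be reused for another.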
6. Regional Deployment
- Per-region: US, EU, ME, AP.
- Drafts live in tenant's `homeRegion`.
- AI calls route through regional ai-gateway → fall back to local model on budget exhaustion.
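
The routing rules above can be sketched as a region lookup plus a fallback wrapper. Everything named here is hypothetical: `Tenant`, `BudgetExhausted`, and the two callables stand in for whatever the real ai-gateway client and local model expose.

```python
from dataclasses import dataclass

REGIONS = {"US", "EU", "ME", "AP"}

class BudgetExhausted(Exception):
    """Assumed signal from the gateway that the AI budget is spent."""

@dataclass
class Tenant:
    id: str
    home_region: str

def region_for_drafts(tenant: Tenant) -> str:
    # Drafts are pinned to the tenant's home region.
    if tenant.home_region not in REGIONS:
        raise ValueError(f"unknown region: {tenant.home_region}")
    return tenant.home_region

def co_author(prompt: str, tenant: Tenant, gateway_call, local_call) -> str:
    region = region_for_drafts(tenant)
    try:
        # Preferred path: the ai-gateway in the tenant's region.
        return gateway_call(region, prompt)
    except BudgetExhausted:
        # Fallback path: a local model, used only when the budget runs out.
        return local_call(prompt)

def _gateway_ok(region, prompt):
    return f"gateway[{region}]: {prompt}"

def _gateway_broke(region, prompt):
    raise BudgetExhausted()

def _local(prompt):
    return f"local: {prompt}"

tenant = Tenant(id="t-1", home_region="EU")
normal = co_author("outline", tenant, _gateway_ok, _local)
fallback = co_author("outline", tenant, _gateway_broke, _local)
```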
7. Service Mesh
- mTLS within cluster.
- Egress to media-service, content-service, ai-gateway, catalog-service.
- WebSocket connections through regional ingress (websocket-aware LB).
8. Release Strategy
- Blue/green for API.
- Collab server: drain sessions (notify clients 5 min before) → recreate.
- Workers: rolling; outbox is idempotent.
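
Rolling worker restarts are safe only if relaying the same outbox row twice has no extra effect. One common way to get that property is de-duplication by event ID, sketched below. The `OutboxRelay` class and its in-memory set are assumptions for illustration; a real relay would track relayed IDs durably.

```python
class OutboxRelay:
    def __init__(self, publish):
        self._publish = publish   # callable that sends the payload to the bus
        self._relayed = set()     # IDs already published (in-memory for the sketch)

    def relay(self, event_id: str, payload: dict) -> bool:
        """Publish the event once; return False if it was already relayed."""
        if event_id in self._relayed:
            return False
        self._publish(payload)
        self._relayed.add(event_id)
        return True

sent = []
relay = OutboxRelay(publish=sent.append)
first = relay.relay("evt-1", {"type": "draft.saved"})
# A restarted worker re-reads the same row and tries again.
second = relay.relay("evt-1", {"type": "draft.saved"})
```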
9. Topology Diagram
    Browser ─ HTTPS ─▶ CDN ─▶ API GW ─▶ authoring-api (3–30)
       │                                  │
       │                                  ├─▶ Postgres (drafts)
       │                                  ├─▶ Redis (locks, AI cache)
       │                                  ├─▶ ai-gateway-service
       │                                  ├─▶ media-service
       │                                  └─▶ NATS (outbox → publish saga)
       │
       └─ WSS ──▶ authoring-collab (2–20, sticky) ─▶ Redis (Yjs awareness)
Publish saga steps are coordinated by `authoring-worker` consuming
`content.play_package.built.v1` + `catalog.course_version.published.v1`.
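
A coordinator that waits for both events before completing a publish might be sketched as follows. The two subject names come from the text above; the `PublishSaga` class, keying by draft ID, and the assumption that the events may arrive in either order are hypothetical.

```python
REQUIRED_SUBJECTS = frozenset({
    "content.play_package.built.v1",
    "catalog.course_version.published.v1",
})

class PublishSaga:
    def __init__(self):
        self._seen = {}  # draft_id -> set of subjects received so far

    def on_event(self, draft_id: str, subject: str) -> bool:
        """Record an event; return True once every required subject has arrived."""
        if subject not in REQUIRED_SUBJECTS:
            return False
        seen = self._seen.setdefault(draft_id, set())
        seen.add(subject)  # set semantics make redelivery harmless
        return seen == REQUIRED_SUBJECTS

saga = PublishSaga()
after_first = saga.on_event("draft-1", "content.play_package.built.v1")
after_dup = saga.on_event("draft-1", "content.play_package.built.v1")
after_both = saga.on_event("draft-1", "catalog.course_version.published.v1")
```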
10. DR
- RPO 5 min, RTO 60 min.
- Draft snapshots hourly to cold storage.
- Live collab sessions rebuildable from event log + last snapshot.
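
Rebuilding a session from the last snapshot plus the event log amounts to loading the snapshot and replaying every later event in order. The sketch below shows the idea with plain dictionaries. The snapshot and event shapes (`seq`, `op`, `block`, `text`) are hypothetical, and a real Yjs document would replay binary updates rather than these records.

```python
def rebuild(snapshot: dict, events: list) -> dict:
    # Start from the block contents captured in the snapshot.
    state = dict(snapshot["blocks"])
    for event in sorted(events, key=lambda e: e["seq"]):
        if event["seq"] <= snapshot["seq"]:
            continue  # already reflected in the snapshot
        if event["op"] == "set":
            state[event["block"]] = event["text"]
        elif event["op"] == "delete":
            state.pop(event["block"], None)
    return state

snapshot = {"seq": 10, "blocks": {"b1": "Intro", "b2": "Old text"}}
log = [
    {"seq": 12, "op": "delete", "block": "b1"},
    {"seq": 9, "op": "set", "block": "b2", "text": "Stale"},
    {"seq": 11, "op": "set", "block": "b2", "text": "New text"},
    {"seq": 13, "op": "set", "block": "b3", "text": "Summary"},
]
restored = rebuild(snapshot, log)
```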