Skip to main content

Deployment Topology

:::info Source Sourced from services/authoring-service/12-DEPLOYMENT_TOPOLOGY.md in the documentation repo. :::

1. Containers

ContainerPurpose
authoring-apiREST API for drafts, modules, lessons, blocks
authoring-collabWebSocket Yjs awareness server (M5)
authoring-workerOutbox relay, publish-saga coordinator, SCORM import worker
authoring-ai-workerAI co-author request fan-out (via AIClient)

2. Scaling Rules

ContainerMinMaxHPA Trigger
authoring-api330CPU > 60% or active-editors > 100/pod
authoring-collab220active WS connections > 500/pod
authoring-worker210outbox backlog > 5000
authoring-ai-worker215AI request queue > 200
  • Collab server sticky-session by draftId hash (consistent hashing).
  • API stateless; collab stateful via Redis-backed Yjs awareness coordination.

3. Resource Requirements

WorkloadCPU req/limitMem req/limit
authoring-api500m / 2000m512Mi / 1.5Gi
authoring-collab500m / 2000m1Gi / 4Gi (Yjs docs in memory)
authoring-worker200m / 1000m256Mi / 1Gi
authoring-ai-worker300m / 1500m512Mi / 2Gi

4. Caching & Storage

LayerContentsTTL
Redis (per region)Draft locks, Yjs awareness state, AI result cachesliding
Postgres pgbouncerTransaction-mode pool
S3/R2Draft snapshots (for forensics), SCORM import staging90 days
CDNBlock previews, thumbnails5 min

5. CDN Usage

  • Preview URLs for draft blocks (read-only, signed, TTL 10 min).
  • SCORM import package staging (owner-only signed URL).

6. Regional Deployment

  • Per-region: US, EU, ME, AP.
  • Drafts live in tenant's homeRegion.
  • AI calls route through regional ai-gateway → fall back to local model on budget exhaustion.

7. Service Mesh

  • mTLS within cluster.
  • Egress to media-service, content-service, ai-gateway, catalog-service.
  • WebSocket connections through regional ingress (websocket-aware LB).

8. Release Strategy

  • Blue/green for API.
  • Collab server: drain sessions (notify clients 5 min before) → recreate.
  • Workers: rolling; outbox is idempotent.

9. Topology Diagram

Browser ─ HTTPS ─▶ CDN ─▶ API GW ─▶ authoring-api (3–30)
│ │
│ ├─▶ Postgres (drafts)
│ ├─▶ Redis (locks, AI cache)
│ ├─▶ ai-gateway-service
│ ├─▶ media-service
│ └─▶ NATS (outbox → publish saga)

└─ WSS ──▶ authoring-collab (2–20, sticky) ─▶ Redis (Yjs awareness)

Publish saga steps are coordinated by `authoring-worker` consuming
`content.play_package.built.v1` + `catalog.course_version.published.v1`.

10. DR

  • RPO 5 min, RTO 60 min.
  • Draft snapshots hourly to cold storage.
  • Live collab sessions rebuildable from event log + last snapshot.