Slice Risk Register

:::info Source
Sourced from docs/roadmap/slice-risk-register.md in the documentation repo.
:::

Execution-layer companion to ROADMAP.md and 14 Risks & Trade-offs.

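Risks are framed per slice (S0–S6) so that each risk discussion is scheduled alongside the work that might trigger it. Each row gives: severity (S1 critical · S2 high · S3 medium · S4 low), impact, mitigation, owner, dependencies. Severity codes in the Sev column are distinct from slice IDs, although both use an S prefix.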

S0 — Platform Foundation

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S0-R1	Tenant isolation regression	S1	Cross-tenant data leak; contract terminations	Two-tenant CI suite; mandatory code review on any RLS policy change; pen-test; RLS bypass tests	Platform + Security	Postgres RLS framework; JWT + RequestContext
S0-R2	Event envelope drift	S2	Services fall out of sync; Pact breakage; refactor avalanche	Envelope frozen + schema registry CI gate; ADR for any change	Platform	Schema registry
S0-R3	AI gateway port contract churn	S2	19 services refactor	`AIClient` port frozen; version rule additive-only; adapter abstraction	AI Services	AI adapter tests
S0-R4	Sync protocol churn	S2	Every client rebuilds; offline bundles invalidated	`/sync/v1/` frozen; additive-only	Sync + Platform	Sync protocol ADR
S0-R5	KMS mis-configuration	S1	Loss of data confidentiality; DR failure	KMS key hierarchy + rotation design reviewed by Security; DR drill	Security + Platform	KMS vendor selection
S0-R6	OpenTelemetry overhead	S3	Unexpected latency or cost	Sampling; async exporters; dashboards for OTel health	SRE	—
S0-R7	Two-tenant test suite gaps	S1	Silent leaks slip past CI	Matrix-test every endpoint; property-based tests	Platform + Security	CI infra
S0-R8	Over-building the foundation	S2	Time-to-M1 slips	Strict M0 scope doc; weekly backlog review	PM + Platform lead	—
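
The additive-only rule behind S0-R2, S0-R3, and S0-R4 lends itself to an automated check. The sketch below shows one way a CI gate could flag non-additive changes. It assumes a schema can be reduced to a map of field name to (type, required); that shape and the `additive_violations` helper are hypothetical and are not taken from the schema registry itself.

```python
from typing import Dict, List, Tuple

# Hypothetical schema shape: field name -> (type name, required flag).
Schema = Dict[str, Tuple[str, bool]]


def additive_violations(old: Schema, new: Schema) -> List[str]:
    """Return the reasons why `new` is not an additive-only change to `old`."""
    violations: List[str] = []
    for name, (old_type, old_required) in old.items():
        if name not in new:
            violations.append(f"removed field: {name}")
            continue
        new_type, new_required = new[name]
        if new_type != old_type:
            violations.append(f"type changed: {name}")
        if new_required and not old_required:
            violations.append(f"optional field made required: {name}")
    # New fields are allowed only if existing producers can omit them.
    for name, (_, required) in new.items():
        if name not in old and required:
            violations.append(f"new required field: {name}")
    return violations


old = {"event_id": ("string", True), "tenant_id": ("string", True)}
ok = {**old, "trace_id": ("string", False)}
bad = {"event_id": ("int", True), "region": ("string", True)}
```

Under these assumptions, `additive_violations(old, ok)` is empty, while `additive_violations(old, bad)` reports the type change, the removed field, and the new required field.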

S1 — Minimal Learner (M1)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S1-R1	Offline bundle tamper/device-binding bug	S1	Content piracy; license bypass	AES-256-GCM per-device derivation; JWS signing; tamper CI fixtures; bundle chaos tests	Content + Security	KMS per tenant; device cert
S1-R2	AI tutor hallucination at learner surface	S1	Wrong answers in compliance training; regulatory exposure	RAG over lesson context; refusal UX; citation of source blocks; quarterly accuracy eval	AI + Learning	Prompt registry; eval harness
S1-R3	Local model quality gap	S2	Offline UX feels degraded	"Local model" badge; cloud-refresh CTA; quality eval per release	AI + Mobile Platform	Local-inference SDK
S1-R4	PlayPackage schema late freeze	S2	Player + Content-Packaging diverge	Freeze before M1 sprint 1; shared TS types	Content-Packaging + Learner	Block schema
S1-R5	License envelope expiry UX ambiguity	S2	Learners blocked without explanation	Clear countdown UX; proactive refresh on sync	Learner + Design	Sync service
S1-R6	Statement outbox overflow on long offline periods	S2	Lost statements	Chunked push; client-side caps; backpressure UX	Sync + Progress	IndexedDB quotas
S1-R7	Multi-device cursor resolution bug	S2	Learner confused about progress	`max(cursor)` reconciliation + tests; audit each reconciliation	Learner + Sync	Vector clock
S1-R8	Accessibility regressions on player	S2	WCAG 2.2 AA failure	axe in CI; manual NVDA + VoiceOver per release; reduced-motion toggle	Design + Learner FE	—
S1-R9	Capacitor ↔ web parity gaps	S3	Bugs appear only on mobile	Shared E2E fixtures; device farm tests	Mobile Platform + QA	Device farm
S1-R10	Insufficient diversity among design partners	S3	Missed feedback from regulated/remote users	Curate partner cohort (regulated, field, multilingual)	PM + Sales	—
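
The `max(cursor)` reconciliation named in S1-R7 can be illustrated with a small sketch. It assumes each device reports an integer cursor per lesson, which is a simplification of the vector clock listed as the dependency; `reconcile_cursors` and `ReconciliationAudit` are hypothetical names. An audit record is emitted whenever devices disagree, as one possible reading of "audit each reconciliation".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ReconciliationAudit:
    lesson_id: str
    chosen: int
    candidates: Dict[str, int]  # device id -> cursor reported by that device


def reconcile_cursors(
    reports: Dict[str, Dict[str, int]],
) -> Tuple[Dict[str, int], List[ReconciliationAudit]]:
    """Merge per-device cursors by taking the furthest position per lesson."""
    merged: Dict[str, int] = {}
    audits: List[ReconciliationAudit] = []
    lesson_ids = sorted({l for cursors in reports.values() for l in cursors})
    for lesson_id in lesson_ids:
        candidates = {
            device: cursors[lesson_id]
            for device, cursors in reports.items()
            if lesson_id in cursors
        }
        merged[lesson_id] = max(candidates.values())
        # Record an audit entry only when devices actually disagreed.
        if len(set(candidates.values())) > 1:
            audits.append(
                ReconciliationAudit(lesson_id, merged[lesson_id], candidates)
            )
    return merged, audits


reports = {"phone": {"l1": 4, "l2": 9}, "laptop": {"l1": 7}}
merged, audits = reconcile_cursors(reports)
```

Here `merged` keeps the laptop's cursor for `l1` and the phone's cursor for `l2`, and a single audit entry records the `l1` disagreement.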

S2 — Authoring MVP + AI Co-Author MVP (M2 first half)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S2-R1	Publish saga half-failures	S1	Orphan CourseVersions, broken catalog	Explicit compensations; chaos tests at every step; admin queue; saga state machine tests	Platform + Authoring + Content	Saga infra
S2-R2	Block registry rushed	S1	Block kind shape churn + rework	Block schema RFC + freeze at M2 start; new kinds additive only	Authoring + Architecture	Block schema ADR
S2-R3	AI co-author accept rate low	S2	Low adoption, wasted AI spend	Prompt regression gate at 50% accept rate; user-research cadence	AI + Authoring	Eval harness
S2-R4	Provenance UI complexity	S3	Admins ignore AI transparency	Progressive disclosure in UI; badge always visible	Design + Authoring FE	—
S2-R5	Media pipeline bottleneck on transcode	S2	Slow author feedback loop	Worker pool; backpressure UX; inline low-res preview	Media	—
S2-R6	Customer content shape surprises	S3	Real content breaks block validators	Partner beta with real content before freeze	Authoring + PM	—
S2-R7	Publish saga retries exhaust AI budget	S3	Unexpected AI cost spike	Idempotent AI calls; cache by prompt-hash; retry caps	AI + Authoring	—
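
For S2-R7, caching by prompt hash combined with a retry cap might look roughly like the sketch below. The `PromptCache` class, the in-memory dictionary, and the use of `RuntimeError` for transient failures are all assumptions made for illustration; a real deployment would presumably use a shared store and the error types of the `AIClient` port.

```python
import hashlib
from typing import Callable, Dict, List, Optional


class PromptCache:
    """Caches AI responses by prompt hash so saga retries do not re-bill."""

    def __init__(self, call_model: Callable[[str], str], max_attempts: int = 3):
        self._call_model = call_model
        self._max_attempts = max_attempts
        self._responses: Dict[str, str] = {}

    @staticmethod
    def key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def complete(self, prompt: str) -> str:
        k = self.key(prompt)
        if k in self._responses:
            return self._responses[k]  # idempotent: no second model call
        last_error: Optional[Exception] = None
        for _ in range(self._max_attempts):
            try:
                response = self._call_model(prompt)
            except RuntimeError as exc:  # assumed transient error type
                last_error = exc
                continue
            self._responses[k] = response
            return response
        raise RuntimeError(
            f"gave up after {self._max_attempts} attempts"
        ) from last_error


calls: List[str] = []


def fake_model(prompt: str) -> str:
    """Stand-in model that fails once, then succeeds."""
    calls.append(prompt)
    if len(calls) == 1:
        raise RuntimeError("transient failure")
    return f"summary of: {prompt}"


cache = PromptCache(fake_model, max_attempts=3)
first = cache.complete("lesson 1")
second = cache.complete("lesson 1")  # served from the cache
```

The stand-in model is called twice in total: one failed attempt and one success. The repeated request is answered from the cache, which is the property that keeps saga retries from multiplying AI spend.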

S3 — Marketplace MVP (M2 second half)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S3-R1	Payment compliance gaps	S1	PCI incident; processor termination	Tokenized cards only; PCI scope minimized; processor-abstract ACL	Commerce + Security	Processor sandbox
S3-R2	Refund edge cases leak seats	S2	Provider disputes; partial refunds wrong	Refund policy DSL + unit-tested matrix; refund-after-seat-consumed rule	Commerce + Legal	—
S3-R3	SCORM 1.2 conformance regression	S2	3rd-party LMS rejects zips	SCORM Cloud in CI every build; fixture courses	Content-Packaging	SCORM Cloud account
S3-R4	Webhook replay storms from customers	S3	DLQ + alert fatigue	Backoff + DLQ + dashboards; per-subscription limits	Comms + SRE	—
S3-R5	Marketplace low-quality listings at launch	S2	Brand damage	AI moderation + human review; provider onboarding standards	Commerce + AI	Moderation pipeline
S3-R6	Purchase saga split-brain with licensing	S1	Payment without license or license without payment	Idempotent saga + compensations + reconciliation job	Commerce + Platform	—
S3-R7	Public certificate verify abused for scraping	S3	Data harvest	Rate limit + bot mitigation + verification-token TTL scheme	Certification + Security	—
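
The reconciliation job listed under S3-R6 is, at its core, a comparison of two sets of order IDs. The following sketch assumes both the payment processor and the licensing service can export the order IDs they consider complete; `reconcile_purchases` and the result keys are hypothetical.

```python
from typing import Dict, List, Set


def reconcile_purchases(
    paid_order_ids: Set[str], licensed_order_ids: Set[str]
) -> Dict[str, List[str]]:
    """Report orders where payment and license state disagree."""
    return {
        # Customer was charged but never received a license.
        "paid_without_license": sorted(paid_order_ids - licensed_order_ids),
        # License was issued but no payment was captured.
        "license_without_payment": sorted(licensed_order_ids - paid_order_ids),
    }


report = reconcile_purchases({"o1", "o2", "o3"}, {"o2", "o3", "o4"})
```

In this example `o1` would need a license grant or a refund, and `o4` would need a payment check or a license revocation; which compensation applies is a policy decision outside the sketch.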

S4 — Compliance + Enterprise (M3)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S4-R1	RRULE + timezone correctness	S2	Wrong due dates; compliance failures	1 000-fixture suite incl. DST + leap; TZ matrix tests	Enterprise	RRULE engine
S4-R2	SAML edge cases per IdP	S2	Enterprise deals stall	Test Okta, Azure AD, Google, custom ADFS, Auth0	Enterprise + Platform	IdP test accounts
S4-R3	ABAC policy complexity breeds mis-grants	S1	Data leak within tenant	Policy linter; sample-data tests; UI shows plain-language policy	Platform	ABAC DSL
S4-R4	AI grading fairness	S1	Discrimination claims	Bias eval; human override; EU AI Act high-risk docs; external audit	AI + Compliance	Eval corpus
S4-R5	PDF→course quality variable	S2	Authors reject AI output	Confidence thresholds; chunk-level accept/reject; fallback to outline-only	AI + Authoring	—
S4-R6	Recurrence storm (many tenants activate on same day)	S2	Notification burst + queue overload	Jitter materialization; batch send; backpressure	Enterprise + Comms	—
S4-R7	GDPR erasure saga drift	S1	Erasure incomplete; regulator risk	Every service declares participation; CI gate; saga replay tests	Platform + Compliance	GDPR saga contract
S4-R8	SCORM 2004 + xAPI conformance misses	S2	Regulated market rejections	ADL suite in CI; cmi5 profile tests	Content-Packaging	ADL LRS
S4-R9	Enterprise procurement delays	S2	Revenue slips	SOC 2 Type I + DPA + BAA templates ready; reference customers	Enterprise + Legal	SOC 2 auditor
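
Jittered materialization (S4-R6) can be sketched as a deterministic per-tenant offset within a send window. The 60-minute default window, the hash-based offset, and the `jittered_send_time` name are assumptions for illustration, not settings taken from the roadmap. Deriving the offset from the tenant ID keeps reruns stable, so a retried job does not reschedule notifications.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def jittered_send_time(
    tenant_id: str, due: datetime, window_minutes: int = 60
) -> datetime:
    """Spread sends across a window using a stable per-tenant offset."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    offset_seconds = int.from_bytes(digest[:8], "big") % (window_minutes * 60)
    return due + timedelta(seconds=offset_seconds)


due = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
times = [jittered_send_time(f"tenant-{i}", due) for i in range(100)]
```

Every value in `times` falls within the hour after `due`, and calling the function again for the same tenant returns the same timestamp.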

S5 — Full Authoring + Offline Authoring (M4)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S5-R1	Offline authoring conflict UX	S1	Data loss perception	Pre-merge backup; side-by-side diff; AI merge suggestion; 30-day backup retention	Authoring + Sync	Conflict UI
S5-R2	Yjs doc corruption	S2	Collab session lost	Periodic snapshots; replay from event log; conflict repair tooling	Authoring	Yjs persistence
S5-R3	Live-collab latency across regions	S2	UX feels laggy	Regional WS endpoints; presence throttle; awareness compression	Authoring + SRE	—
S5-R4	AI image/TTS content-safety + copyright	S2	Legal exposure	Content-safety pipeline; provenance on every asset; copyright-risk classifier	Media + AI + Legal	—
S5-R5	LTI 1.3 interop quirks	S2	Embedding deals stall	LTI conformance tests; partner sandbox	Enterprise + Tenant	LTI tooling
S5-R6	Block taxonomy bloat	S2	Editor UX complexity	Governance board; usage telemetry; quarterly prune	Authoring + Design	—
S5-R7	Hybrid search ranker quality	S2	Low relevance	Eval with user-judged pairs; A/B ranker rollout	Data/AI + Search	—
S5-R8	AI translation errors on regulated terminology	S2	Legal risk	Per-tenant glossaries; reviewer required; legal-language flag	AI + Authoring	Glossary tooling

S6 — Scale + Advanced Insight + Mobile (M5)

ID	Risk	Sev	Impact	Mitigation	Owner	Dependency
S6-R1	Multi-region data residency migration bugs	S1	Data loss or cross-region leakage	Rehearsals on production-size fixture; checksum verification; rollback path; saga tests	Platform + SRE + all services	Residency saga
S6-R2	HIPAA provider allowlist enforcement	S1	BAA non-compliance	Tenant-tagged routing; CI gate on provider list; audit export	AI Services + Compliance	BAA contracts
S6-R3	Mobile native regressions from platform updates	S2	App-store rejection	Device farm; beta channel; staged rollout	Mobile Platform + QA	Device farm
S6-R4	Marketplace abuse at scale	S2	Brand damage; fraud loss	AI moderation v2; provider deposits; fraud-signal monitoring	Commerce + Security + AI	—
S6-R5	White-label CSP scoping bugs	S2	XSS across tenants	Per-tenant CSP + nonce; isolated subdomain + cookie scoping	Platform + Security	—
S6-R6	Developer SDK breaking-change temptations	S2	Integrator churn	Semver strictness; deprecation policy; communication channels	DevEx + PM	SDK governance
S6-R7	At-risk prediction model bias	S1	Unfair interventions	Quarterly bias eval; feature exclusion list; human-only override; opt-out	Data/AI + Compliance	Eval corpus
S6-R8	ISO 27001 certification scope mismatch	S2	Audit fail	Control mapping exercise early; internal audit pass	Compliance + SRE	Auditor
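
Checksum verification for the residency migration (S6-R1) could be approached as below. The sketch assumes rows are JSON-serializable dictionaries keyed by primary key and that both regions can be read during a rehearsal; `row_checksum` and `migration_mismatches` are hypothetical helpers.

```python
import hashlib
import json
from typing import Dict, List


def row_checksum(row: Dict[str, object]) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def migration_mismatches(
    source: Dict[str, Dict[str, object]], target: Dict[str, Dict[str, object]]
) -> List[str]:
    """Return primary keys that are missing or differ between regions."""
    mismatched: List[str] = []
    for key in sorted(source.keys() | target.keys()):
        if key not in source or key not in target:
            mismatched.append(key)
        elif row_checksum(source[key]) != row_checksum(target[key]):
            mismatched.append(key)
    return mismatched


source = {"u1": {"name": "Ada", "tenant": "t1"}, "u2": {"name": "Bo", "tenant": "t1"}}
target = {"u1": {"tenant": "t1", "name": "Ada"}, "u2": {"name": "Bob", "tenant": "t1"}}
mismatches = migration_mismatches(source, target)
```

Only `u2` is reported: `u1` differs in key order alone, which the canonical serialization ignores.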

Slice-Independent / Cross-Cutting Risks

ID	Risk	Sev	Impact	Mitigation	Owner
X-R1	AI cost runaway	S1	Surprise bills	Per-tenant budgets + soft-degrade + hard-stop + alerts	AI Services + Finance
X-R2	Over-eager AI defaults reduce trust	S2	Users distrust product	Default OFF per tenant; per-feature opt-in; transparent provenance	AI + Design
X-R3	Schema drift across services	S2	Pact breakage	Schema registry; CI gate; weekly producer review	Platform
X-R4	Solo on-call burnout	S2	Incident response quality drops	Rotation; buddy system; post-incident reviews weekly	SRE
X-R5	Regional compliance surprises	S2	Launch blockers	Legal-review per geo before launch	Legal + PM
X-R6	Pilot feedback overwrites roadmap	S3	Scope creep	PM triage; feedback lands in backlog with slice assignment	PM
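
The budget ladder in X-R1 (soft-degrade, then hard-stop) can be expressed as a small decision function. The 80% soft threshold and the names `BudgetDecision` and `budget_decision` are illustrative assumptions; the register does not specify thresholds.

```python
from enum import Enum


class BudgetDecision(Enum):
    ALLOW = "allow"
    SOFT_DEGRADE = "soft_degrade"  # e.g. route to a cheaper model
    HARD_STOP = "hard_stop"        # refuse further AI calls


def budget_decision(
    spent: float, budget: float, soft_ratio: float = 0.8
) -> BudgetDecision:
    """Map a tenant's spend against its budget to an enforcement action."""
    if budget <= 0 or spent >= budget:
        return BudgetDecision.HARD_STOP
    if spent >= soft_ratio * budget:
        return BudgetDecision.SOFT_DEGRADE
    return BudgetDecision.ALLOW


decisions = [budget_decision(s, 100.0) for s in (79.0, 80.0, 100.0)]
```

With a budget of 100, a spend of 79 is allowed, 80 triggers soft-degrade, and 100 triggers the hard stop. Alerts would hang off the transitions between these states.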

Governance

Weekly risk review: each team owner presents new, changed, or closed risks.
Quarterly architecture risk review: top 10 cross-cutting risks reviewed by CTO + architecture.
Every S1 risk has a named owner, a due date for mitigation, and a verification plan.
An S1/S2 risk counts as "closed" only when all acceptance criteria are met: mitigation shipped, metric(s) monitored, and post-mitigation verification test documented.
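
The closure criteria for S1/S2 risks can be checked mechanically if the register is kept as structured data. The sketch below assumes a hypothetical `RiskStatus` record with one boolean per criterion. It treats S3/S4 risks as having no extra closure gate, which is an assumption, since only S1/S2 criteria are stated here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RiskStatus:
    risk_id: str
    severity: str  # "S1" to "S4"
    mitigation_shipped: bool
    metrics_monitored: bool
    verification_documented: bool


def closure_blockers(risk: RiskStatus) -> List[str]:
    """List the unmet closure criteria for an S1/S2 risk."""
    if risk.severity not in ("S1", "S2"):
        return []  # assumption: no formal closure gate below S2
    blockers: List[str] = []
    if not risk.mitigation_shipped:
        blockers.append("mitigation not shipped")
    if not risk.metrics_monitored:
        blockers.append("metrics not monitored")
    if not risk.verification_documented:
        blockers.append("verification test not documented")
    return blockers


open_risk = RiskStatus("S0-R1", "S1", True, True, False)
low_risk = RiskStatus("S1-R9", "S3", False, False, False)
```

A risk can be marked closed when `closure_blockers` returns an empty list; for `open_risk` it still reports the missing verification test.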
