Failure Modes
:::info Source
Sourced from services/catalog-service/FAILURE_MODES.md in the documentation repo.
:::
1. Scenarios
1.1 Duplicate CourseVersion Registration
- Mitigation: unique (courseId, versionLabel); idempotent insert.
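
As a minimal sketch of the idempotent insert, the example below uses SQLite as a stand-in for Postgres so that it runs without a server. The table name `course_version`, the column names, and the helper `register_version` are illustrative assumptions, not the service's actual schema.

```python
import sqlite3


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical course_version table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE course_version (
            course_id     TEXT NOT NULL,
            version_label TEXT NOT NULL,
            UNIQUE (course_id, version_label)
        )
        """
    )
    return conn


def register_version(conn: sqlite3.Connection, course_id: str, version_label: str) -> bool:
    """Insert the version if absent; return True only when a new row was created."""
    cur = conn.execute(
        # The UNIQUE constraint turns a repeated registration into a no-op.
        "INSERT OR IGNORE INTO course_version (course_id, version_label) VALUES (?, ?)",
        (course_id, version_label),
    )
    conn.commit()
    return cur.rowcount == 1
```

In Postgres, the comparable statement is `INSERT ... ON CONFLICT (course_id, version_label) DO NOTHING`. Either way, the constraint rather than a prior existence check decides the outcome, so concurrent duplicate registrations cannot both succeed.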
1.2 Taxonomy Tree Corruption
- Mitigation: tree invariants at write; nightly integrity job.
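
The two mitigations could look roughly like the sketch below. It assumes the taxonomy can be modelled as a map from node id to parent id (with `None` marking a root); that model and the function names `check_reparent` and `find_violations` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory model: node id -> parent id (None marks a root).
ParentMap = Dict[str, Optional[str]]


def check_reparent(parents: ParentMap, node: str, new_parent: str) -> None:
    """Write-time check: raise ValueError if the move would create a cycle."""
    if node not in parents or new_parent not in parents:
        raise ValueError("unknown node or parent")
    seen = set()
    cursor: Optional[str] = new_parent
    # Walk up from the proposed parent; reaching `node` means a cycle.
    while cursor is not None and cursor not in seen:
        if cursor == node:
            raise ValueError(f"moving {node} under {new_parent} creates a cycle")
        seen.add(cursor)
        cursor = parents.get(cursor)


def find_violations(parents: ParentMap) -> List[str]:
    """Integrity-job check: ids whose ancestor chain is cyclic or dangling."""
    bad = []
    for node in parents:
        seen = set()
        cursor: Optional[str] = node
        while cursor is not None:
            if cursor in seen or cursor not in parents:
                bad.append(node)
                break
            seen.add(cursor)
            cursor = parents[cursor]
    return sorted(bad)
```

The write-time check rejects a single bad move cheaply, while the full scan catches corruption that arrived by some other path, such as a manual data fix.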
1.3 Missing PlayPackage Reference
- Cause: race between content and catalog.
- Mitigation: validate that playPackageRef is resolvable; retry with backoff.
1.4 Slug Collision
- Mitigation: UNIQUE constraint; user sees error with suggested alternatives.
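
One possible way to produce the suggested alternatives is sketched below. The numeric-suffix scheme and the function name `suggest_slugs` are assumptions; the source does not say how suggestions are generated.

```python
from typing import List, Set


def suggest_slugs(requested: str, taken: Set[str], count: int = 3) -> List[str]:
    """Return `count` free slugs formed by appending -2, -3, ... to `requested`."""
    suggestions: List[str] = []
    suffix = 2
    while len(suggestions) < count:
        candidate = f"{requested}-{suffix}"
        if candidate not in taken:
            suggestions.append(candidate)
        suffix += 1
    return suggestions
```

Suggestions like these are advisory only: another writer may claim a suggested slug before the user retries, so the UNIQUE constraint remains the final arbiter.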
1.5 Withdrawal Cascade Delay
- Cause: search and marketplace consumers lag in consuming the withdraw event.
- Mitigation: propagation is event-driven and typically completes in under 60s; in the worst case, fall back to an explicit re-publish.
2. Retry / Backoff
| Operation | Max retries | Backoff |
|---|---|---|
| Postgres write | 3 | 10ms–200ms |
| Outbox | unlimited | exponential, capped at 5m |
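
The table could be encoded as retry policies along the lines of the sketch below. Several details are assumptions not stated in the source: that "Max" counts retries, that the Postgres delay doubles from 10ms up to the 200ms ceiling, and that the outbox starts at 1s. Jitter is omitted to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: Optional[int]  # None means retry without limit
    base_delay_s: float         # delay before the first retry
    max_delay_s: float          # ceiling for any single delay

    def delay(self, retry_number: int) -> float:
        """Capped exponential delay in seconds before retry `retry_number` (1-based)."""
        exponent = min(retry_number - 1, 32)  # bound the exponent to avoid overflow
        return min(self.max_delay_s, self.base_delay_s * 2 ** exponent)

    def allows_retry(self, retries_done: int) -> bool:
        """True if another retry is permitted after `retries_done` retries."""
        return self.max_retries is None or retries_done < self.max_retries


# Values from the table; the doubling and the 1s outbox base are assumptions.
POSTGRES_WRITE = RetryPolicy(max_retries=3, base_delay_s=0.010, max_delay_s=0.200)
OUTBOX = RetryPolicy(max_retries=None, base_delay_s=1.0, max_delay_s=300.0)
```

Under these assumptions, a Postgres write would wait 10ms, 20ms, and 40ms before giving up, while an outbox delivery would keep retrying with delays that level off at five minutes.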
3. Fallbacks
- If the CDN cache is stale, read directly from Postgres.
4. Chaos
- Duplicate publish event → single version registered.
- Out-of-order events → idempotent final state.
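
Both chaos expectations can be illustrated with a small sketch. It assumes each event carries a per-version sequence number so that a consumer can discard stale or repeated deliveries; the `Event` shape, the `seq` field, and `apply_events` are hypothetical and not taken from the service.

```python
from itertools import permutations
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Event(NamedTuple):
    course_id: str
    version_label: str
    seq: int     # hypothetical per-version sequence number set by the producer
    status: str  # "published" or "withdrawn"


State = Dict[Tuple[str, str], Tuple[int, str]]


def apply_events(events: Iterable[Event]) -> State:
    """Fold events into state, keeping only the highest-seq status per version."""
    state: State = {}
    for ev in events:
        key = (ev.course_id, ev.version_label)
        current = state.get(key)
        # Duplicates and stale events have seq <= the stored seq and are ignored.
        if current is None or ev.seq > current[0]:
            state[key] = (ev.seq, ev.status)
    return state


def converges(events: List[Event]) -> bool:
    """True if every delivery order of `events` yields the same final state."""
    expected = apply_events(events)
    return all(apply_events(order) == expected for order in permutations(events))
```

A chaos test in this style would feed `converges` a list containing a duplicated publish and a later withdraw, then assert that the result is `True` and that only one entry exists for the version.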