Testing
:::info Source
Sourced from services/delivery-service/TESTING_STRATEGY.md in the documentation repo.
:::
Companion: 16 Testing Strategy QA · APPLICATION_LOGIC
1. Scope & Coverage Target
- Overall coverage: ≥ 85% (the platform minimum is 80%; raised for delivery because it is a Core domain)
- Domain layer coverage: ≥ 95%
- Application layer coverage: ≥ 90%
- Infrastructure layer coverage: ≥ 75%
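The per-layer targets above could be enforced by a small gate that runs after the coverage report is produced. The following is a minimal sketch, not the project's actual tooling: the layer names, the `LayerCoverage` summary shape, and `checkCoverage` are assumptions made for illustration, and real coverage tools report per-file data that would first need to be grouped by layer.

```typescript
// Hypothetical layer names; the real report would be grouped from per-file data.
type Layer = "overall" | "domain" | "application" | "infrastructure";

// Minimum line coverage (percent) per layer, taken from the targets above.
const THRESHOLDS: Record<Layer, number> = {
  overall: 85,
  domain: 95,
  application: 90,
  infrastructure: 75,
};

// Hypothetical summary shape for one layer.
interface LayerCoverage {
  covered: number; // lines executed by tests
  total: number; // executable lines
}

// Returns one human-readable failure per layer that misses its threshold.
function checkCoverage(summary: Record<Layer, LayerCoverage>): string[] {
  const failures: string[] = [];
  for (const layer of Object.keys(THRESHOLDS) as Layer[]) {
    const { covered, total } = summary[layer];
    // An empty layer counts as fully covered rather than dividing by zero.
    const pct = total === 0 ? 100 : (covered * 100) / total;
    if (pct < THRESHOLDS[layer]) {
      failures.push(`${layer}: ${pct.toFixed(1)}% < ${THRESHOLDS[layer]}%`);
    }
  }
  return failures;
}
```

A CI step would fail the build whenever the returned array is non-empty.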
2. Test Pyramid
┌───────────────────┐
│ E2E (5%)          │ Critical user journeys (offline->online, tutor flow)
├───────────────────┤
│ Contract (10%)    │ Provider/consumer contract tests
├───────────────────┤
│ Integration (25%) │ Service + DB + NATS + Redis
├───────────────────┤
│ Unit (60%)        │ Domain + use cases + adapters
└───────────────────┘
3. Unit Tests
3.1 Domain Layer
Pure TypeScript, zero infrastructure dependencies. Run with vitest or jest.
Coverage targets:
- Every aggregate state transition
- Every invariant violation
- Every value object construction path
- Every domain service decision path
Examples:
test('PlaySession rejects navigation when in paused state', () => {
// Arrange
const session = makePlaySession({ state: 'paused' });
// Act + Assert: any navigation intent must be rejected while paused
expect(() => session.navigate({ type: 'next' })).toThrow(InvalidStateError);
});
test('NavigationService advances to next lesson on next() with no prerequisites', () => {
// Arrange
const manifest = makeManifest();
const cursor = { moduleId: 'm1', lessonId: 'l1', sequenceIndex: 0 };
// Act
const result = NavigationService.resolve(cursor, { type: 'next' }, manifest);
// Assert
expect(result.lessonId).toBe('l2');
expect(result.sequenceIndex).toBe(1);
});
test('PlaySession completion requires all required lessons visited', () => {
// Arrange
const manifest = makeManifest();
const session = makePlaySession({
state: 'active',
lessonsVisited: ['l1', 'l2'] // missing l3
});
// Act + Assert
expect(() => session.complete(manifest)).toThrow(CompletionRequirementsUnmetError);
});
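To make the invariants exercised by these tests concrete, here is a minimal sketch of an aggregate that would satisfy them. It is an illustration under stated assumptions, not the real domain model: the `Manifest` shape, the `currentState` getter, and the simplified `navigate(lessonId)` signature are hypothetical, and only `PlaySession`, `makePlaySession`, `InvalidStateError`, and `CompletionRequirementsUnmetError` are names taken from the examples above.

```typescript
// Base class keeps `instanceof` reliable even when compiled to older targets.
class DomainError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class InvalidStateError extends DomainError {}
class CompletionRequirementsUnmetError extends DomainError {}

type SessionState = "active" | "paused" | "completed";

// Hypothetical, simplified manifest: only the lessons required for completion.
interface Manifest {
  requiredLessonIds: string[];
}

class PlaySession {
  constructor(
    private state: SessionState,
    private readonly lessonsVisited: Set<string>,
  ) {}

  get currentState(): SessionState {
    return this.state;
  }

  // Invariant: navigation is only legal while the session is active.
  navigate(lessonId: string): void {
    if (this.state !== "active") {
      throw new InvalidStateError(`cannot navigate while ${this.state}`);
    }
    this.lessonsVisited.add(lessonId);
  }

  // Invariant: completion requires every required lesson to have been visited.
  complete(manifest: Manifest): void {
    if (this.state !== "active") {
      throw new InvalidStateError(`cannot complete while ${this.state}`);
    }
    const missing = manifest.requiredLessonIds.filter(
      (id) => !this.lessonsVisited.has(id),
    );
    if (missing.length > 0) {
      throw new CompletionRequirementsUnmetError(`unvisited: ${missing.join(", ")}`);
    }
    this.state = "completed";
  }
}

// Factory with overridable defaults, mirroring makePlaySession in the tests above.
function makePlaySession(
  overrides: { state?: SessionState; lessonsVisited?: string[] } = {},
): PlaySession {
  return new PlaySession(
    overrides.state ?? "active",
    new Set(overrides.lessonsVisited ?? []),
  );
}
```

Because the class has no infrastructure dependencies, each state transition and invariant violation can be tested with plain function calls.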
3.2 Application Layer
Use case handlers with mocked ports.
test('StartPlaySessionHandler emits event and persists aggregate', async () => {
// Arrange
const deps = makeHandlerDeps({
enrollment: { status: 'active' }
});
const handler = new StartPlaySessionHandler(deps);
// Act
await handler.handle({ enrollmentId, courseVersionId, deviceId });
// Assert
expect(deps.repo.save).toHaveBeenCalledOnce();
expect(deps.eventPublisher.publish).toHaveBeenCalledWith(
expect.objectContaining({ type: 'delivery.play_session.started.v1' })
);
});
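The test above relies on mocked ports. As a framework-free sketch of the same idea, the handler below depends only on interfaces, and hand-written recording fakes stand in for the mocks. The port names (`EnrollmentReader`, `PlaySessionRepo`, `EventPublisher`), the command fields, and the persist-then-publish ordering are assumptions for illustration; only the handler name and the event type string come from the example above.

```typescript
// Hypothetical record and event shapes.
interface PlaySessionRecord {
  id: string;
  enrollmentId: string;
  state: "active";
}
interface DomainEvent {
  type: string;
  sessionId: string;
}

// Hypothetical ports the handler depends on.
type EnrollmentStatus = "active" | "suspended";
interface EnrollmentReader {
  getStatus(enrollmentId: string): Promise<EnrollmentStatus>;
}
interface PlaySessionRepo {
  save(session: PlaySessionRecord): Promise<void>;
}
interface EventPublisher {
  publish(event: DomainEvent): Promise<void>;
}
interface HandlerDeps {
  enrollments: EnrollmentReader;
  repo: PlaySessionRepo;
  eventPublisher: EventPublisher;
}

class StartPlaySessionHandler {
  constructor(private readonly deps: HandlerDeps) {}

  async handle(cmd: { enrollmentId: string; sessionId: string }): Promise<void> {
    const status = await this.deps.enrollments.getStatus(cmd.enrollmentId);
    if (status !== "active") {
      throw new Error("enrollment is not active");
    }
    // Persist first, then publish, so an event never refers to an unsaved session.
    await this.deps.repo.save({
      id: cmd.sessionId,
      enrollmentId: cmd.enrollmentId,
      state: "active",
    });
    await this.deps.eventPublisher.publish({
      type: "delivery.play_session.started.v1",
      sessionId: cmd.sessionId,
    });
  }
}

// Recording fakes: each port call is captured so the test can assert on it.
function makeHandlerDeps(enrollmentStatus: EnrollmentStatus) {
  const saved: PlaySessionRecord[] = [];
  const published: DomainEvent[] = [];
  const deps: HandlerDeps = {
    enrollments: { getStatus: async () => enrollmentStatus },
    repo: { save: async (s) => { saved.push(s); } },
    eventPublisher: { publish: async (e) => { published.push(e); } },
  };
  return { deps, saved, published };
}
```

With a mocking library, the recording arrays would be replaced by spies, but the handler under test stays the same.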
4. Integration Tests
Run against Testcontainers (Postgres + Redis + NATS JetStream). Each test runs against its own isolated schema.
Covers:
- Repository implementations against real Postgres
- RLS enforcement (tenant isolation)
- Outbox + inbox integration
- Event publishing and consumption via NATS
- Redis caching behavior
describe('PlaySessionRepository integration', () => {
it('enforces RLS: cannot read session from different tenant', async () => {
// Arrange
await setAppTenantId(tenant1);
const saved = await repo.save(makeSession({ tenantId: tenant1 }));
// Act
await setAppTenantId(tenant2);
const result = await repo.findById(saved.id);
// Assert
expect(result).toBeNull();
});
});
5. Contract Tests
5.1 API Contract (OpenAPI)
- Schema derived from NestJS decorators.
- Validated against OpenAPI 3.1 spec via openapi-examples-validator.
- Consumer-driven contracts with frontend and mobile clients via Pact.
5.2 Event Contract
- JSON Schema per event type in event-schemas/.
- Producer: every event published is validated pre-emit.
- Consumer: every event consumed is validated pre-handle.
- Contract tests verify delivery's produced events match the schemas consumed by progress-service, analytics-service, sync-service.
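The producer-side rule above (validate every event before emitting it) can be sketched as a guard around the publish call. This is an illustration only: the envelope fields are assumptions, and a hand-written structural check is used here in place of a real JSON Schema validator so that the example stays self-contained. The actual schemas live under event-schemas/.

```typescript
// Hypothetical envelope shape for one event type.
interface PlaySessionStartedV1 {
  type: "delivery.play_session.started.v1";
  tenantId: string;
  sessionId: string;
  occurredAt: string; // ISO 8601 timestamp
}

// Hand-written structural check standing in for a JSON Schema validator.
// Returns a list of problems; an empty list means the event is valid.
function validatePlaySessionStarted(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["event must be an object"];
  }
  const event = input as Record<string, unknown>;
  const errors: string[] = [];
  if (event["type"] !== "delivery.play_session.started.v1") {
    errors.push("type: unexpected value");
  }
  for (const field of ["tenantId", "sessionId", "occurredAt"]) {
    const value = event[field];
    if (typeof value !== "string" || value.length === 0) {
      errors.push(`${field}: must be a non-empty string`);
    }
  }
  const occurredAt = event["occurredAt"];
  if (typeof occurredAt === "string" && Number.isNaN(Date.parse(occurredAt))) {
    errors.push("occurredAt: not a parseable timestamp");
  }
  return errors;
}

// Producer-side guard: refuse to emit anything that fails validation.
function publishValidated(
  event: unknown,
  emit: (e: PlaySessionStartedV1) => void,
): void {
  const errors = validatePlaySessionStarted(event);
  if (errors.length > 0) {
    throw new Error(`invalid event: ${errors.join("; ")}`);
  }
  emit(event as PlaySessionStartedV1);
}
```

The consumer side would run the same validation before handing the event to its handler.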
5.3 Provider Tests
Delivery is a provider for:
- Web/mobile clients (REST + SSE)
- Internal services (via NATS events)
Provider tests run in CI whenever schemas change.
6. E2E Tests (Playwright + API)
Critical journeys:
| Journey | Description |
|---|---|
| player-e2e-01-basic-playback | Learner starts course, navigates through lessons, completes. Verifies correct events emitted to progress-service. |
| player-e2e-02-tutor-flow | Learner starts session, asks AI tutor 3 questions, rates responses. Verifies tutor turns persisted. |
| player-e2e-03-offline-mount-online-sync | Mount bundle offline, simulate offline navigation, reconnect, verify sync-service reconciles. |
| player-e2e-04-tamper-response | Tamper with local bundle, verify force-unmount propagates. |
| player-e2e-05-scorm-runtime | SCORM 2004 course completion flow (S4 slice). |
| player-e2e-06-branching | Branching scenario with multiple paths; verify cursor tracks correctly. |
The E2E suite runs nightly on staging and on every merge to main. Target runtime: < 20 min.
7. Offline Testing
Delivery is offline-critical. Dedicated test matrix:
| Scenario | Test |
|---|---|
| Start session offline | Client-driven test with network disabled in Playwright |
| Navigate offline | Verify local state correctly reflects server-side session after reconnect |
| Tutor turn offline | Verify local AI model produces response; verify aiProvenance.local = true |
| Bundle tamper during offline | Simulate corrupted bundle; verify runtime rejects |
| Clock skew | Device clock drifted by 30 min; verify session still syncs |
| Multi-device offline | Device A and B offline with same session; reconnect; verify conflict resolution |
| Storage full | Verify graceful handling when local storage exhausted |
8. AI Testing
8.1 Prompt Regression
Golden-set of 50 tutor prompts with expected quality bands. Run against current model weekly:
- Relevance score (on-topic)
- Accuracy (facts from lesson)
- Safety (no toxicity, no PII in output)
8.2 Safety
- Prompt injection test set (100+ adversarial prompts)
- Curriculum drift test set (off-topic queries)
- Expected: ai-gateway blocks; delivery returns fallback
8.3 Local Model Quality
Comparison between cloud and local model on golden set. Acceptable quality degradation: local must achieve ≥ 70% of cloud quality.
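The 70% threshold can be checked mechanically once both models have been scored on the golden set. The sketch below assumes one quality score in [0, 1] per prompt and compares the mean local score to the mean cloud score; both the score shape and the choice of mean-over-mean as the aggregate are assumptions, since the source does not say how "quality" is aggregated.

```typescript
// Hypothetical result shape: one quality score in [0, 1] per golden-set prompt.
interface GoldenResult {
  promptId: string;
  cloudScore: number;
  localScore: number;
}

// Local must reach at least 70% of cloud quality (target stated above).
const MIN_LOCAL_TO_CLOUD_RATIO = 0.7;

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Ratio of mean local score to mean cloud score (an assumed aggregation).
function localQualityRatio(results: GoldenResult[]): number {
  if (results.length === 0) {
    throw new Error("golden set is empty");
  }
  const cloudMean = mean(results.map((r) => r.cloudScore));
  if (cloudMean === 0) {
    throw new Error("cloud baseline is zero; ratio is undefined");
  }
  return mean(results.map((r) => r.localScore)) / cloudMean;
}

function localModelPasses(results: GoldenResult[]): boolean {
  return localQualityRatio(results) >= MIN_LOCAL_TO_CLOUD_RATIO;
}
```

A per-prompt ratio with a floor on the worst case would be a stricter alternative if a few very poor local answers should fail the run.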
9. Load & Performance
9.1 Load Profile
| Scenario | Target |
|---|---|
| Concurrent active sessions | 100,000 / tenant |
| Navigation events | 10,000 / sec platform-wide |
| Tutor turns | 1,000 / sec platform-wide |
| Session starts | 500 / sec platform-wide |
9.2 Tooling
- k6 for HTTP load
- nats-bench for NATS throughput
9.3 SLOs Under Load
- Navigation p95 < 300ms at peak
- Tutor turn TTFT p95 < 1.5s at peak
- Zero data loss under 10% packet loss
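To show how a p95 target such as the navigation SLO is evaluated from raw samples, here is a small sketch using the nearest-rank percentile definition. Load tools usually report percentiles themselves and may interpolate differently; the function names and the choice of nearest-rank are assumptions made for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

// Navigation p95 budget in milliseconds (target stated above).
const NAVIGATION_P95_BUDGET_MS = 300;

function meetsNavigationSlo(latenciesMs: number[]): boolean {
  return percentile(latenciesMs, 95) < NAVIGATION_P95_BUDGET_MS;
}
```

With 100 samples, the p95 is the 95th smallest value, so up to five slow outliers can exceed the budget without breaching the SLO.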
10. Chaos & Resilience
| Experiment | Frequency |
|---|---|
| Kill random pod | Weekly |
| Database failover | Monthly |
| Redis failover | Monthly |
| NATS partition | Monthly |
| AI gateway down | Weekly (should gracefully degrade) |
| Content service down | Weekly (manifest cache should serve) |
All chaos experiments run in staging. Production chaos runs are feature-flagged and opt-in per tenant.
11. Security Testing
- SAST: Semgrep on every PR (no HIGH findings)
- DAST: OWASP ZAP against staging nightly
- Dependency scan: Snyk + npm audit on every PR
- Two-tenant simulator: Integration test asserting RLS on every endpoint
- JWT fuzzing: Weekly
- License envelope fuzzing: Weekly
- Pen test: Annual + after major releases
12. Replay & Rebuild
Given 04 Event-Driven §14, delivery must support rebuilding read models from events:
| Test | Purpose |
|---|---|
| rebuild-session-state-from-events | From a session's event history, reconstruct the same PlaySession final state |
| rebuild-gate-status-from-events | Rebuild gate_status projection from assessment events |
Replay tests run weekly in staging.
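A rebuild test of this kind typically folds the event history through a pure reducer and compares the result with the stored state. The sketch below illustrates that technique with hypothetical event and state shapes; the real PlaySession events and projection fields are not specified in this section.

```typescript
// Hypothetical, simplified session events.
type SessionEvent =
  | { type: "started"; lessonId: string }
  | { type: "navigated"; lessonId: string }
  | { type: "paused" }
  | { type: "resumed" }
  | { type: "completed" };

// Hypothetical read-model state.
interface SessionState {
  status: "not_started" | "active" | "paused" | "completed";
  currentLessonId: string | null;
  lessonsVisited: string[];
}

const INITIAL_STATE: SessionState = {
  status: "not_started",
  currentLessonId: null,
  lessonsVisited: [],
};

// Records a visit without duplicating lessons already seen.
function visit(visited: string[], lessonId: string): string[] {
  return visited.indexOf(lessonId) >= 0 ? visited : [...visited, lessonId];
}

// Pure reducer: same state and event always produce the same next state.
function applyEvent(state: SessionState, event: SessionEvent): SessionState {
  switch (event.type) {
    case "started":
      return {
        status: "active",
        currentLessonId: event.lessonId,
        lessonsVisited: visit(state.lessonsVisited, event.lessonId),
      };
    case "navigated":
      return {
        ...state,
        currentLessonId: event.lessonId,
        lessonsVisited: visit(state.lessonsVisited, event.lessonId),
      };
    case "paused":
      return { ...state, status: "paused" };
    case "resumed":
      return { ...state, status: "active" };
    case "completed":
      return { ...state, status: "completed" };
  }
}

// Rebuilds the final state by replaying the full history from the initial state.
function rebuild(events: SessionEvent[]): SessionState {
  return events.reduce<SessionState>((state, event) => applyEvent(state, event), INITIAL_STATE);
}
```

Because the reducer is pure and never mutates its input, replaying the same history twice must yield identical states, which is the property a replay test asserts.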
13. CI/CD Quality Gates
All gates must pass before merge to main:
| Gate | Tool |
|---|---|
| Lint | ESLint + Prettier |
| Type check | tsc --noEmit |
| Unit tests | vitest (coverage ≥ 85%) |
| Integration tests | vitest + Testcontainers |
| Contract tests | Pact |
| SAST | Semgrep |
| Dependency scan | Snyk |
| OpenAPI validation | openapi-examples-validator |
| Two-tenant simulator | custom harness |
| Build | docker build |
A merge queue is enforced: PRs merge in order, and tests rerun against each fresh rebase.
14. Test Data Management
- Factories: @test/factories/delivery (produces valid aggregates, events, DTOs)
- Seed data: db/seeds/delivery (loaded into staging DB for manual testing)
- PII-free: All test data uses synthetic names + @example.com emails
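A factory in the style described above usually returns a valid object with synthetic defaults and lets each test override only the fields it cares about. The following is a minimal sketch; `LearnerDto`, `makeLearner`, and `assertSynthetic` are hypothetical names for illustration and are not taken from @test/factories/delivery.

```typescript
// Hypothetical DTO used only for this illustration.
interface LearnerDto {
  id: string;
  tenantId: string;
  displayName: string;
  email: string;
}

// Monotonic counter so every generated learner is unique within a test run.
let sequence = 0;

// Builds a valid learner with synthetic, PII-free defaults; overrides win.
function makeLearner(overrides: Partial<LearnerDto> = {}): LearnerDto {
  sequence += 1;
  return {
    id: `learner-${sequence}`,
    tenantId: "tenant-test",
    displayName: `Test Learner ${sequence}`,
    email: `learner${sequence}@example.com`,
    ...overrides,
  };
}

// Guard that can run over factory output and seed data to keep fixtures PII-free.
function assertSynthetic(dto: LearnerDto): void {
  if (!dto.email.endsWith("@example.com")) {
    throw new Error(`non-synthetic email in test data: ${dto.email}`);
  }
}
```

For example, `makeLearner({ tenantId: "tenant-2" })` yields a second-tenant learner while keeping the synthetic name and email defaults.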