Testing

:::info Source
Sourced from services/delivery-service/TESTING_STRATEGY.md in the documentation repo.
:::

Companion: 16 Testing Strategy QA · APPLICATION_LOGIC

1. Scope & Coverage Target

Overall coverage: ≥ 85% (the platform minimum is 80%; delivery is a Core domain, so its target is raised)
Domain layer coverage: ≥ 95%
Application layer coverage: ≥ 90%
Infrastructure layer coverage: ≥ 75%
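
The per-layer targets above can be enforced mechanically. The following is a minimal sketch, assuming the coverage tool reports one line-coverage percentage per layer; the `Layer` names, the report shape, and `findCoverageShortfalls` are hypothetical and not part of the delivery codebase.

```typescript
// Hypothetical layer names and report shape; the numbers are the targets listed above.
type Layer = 'overall' | 'domain' | 'application' | 'infrastructure';

const COVERAGE_THRESHOLDS: Record<Layer, number> = {
  overall: 85,
  domain: 95,
  application: 90,
  infrastructure: 75,
};

// Returns the layers whose measured coverage (in percent) falls below its target.
function findCoverageShortfalls(measured: Record<Layer, number>): Layer[] {
  return (Object.keys(COVERAGE_THRESHOLDS) as Layer[]).filter(
    (layer) => measured[layer] < COVERAGE_THRESHOLDS[layer],
  );
}

// Sample report: domain coverage of 93.2% misses its 95% target.
const shortfalls = findCoverageShortfalls({
  overall: 88.1,
  domain: 93.2,
  application: 91.0,
  infrastructure: 80.4,
});
console.log(shortfalls);
```

A CI step could fail the build whenever the returned list is non-empty, which keeps the stricter domain target from being hidden behind a healthy overall number.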

2. Test Pyramid

          ┌───────────────────┐
          │     E2E (5%)      │  Critical user journeys (offline->online, tutor flow)
          ├───────────────────┤
          │  Contract (10%)   │  Provider/consumer contract tests
          ├───────────────────┤
          │ Integration (25%) │  Service + DB + NATS + Redis
          ├───────────────────┤
          │    Unit (60%)     │  Domain + use cases + adapters
          └───────────────────┘

3. Unit Tests

3.1 Domain Layer

Pure TypeScript, zero infrastructure dependencies. Run with vitest or jest.

Coverage targets:

Every aggregate state transition
Every invariant violation
Every value object construction path
Every domain service decision path

Examples:

test('PlaySession rejects navigation when in paused state', () => {
  // Arrange
  const session = makePlaySession({ state: 'paused' });

  // Act + Assert
  // Command shape assumed to match the one passed to NavigationService.resolve below
  expect(() => session.navigate({ type: 'next' })).toThrow(InvalidStateError);
});

test('NavigationService advances to next lesson on next() with no prerequisites', () => {
  // Arrange
  const manifest = makeManifest();
  const cursor = { moduleId: 'm1', lessonId: 'l1', sequenceIndex: 0 };

  // Act
  const result = NavigationService.resolve(cursor, { type: 'next' }, manifest);

  // Assert
  expect(result.lessonId).toBe('l2');
  expect(result.sequenceIndex).toBe(1);
});

test('PlaySession completion requires all required lessons visited', () => {
  // Arrange
  const manifest = makeManifest();  // assumed to mark l1, l2, and l3 as required
  const session = makePlaySession({
    state: 'active',
    lessonsVisited: ['l1', 'l2'],   // missing l3
  });

  // Act + Assert
  expect(() => session.complete(manifest)).toThrow(CompletionRequirementsUnmetError);
});
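
To make the invariants exercised by these tests concrete, here is a minimal sketch of an aggregate that would satisfy the first and third test. It is an illustration only: the class shape, the `Manifest.requiredLessonIds` field, and the factory signature are assumptions, and the real `PlaySession` may differ.

```typescript
// Hypothetical minimal aggregate. Names mirror the tests above; the real model may differ.
class InvalidStateError extends Error {
  name = 'InvalidStateError';
}
class CompletionRequirementsUnmetError extends Error {
  name = 'CompletionRequirementsUnmetError';
}

type SessionState = 'active' | 'paused' | 'completed';
interface Manifest {
  requiredLessonIds: string[];
}

class PlaySession {
  state: SessionState;
  lessonsVisited: string[];

  constructor(state: SessionState, lessonsVisited: string[]) {
    this.state = state;
    this.lessonsVisited = lessonsVisited;
  }

  // Invariant: navigation is only legal while the session is active.
  navigate(command: { type: 'next' | 'previous' }): void {
    if (this.state !== 'active') {
      throw new InvalidStateError(`cannot navigate (${command.type}) while ${this.state}`);
    }
    // Cursor movement is omitted from this sketch.
  }

  // Invariant: every required lesson must have been visited before completion.
  complete(manifest: Manifest): void {
    const missing = manifest.requiredLessonIds.filter(
      (id) => !this.lessonsVisited.includes(id),
    );
    if (missing.length > 0) {
      throw new CompletionRequirementsUnmetError(`unvisited lessons: ${missing.join(', ')}`);
    }
    this.state = 'completed';
  }
}

// Test factory: defaults produce a valid session, overrides set up the case under test.
function makePlaySession(
  overrides: Partial<{ state: SessionState; lessonsVisited: string[] }> = {},
): PlaySession {
  return new PlaySession(overrides.state ?? 'active', overrides.lessonsVisited ?? []);
}
```

Keeping the factory's defaults valid means each test only spells out the one field that matters for the invariant it checks.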

3.2 Application Layer

Use case handlers with mocked ports.

test('StartPlaySessionHandler emits event and persists aggregate', async () => {
  // Arrange
  const deps = makeHandlerDeps({
    enrollment: { status: 'active' }
  });
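  const handler = new StartPlaySessionHandler(deps);
  // Placeholder IDs; any valid identifiers work here
  const command = { enrollmentId: 'enr-1', courseVersionId: 'cv-1', deviceId: 'dev-1' };

  // Act
  await handler.handle(command);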

  // Assert
  expect(deps.repo.save).toHaveBeenCalledOnce();
  expect(deps.eventPublisher.publish).toHaveBeenCalledWith(
    expect.objectContaining({ type: 'delivery.play_session.started.v1' })
  );
});
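
The test above relies on mocked ports. As a framework-free illustration of the same idea, the sketch below wires a handler to hand-written recording fakes. Everything except the event type string is hypothetical: the port interfaces, the handler body, and `makeFakeDeps` are assumptions made for the example, and the handler publishes directly rather than through an outbox to keep the sketch short.

```typescript
// Hypothetical ports and handler. Only the event type string comes from the test above.
interface PlaySessionRecord {
  id: string;
  enrollmentId: string;
  state: 'active';
}
interface DeliveryEvent {
  type: string;
  sessionId: string;
}
interface HandlerDeps {
  enrollments: { getStatus(enrollmentId: string): Promise<'active' | 'suspended'> };
  repo: { save(session: PlaySessionRecord): Promise<void> };
  eventPublisher: { publish(event: DeliveryEvent): Promise<void> };
}

class StartPlaySessionHandler {
  private readonly deps: HandlerDeps;

  constructor(deps: HandlerDeps) {
    this.deps = deps;
  }

  async handle(command: { enrollmentId: string }): Promise<string> {
    const status = await this.deps.enrollments.getStatus(command.enrollmentId);
    if (status !== 'active') {
      throw new Error('enrollment is not active');
    }
    const session: PlaySessionRecord = {
      id: `ps-${command.enrollmentId}`,
      enrollmentId: command.enrollmentId,
      state: 'active',
    };
    await this.deps.repo.save(session);
    await this.deps.eventPublisher.publish({
      type: 'delivery.play_session.started.v1',
      sessionId: session.id,
    });
    return session.id;
  }
}

// Recording fakes: each port appends its argument to an array the test can inspect.
function makeFakeDeps(enrollmentStatus: 'active' | 'suspended') {
  const saved: PlaySessionRecord[] = [];
  const published: DeliveryEvent[] = [];
  const deps: HandlerDeps = {
    enrollments: { getStatus: async () => enrollmentStatus },
    repo: { save: async (session) => { saved.push(session); } },
    eventPublisher: { publish: async (event) => { published.push(event); } },
  };
  return { deps, saved, published };
}
```

Because the handler only sees the port interfaces, the same class runs unchanged against these fakes, against spy-based mocks, or against the real adapters in integration tests.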

4. Integration Tests

Run against Testcontainers (Postgres + Redis + NATS JetStream). Each test runs against its own isolated schema.

Covers:

Repository implementations against real Postgres
RLS enforcement (tenant isolation)
Outbox + inbox integration
Event publishing and consumption via NATS
Redis caching behavior

describe('PlaySessionRepository integration', () => {
  it('enforces RLS: cannot read session from different tenant', async () => {
    // Arrange
    await setAppTenantId(tenant1);
    const saved = await repo.save(makeSession({ tenantId: tenant1 }));

    // Act
    await setAppTenantId(tenant2);
    const result = await repo.findById(saved.id);

    // Assert
    expect(result).toBeNull();
  });
});
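
The RLS test switches tenants through `setAppTenantId`. One way such a helper might work is sketched below, assuming the RLS policies compare each row's tenant with a session setting named `app.tenant_id`. The setting name, the `Queryable` interface, and the policy in the comment are assumptions; `set_config` and `current_setting` are standard Postgres functions.

```typescript
// Minimal query interface so the sketch does not depend on a specific Postgres client.
interface Queryable {
  query(text: string, values: string[]): Promise<unknown>;
}

// Assumed policy shape (table and column names are hypothetical):
//   CREATE POLICY tenant_isolation ON play_sessions
//     USING (tenant_id::text = current_setting('app.tenant_id'));
//
// set_config(name, value, is_local) is the parameterizable form of SET.
// is_local = false keeps the value for the rest of the connection's session.
async function setAppTenantId(db: Queryable, tenantId: string): Promise<void> {
  await db.query("SELECT set_config('app.tenant_id', $1, false)", [tenantId]);
}
```

Under this assumption the setting is per connection, so the helper and the repository call it guards have to share one connection; with a pool, that means checking out a client for the duration of the test.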

5. Contract Tests

5.1 API Contract (OpenAPI)

Schema derived from NestJS decorators.
Validated against OpenAPI 3.1 spec via openapi-examples-validator.
Consumer-driven contracts with frontend and mobile clients via Pact.

5.2 Event Contract

JSON Schema per event type in event-schemas/.
Producer: every event published is validated pre-emit.
Consumer: every event consumed is validated pre-handle.
Contract tests verify delivery's produced events match the schemas consumed by progress-service, analytics-service, sync-service.
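
The producer-side rule (validate before emit) can be sketched as a thin wrapper around the publish call. This is a simplified illustration: the validator map stands in for compiled JSON Schema validators loaded from event-schemas/, and the payload shape and `publishValidated` name are hypothetical.

```typescript
interface EventEnvelope {
  type: string;
  payload: unknown;
}
type PayloadValidator = (payload: unknown) => boolean;

// Hand-written stand-in for a compiled JSON Schema validator. The payload shape is hypothetical.
const validators = new Map<string, PayloadValidator>([
  [
    'delivery.play_session.started.v1',
    (payload) =>
      typeof payload === 'object' &&
      payload !== null &&
      typeof (payload as { sessionId?: unknown }).sessionId === 'string',
  ],
]);

// Validates before emitting: an unknown type or a schema mismatch never reaches the broker.
function publishValidated(
  event: EventEnvelope,
  publish: (event: EventEnvelope) => void,
): void {
  const validate = validators.get(event.type);
  if (validate === undefined) {
    throw new Error(`no schema registered for event type ${event.type}`);
  }
  if (!validate(event.payload)) {
    throw new Error(`payload does not match schema for ${event.type}`);
  }
  publish(event);
}
```

The consumer side can mirror this by running the same lookup and check before handing the event to its handler.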

5.3 Provider Tests

Delivery is a provider for:

Web/mobile clients (REST + SSE)
Internal services (via NATS events)

Provider tests run in CI whenever schemas change.

6. E2E Tests (Playwright + API)

Critical journeys:

Journey	Description
`player-e2e-01-basic-playback`	Learner starts course, navigates through lessons, completes. Verifies correct events emitted to progress-service.
`player-e2e-02-tutor-flow`	Learner starts session, asks AI tutor 3 questions, rates responses. Verifies tutor turns persisted.
`player-e2e-03-offline-mount-online-sync`	Mount bundle offline, simulate offline navigation, reconnect, verify sync-service reconciles.
`player-e2e-04-tamper-response`	Tamper with local bundle, verify force-unmount propagates.
`player-e2e-05-scorm-runtime`	SCORM 2004 course completion flow (S4 slice).
`player-e2e-06-branching`	Branching scenario with multiple paths; verify cursor tracks correctly.

The E2E suite runs nightly on staging and on every merge to main. Target runtime: < 20 min.

7. Offline Testing

Delivery is offline-critical. Dedicated test matrix:

Scenario	Test
Start session offline	Client-driven test with network disabled in Playwright
Navigate offline	Verify local state correctly reflects server-side session after reconnect
Tutor turn offline	Verify local AI model produces response; verify `aiProvenance.local = true`
Bundle tamper during offline	Simulate corrupted bundle; verify runtime rejects
Clock skew	Device clock drifted by 30 min; verify session still syncs
Multi-device offline	Device A and B offline with same session; reconnect; verify conflict resolution
Storage full	Verify graceful handling when local storage exhausted

8. AI Testing

8.1 Prompt Regression

A golden set of 50 tutor prompts with expected quality bands, run against the current model weekly and scored on:

Relevance score (on-topic)
Accuracy (facts from lesson)
Safety (no toxicity, no PII in output)

8.2 Safety

Prompt injection test set (100+ adversarial prompts)
Curriculum drift test set (off-topic queries)
Expected: ai-gateway blocks the prompt and delivery returns a fallback response

8.3 Local Model Quality

Comparison between cloud and local model on golden set. Acceptable quality degradation: local must achieve ≥ 70% of cloud quality.
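
The 70% rule can be read as a ratio of aggregate scores. The sketch below assumes quality is aggregated as the mean of per-prompt scores over the golden set; the document does not specify the aggregation, so both the mean and the function names are assumptions.

```typescript
// Mean of per-prompt quality scores (assumed aggregation; the scoring scale is arbitrary).
function meanScore(scores: number[]): number {
  if (scores.length === 0) throw new Error('golden set must not be empty');
  return scores.reduce((sum, score) => sum + score, 0) / scores.length;
}

// True when the local model reaches at least minRatio of the cloud model's mean score.
function localModelPasses(cloudScores: number[], localScores: number[], minRatio = 0.7): boolean {
  const cloudMean = meanScore(cloudScores);
  if (cloudMean <= 0) throw new Error('cloud baseline must be positive');
  return meanScore(localScores) / cloudMean >= minRatio;
}

// Cloud mean 0.9, local mean 0.65: ratio is about 0.72, which clears the 0.7 bar.
console.log(localModelPasses([0.9, 0.8, 1.0], [0.7, 0.6, 0.65]));
// Cloud mean 0.9, local mean 0.6: ratio is about 0.67, which does not.
console.log(localModelPasses([0.9, 0.8, 1.0], [0.5, 0.6, 0.7]));
```

A mean can hide a few very poor answers, so a per-prompt floor may be worth adding alongside the ratio if that matters for the tutor experience.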

9. Load & Performance

9.1 Load Profile

Scenario	Target
Concurrent active sessions	100,000 / tenant
Navigation events	10,000 / sec platform-wide
Tutor turns	1,000 / sec platform-wide
Session starts	500 / sec platform-wide

9.2 Tooling

k6 for HTTP load
nats-bench for NATS throughput

9.3 SLOs Under Load

Navigation p95 < 300ms at peak
Tutor turn TTFT p95 < 1.5s at peak
Zero data loss under 10% packet loss
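
For reference, a p95 check like the navigation SLO above can be computed from raw latency samples as follows. This is a sketch using the nearest-rank definition of a percentile; load tools may interpolate differently, and the function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// SLO from the list above: navigation p95 must stay under 300 ms.
function meetsNavigationSlo(latenciesMs: number[]): boolean {
  return percentile(latenciesMs, 95) < 300;
}

// 20 samples: one slow outlier sits above the 95th percentile, two outliers pull p95 up.
const oneOutlier = [...Array(19).fill(100), 5000];
const twoOutliers = [...Array(18).fill(100), 5000, 5000];
console.log(meetsNavigationSlo(oneOutlier), meetsNavigationSlo(twoOutliers));
```

The two sample sets show why p95 is used instead of the maximum: a single slow request out of twenty does not break the SLO, but a second one does.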

10. Chaos & Resilience

Experiment	Frequency
Kill random pod	Weekly
Database failover	Monthly
Redis failover	Monthly
NATS partition	Monthly
AI gateway down	Weekly (should gracefully degrade)
Content service down	Weekly (manifest cache should serve)

All chaos experiments run in staging. Production chaos runs are feature-flagged and opt-in per tenant.

11. Security Testing

SAST: Semgrep on every PR (no HIGH findings)
DAST: OWASP ZAP against staging nightly
Dependency scan: Snyk + npm audit on every PR
Two-tenant simulator: Integration test asserting RLS on every endpoint
JWT fuzzing: Weekly
License envelope fuzzing: Weekly
Pen test: Annual + after major releases

12. Replay & Rebuild

Per 04 Event-Driven §14, delivery must support rebuilding read models from events:

Test	Purpose
`rebuild-session-state-from-events`	From a session's event history, reconstruct the same PlaySession final state
`rebuild-gate-status-from-events`	Rebuild gate_status projection from assessment events

Replay tests run weekly in staging.
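
A replay test of this kind usually folds the event history through a pure transition function and compares the result with the stored state. The sketch below illustrates that shape; the event names and the state fields are simplified, hypothetical stand-ins for the real delivery events and PlaySession state.

```typescript
// Simplified, hypothetical event and state shapes for illustration.
type ReplayEvent =
  | { type: 'started'; lessonId: string }
  | { type: 'navigated'; lessonId: string }
  | { type: 'paused' }
  | { type: 'resumed' }
  | { type: 'completed' };

interface ReplayedSession {
  status: 'not_started' | 'active' | 'paused' | 'completed';
  currentLessonId: string | null;
  lessonsVisited: string[];
}

const initialSession: ReplayedSession = {
  status: 'not_started',
  currentLessonId: null,
  lessonsVisited: [],
};

// Pure transition: the same state and event always yield the same next state.
function applyEvent(state: ReplayedSession, event: ReplayEvent): ReplayedSession {
  switch (event.type) {
    case 'started':
    case 'navigated':
      return {
        status: 'active',
        currentLessonId: event.lessonId,
        lessonsVisited: state.lessonsVisited.includes(event.lessonId)
          ? state.lessonsVisited
          : [...state.lessonsVisited, event.lessonId],
      };
    case 'paused':
      return { ...state, status: 'paused' };
    case 'resumed':
      return { ...state, status: 'active' };
    case 'completed':
      return { ...state, status: 'completed' };
  }
}

// Rebuilds the final state by folding the ordered history from the initial state.
function rebuildSessionState(events: ReplayEvent[]): ReplayedSession {
  return events.reduce(applyEvent, initialSession);
}
```

Because `applyEvent` has no side effects, replaying the same history twice must give identical results, which is the property the replay tests assert.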

13. CI/CD Quality Gates

All gates must pass before merge to main:

Gate	Tool
Lint	ESLint + Prettier
Type check	tsc --noEmit
Unit tests	vitest (coverage ≥ 85%)
Integration tests	vitest + Testcontainers
Contract tests	Pact
SAST	Semgrep
Dependency scan	Snyk
OpenAPI validation	openapi-examples-validator
Two-tenant simulator	custom harness
Build	docker build

A merge queue is enforced: PRs merge in order, and tests rerun against a fresh rebase before each merge.

14. Test Data Management

Factories: @test/factories/delivery — produces valid aggregates, events, DTOs
Seed data: db/seeds/delivery — loaded into staging DB for manual testing
PII-free: All test data uses synthetic names + @example.com emails
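
A factory that follows these rules might look like the sketch below. The `LearnerDto` shape and `makeLearner` are hypothetical examples, not the actual exports of @test/factories/delivery.

```typescript
// Hypothetical DTO shape for illustration.
interface LearnerDto {
  id: string;
  tenantId: string;
  displayName: string;
  email: string;
}

let learnerSequence = 0;

// Produces a valid learner with synthetic, PII-free defaults; overrides win over defaults.
function makeLearner(overrides: Partial<LearnerDto> = {}): LearnerDto {
  learnerSequence += 1;
  return {
    id: `learner-${learnerSequence}`,
    tenantId: 'tenant-1',
    displayName: `Test Learner ${learnerSequence}`,
    email: `learner${learnerSequence}@example.com`, // example.com is reserved for documentation and testing
    ...overrides,
  };
}

const defaultLearner = makeLearner();
const otherTenantLearner = makeLearner({ tenantId: 'tenant-2' });
console.log(defaultLearner.email, otherTenantLearner.tenantId);
```

The sequence counter keeps IDs and emails unique within a test run without pulling in random data, which keeps failures reproducible.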
