
Testing

:::info Source
Sourced from services/assignment-service/TESTING_STRATEGY.md in the documentation repo.
:::

Companion: 16 Testing Strategy & QA


1. Test Pyramid

┌────────────┐
│ E2E │ ~30 scenarios, 10%
├────────────┤
│ Contract │ Event + HTTP, 15%
├────────────┤
│ Integration│ DB + NATS + outbox, 25%
├────────────┤
│ Unit │ Pure domain, 50%
└────────────┘

Coverage gate: line 80%, branch 75% — fails CI below threshold.

2. Unit Tests

Scope: domain logic, value objects, RRULE engine, state machines, target resolver, escalation evaluator, reminder planner.

| Module | Key cases |
| --- | --- |
| RRULEEngine | YEARLY/MONTHLY/WEEKLY frequencies, BYMONTHDAY, BYDAY, EXDATE, TZ (including DST transitions), COUNT cap, UNTIL cap |
| AssignmentAggregate | All state transitions + invalid transitions; invariant enforcement; draft vs active immutable fields |
| ComplianceWindowAggregate | State transitions including overdue→completed (late flag), overdue→closed_missed |
| EscalationEvaluator | Steps fire in order; idempotency via (windowId, level) |
| ReminderPlanner | relative_to_due, on_due, relative_to_overdue; suppression-when-in-progress |
| TargetResolver | user, org_unit+descendants, dynamic_group; empty set; de-duplication |
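The TargetResolver cases (single user, org unit with descendants, dynamic group, empty set, de-duplication) can be pictured with a small sketch. Everything here is hypothetical: the `Target` union, the in-memory `OrgDirectory` shape, and `resolveTargets` are illustrative names, not the service's actual API, and the org hierarchy is assumed to be a tree without cycles.

```typescript
// Hypothetical target shapes, mirroring the three target kinds named above.
type Target =
  | { kind: "user"; userId: string }
  | { kind: "org_unit"; unitId: string }
  | { kind: "dynamic_group"; groupId: string };

// Hypothetical in-memory directory; a real resolver would query the tenant service.
interface OrgDirectory {
  children: Record<string, string[]>; // org unit -> direct child units
  members: Record<string, string[]>;  // org unit -> direct member user IDs
  groups: Record<string, string[]>;   // dynamic group -> current member user IDs
}

// Collects a unit and all of its descendants (assumes the hierarchy has no cycles).
function unitWithDescendants(dir: OrgDirectory, unitId: string): string[] {
  const units: string[] = [];
  const stack: string[] = [unitId];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    units.push(current);
    for (const child of dir.children[current] ?? []) stack.push(child);
  }
  return units;
}

// Resolves targets to a sorted list of unique user IDs.
function resolveTargets(dir: OrgDirectory, targets: Target[]): string[] {
  const seen = new Set<string>();
  const addAll = (ids: string[]): void => ids.forEach((id) => seen.add(id));
  for (const target of targets) {
    if (target.kind === "user") {
      addAll([target.userId]);
    } else if (target.kind === "org_unit") {
      for (const unit of unitWithDescendants(dir, target.unitId)) {
        addAll(dir.members[unit] ?? []);
      }
    } else {
      addAll(dir.groups[target.groupId] ?? []);
    }
  }
  return Array.from(seen).sort();
}
```

A unit test built on this shape would assert that a user reachable through both an org unit and a dynamic group appears once, and that an empty target list resolves to an empty set.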

Framework: Vitest + fast-check for property-based tests on RRULE and state-machine transitions.
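As a rough illustration of a property-based check on state-machine transitions, the sketch below models a compliance window and replays random event sequences against it. The `open` state, the event names, and the transition table are assumptions made for the example; only `in_progress`, `overdue`, `completed`, `closed_missed`, and the late flag come from this document. To stay dependency-free it uses a small hand-rolled generator in place of fast-check.

```typescript
type WindowState = "open" | "in_progress" | "overdue" | "completed" | "closed_missed";
type WindowEvent = "enrolled" | "progress_completed" | "due_passed" | "grace_expired";

interface WindowSnapshot {
  state: WindowState;
  late: boolean;
}

// Hypothetical transition table; anything not listed is an invalid transition.
const transitions: Record<WindowState, Partial<Record<WindowEvent, WindowState>>> = {
  open: { enrolled: "in_progress", due_passed: "overdue" },
  in_progress: { progress_completed: "completed", due_passed: "overdue" },
  overdue: { progress_completed: "completed", grace_expired: "closed_missed" },
  completed: {},
  closed_missed: {},
};

function applyEvent(w: WindowSnapshot, e: WindowEvent): WindowSnapshot {
  const next = transitions[w.state][e];
  if (next === undefined) throw new Error(`invalid transition: ${w.state} + ${e}`);
  // The late flag is set only when an overdue window is completed.
  return { state: next, late: w.late || (w.state === "overdue" && next === "completed") };
}

const windowEvents: WindowEvent[] = ["enrolled", "progress_completed", "due_passed", "grace_expired"];
const isTerminal = (s: WindowState): boolean => s === "completed" || s === "closed_missed";

// Tiny seeded generator (linear congruential) so runs are reproducible.
let lcgState = 42;
function nextInt(bound: number): number {
  lcgState = (lcgState * 1664525 + 1013904223) % 4294967296;
  return Math.floor((lcgState / 4294967296) * bound);
}

// Replays random event sequences and counts invariant violations.
function checkRandomSequences(runs: number): number {
  let violations = 0;
  for (let run = 0; run < runs; run++) {
    let w: WindowSnapshot = { state: "open", late: false };
    const length = nextInt(12);
    for (let i = 0; i < length; i++) {
      const before = w;
      try {
        w = applyEvent(w, windowEvents[nextInt(windowEvents.length)]);
      } catch {
        // Invalid transition rejected; the snapshot is left unchanged.
      }
      if (isTerminal(before.state) && w !== before) violations++; // terminal states never move
      if (w.late && w.state !== "completed") violations++;        // late implies completed
    }
  }
  return violations;
}
```

With fast-check, the hand-rolled generator would likely be replaced by something like `fc.array(fc.constantFrom(...windowEvents))` inside `fc.assert(fc.property(...))`, which also shrinks a failing sequence to a minimal counterexample.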

3. Integration Tests

Run against a real Postgres (Testcontainers) and an ephemeral NATS JetStream server.

| Scenario | Verifies |
| --- | --- |
| Create→activate→materialize | Rows + outbox + schema |
| Enrollment event → in_progress | Consumer handler + idempotency |
| Progress completion → completed | Same as above |
| Overdue sweeper | Partial-index scan; produces correct events |
| RLS isolation | Session A cannot read B's tenant |
| Migration up | All migrations apply cleanly |
| Partition add for new tenant | Provisioner creates partition + RLS policy |

Testcontainers boot in under 8 s on CI (cached image).
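The "consumer handler + idempotency" scenarios come down to delivering the same event twice and asserting a single state change, since JetStream delivery is at-least-once. The sketch below shows that assertion pattern against an in-memory stand-in; the `EnrollmentCreated` envelope, its `eventId` field, and the `EnrollmentConsumer` class are hypothetical, and a real integration test would run the same assertions against Postgres and NATS.

```typescript
// Hypothetical event envelope; `eventId` is assumed to be unique per published event.
interface EnrollmentCreated {
  eventId: string;
  windowId: string;
}

type WindowStatus = "open" | "in_progress";

class EnrollmentConsumer {
  // In a real service this dedupe record would presumably live in the database.
  private readonly processedEventIds = new Set<string>();
  readonly windows = new Map<string, WindowStatus>();
  transitionCount = 0;

  // Returns true when the event was applied, false when it was a redelivery.
  handle(event: EnrollmentCreated): boolean {
    if (this.processedEventIds.has(event.eventId)) return false;
    this.processedEventIds.add(event.eventId);
    if (this.windows.get(event.windowId) === "open") {
      this.windows.set(event.windowId, "in_progress");
      this.transitionCount += 1;
    }
    return true;
  }
}

const consumer = new EnrollmentConsumer();
consumer.windows.set("win-1", "open");
const delivery: EnrollmentCreated = { eventId: "evt-1", windowId: "win-1" };
const appliedFirst = consumer.handle(delivery); // first delivery is applied
const appliedAgain = consumer.handle(delivery); // redelivery is ignored
```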

4. Contract Tests

4.1 Event Schema Contracts

  • Every event emitted in a test is validated against @ghasi/event-schemas/assignment/*.v1.json.
  • Provider-side Pact: assignment-service publishes for enrollment-service, notification-service, analytics-service. CI fails if we break an event consumer's expectation without a coordinated version bump.

4.2 HTTP OpenAPI Contracts

  • dredd / prism runs every OpenAPI example through a live service instance.
  • Schema drift CI gate: generated OpenAPI from code ≡ committed spec.
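One way to picture the drift gate is a structural comparison of the generated and committed specs that ignores object key order. The sketch below is an assumption about how such a check could work, not the actual CI job; `canonicalize` and `hasDrift` are hypothetical helpers, and both inputs are assumed to be already-parsed JSON documents.

```typescript
// Serializes a parsed JSON value with object keys sorted, so that key order
// alone never counts as drift. Array order is preserved because it is significant.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalize(obj[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// True when the spec generated from code differs from the committed spec.
function hasDrift(generated: unknown, committed: unknown): boolean {
  return canonicalize(generated) !== canonicalize(committed);
}
```

A CI step built on this would exit non-zero whenever `hasDrift` returns true, forcing the committed spec to be regenerated in the same PR as the code change.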

5. E2E Tests

End-to-end through the platform using the shared harness (@ghasi/e2e):

| Flow | Services involved |
| --- | --- |
| Admin creates recurring assignment, learner completes | assignment, tenant, enrollment, progress, notification |
| Admin creates one-shot assignment, learner misses → closed_missed | assignment, enrollment, progress, notification, analytics |
| AI suggests assignment, admin accepts, activation succeeds | assignment, ai-gateway, tenant, catalog |
| Tenant dynamic group changes mid-flight → rebind | tenant, assignment |
| GDPR erasure for learner with active windows | identity, assignment |

E2E uses Playwright for UI flows + direct HTTP for admin and saga inspection.

6. Load / Performance

Tool: k6. Scenarios:

| Scenario | Target |
| --- | --- |
| Create-activate cycle | 100 rps, p95 < 400 ms |
| List windows (100k rows) | 200 rps, p95 < 200 ms |
| Compliance report, 100k windows | p95 < 1.5 s |
| Materializer stress: 10k users × 100 occurrences | finishes < 60 s, memory < 1 GB |
| Overdue sweep on 5M windows | completes within 15 min |

Runs nightly on staging against a synthetic tenant.
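To make the p95 targets concrete, the sketch below computes a nearest-rank 95th percentile over latency samples and compares it with a budget. In k6 itself this is normally expressed declaratively as a threshold such as `http_req_duration: ['p(95)<400']`; the helper names here are hypothetical, and the nearest-rank method is an assumption that may differ slightly from how k6 interpolates percentiles.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// True when the 95th percentile is strictly below the budget, matching "p95 < N ms".
function meetsP95Budget(samplesMs: number[], budgetMs: number): boolean {
  return percentile(samplesMs, 95) < budgetMs;
}

const sampleLatenciesMs = [120, 95, 310, 150, 180, 220, 140, 405, 130, 160];
// With only 10 samples the nearest-rank p95 is the maximum (405 ms), so this is false.
const withinBudget = meetsP95Budget(sampleLatenciesMs, 400);
```

The example also shows why short runs are a poor basis for a p95 gate: with few samples, a single outlier decides the result.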

7. Chaos / Failure Injection

Using LitmusChaos or Chaos Mesh:

  • Kill one pod during materialization — expect resume from cursor, zero duplicate windows.
  • Partition NATS — expect outbox to buffer; no data loss on reconnect.
  • Force Postgres failover — expect reads degrade, writes retry; SLO breach alert fires.
  • Inject 500ms DB latency — p95 budget breach tested; no runaway timeouts.
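The pod-kill expectation (resume from cursor, zero duplicate windows) can be sketched with a simulated crash. The `MaterializerStore` shape and the `crashAfter` parameter are hypothetical test scaffolding; the sketch assumes the window insert and the cursor advance are committed atomically, which is what makes the resume safe.

```typescript
// Hypothetical persisted state: survives the simulated pod kill.
interface MaterializerStore {
  cursor: number;    // index of the next occurrence to materialize
  windows: string[]; // keys of windows created so far
}

// Materializes occurrences from the persisted cursor onward.
// `crashAfter` simulates a pod kill after that many occurrences in this run.
function materialize(store: MaterializerStore, occurrences: string[], crashAfter?: number): void {
  let processed = 0;
  while (store.cursor < occurrences.length) {
    if (crashAfter !== undefined && processed === crashAfter) {
      throw new Error("pod killed");
    }
    // Assumed atomic: in a real service both writes would share one transaction.
    store.windows.push(occurrences[store.cursor]);
    store.cursor += 1;
    processed += 1;
  }
}

const occurrenceKeys = ["occ-1", "occ-2", "occ-3", "occ-4", "occ-5", "occ-6"];
const store: MaterializerStore = { cursor: 0, windows: [] };
let crashed = false;
try {
  materialize(store, occurrenceKeys, 4); // killed after four occurrences
} catch {
  crashed = true;
}
materialize(store, occurrenceKeys); // restart resumes from the persisted cursor
```

If the insert and the cursor update were not atomic, a kill between the two would replay one occurrence on restart, which is the duplicate-window failure the chaos experiment is meant to catch.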

8. Security Tests

  • Semgrep rules for RLS usage (every query must run with tenant context).
  • Burp Suite / ZAP against staging every release.
  • Fuzzing: the RRULE parser is fuzzed with a go-fuzz-style harness.
  • Tenant-leak red-team test in E2E: forge X-Tenant-Id and expect 403.
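The tenant-leak check can be reduced to a single rule: a tenant header that disagrees with the authenticated tenant is rejected. The sketch below assumes the trusted tenant comes from a verified access token and that header names arrive lower-cased; `TenantScopedRequest` and `tenantGuardStatus` are hypothetical names, not the service's middleware.

```typescript
// Hypothetical request shape: `tokenTenantId` stands for the tenant claim taken
// from the verified access token, assumed here to be the source of truth.
interface TenantScopedRequest {
  tokenTenantId: string;
  headers: Record<string, string | undefined>;
}

// Returns the HTTP status the tenant guard would produce for this request.
function tenantGuardStatus(req: TenantScopedRequest): number {
  const claimed = req.headers["x-tenant-id"];
  if (claimed !== undefined && claimed !== req.tokenTenantId) {
    return 403; // forged or mismatched tenant header
  }
  return 200;
}

const forgedStatus = tenantGuardStatus({
  tokenTenantId: "tenant-a",
  headers: { "x-tenant-id": "tenant-b" },
});
const honestStatus = tenantGuardStatus({
  tokenTenantId: "tenant-a",
  headers: { "x-tenant-id": "tenant-a" },
});
```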

9. AI Eval Harness

See AI_INTEGRATION §9:

  • 30+ labelled (context, expected proposal) pairs.
  • Runs in CI on changes to prompt or client.
  • Regression report attached to PR.

10. Offline / Sync Tests

Scripted flows in @ghasi/sync-harness:

  • Learner device offline for 5 days; window state changes on the server; on reconnect, reconciliation leaves the device with exactly the server state.
  • Admin device edits a draft offline; the server edits the same draft; on reconnect → last-write-wins (LWW) merge plus a conflicts list.
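A field-level reading of "LWW + conflicts list" might look like the sketch below: each field carries its own timestamp, the newer write wins, and any field whose values disagree is reported. The `FieldVersion` and `Draft` shapes, per-field timestamps, and the tie-break toward the server are all assumptions for illustration; the actual sync protocol may track conflicts differently, for example against a common base version.

```typescript
interface FieldVersion {
  value: string;
  updatedAt: number; // epoch milliseconds of the last edit to this field
}

type Draft = Record<string, FieldVersion>;

interface MergeResult {
  merged: Draft;
  conflicts: string[]; // fields where both sides held different values
}

// Last-write-wins per field; ties go to the server copy.
function mergeLww(local: Draft, server: Draft): MergeResult {
  const merged: Draft = {};
  const conflicts: string[] = [];
  const fields = Object.keys(local).concat(
    Object.keys(server).filter((key) => !(key in local)),
  );
  for (const field of fields) {
    const l: FieldVersion | undefined = local[field];
    const s: FieldVersion | undefined = server[field];
    if (l === undefined) {
      merged[field] = s as FieldVersion;
    } else if (s === undefined) {
      merged[field] = l;
    } else {
      if (l.value !== s.value) conflicts.push(field);
      merged[field] = l.updatedAt > s.updatedAt ? l : s;
    }
  }
  return { merged, conflicts: conflicts.sort() };
}
```

A sync-harness test would then assert both halves of the outcome: the merged draft matches the newest write per field, and the conflicts list names every field the admin should review.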

11. Mutation Testing

Stryker runs weekly on the domain package. Mutation score goal: ≥ 70% on packages/domain.

12. Test Data

  • Factories in test/fixtures/ producing realistic Assignments and Windows with deterministic ULIDs.
  • seed-tenant.ts script provisions a demo tenant with 10k users, 5 assignments, 100k windows for perf dashboards.
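Deterministic ULIDs in fixtures usually mean replacing both the clock and the random source with seeded ones. The sketch below is one possible factory under that assumption: `makeUlidFactory` is a hypothetical helper, the randomness comes from a simple linear congruential generator (not cryptographically secure, which is fine for fixtures), and the timestamp advances by 1 ms per ID so that IDs sort in creation order.

```typescript
// Crockford base32 alphabet used by ULIDs (no I, L, O, U).
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Returns a generator of ULID-shaped IDs that is fully determined by its inputs.
function makeUlidFactory(seed: number, startMs: number): () => string {
  let state = seed >>> 0;
  let nowMs = startMs;

  // Next base32 symbol index from a seeded linear congruential generator.
  const nextSymbol = (): number => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return Math.floor((state / 4294967296) * 32);
  };

  return (): string => {
    // 10 characters of timestamp, most significant digit first.
    let timePart = "";
    let t = nowMs;
    nowMs += 1;
    for (let i = 0; i < 10; i++) {
      timePart = CROCKFORD.charAt(t % 32) + timePart;
      t = Math.floor(t / 32);
    }
    // 16 characters of seeded pseudo-randomness.
    let randomPart = "";
    for (let i = 0; i < 16; i++) randomPart += CROCKFORD.charAt(nextSymbol());
    return timePart + randomPart;
  };
}

const nextId = makeUlidFactory(7, 1700000000000);
const firstId = nextId();
const secondId = nextId();
```

Because two factories created with the same seed and start time emit the same sequence, snapshot-style assertions on fixture IDs stay stable across runs.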

13. Branches / PR Gates

| Gate | Required |
| --- | --- |
| Unit + integration pass | Yes |
| Coverage ≥ 80% line | Yes |
| OpenAPI drift check | Yes |
| Event schema compat check | Yes |
| Security lint (semgrep) | Yes |
| k6 smoke (5 min) | Yes |
| Pact provider verification | Yes |
| E2E happy path | On release-candidate branch |

14. Staging Soak

Every release spends 48 h in staging with full synthetic traffic before production.

15. Rollback Verification

Automated: every deploy runs a rollback-preview job that executes the prior image against the new DB schema and confirms compatibility (only backward-compatible schema changes are allowed).