Testing
:::info Source
Sourced from services/assignment-service/TESTING_STRATEGY.md in the documentation repo.
:::
Companion: 16 Testing Strategy & QA
1. Test Pyramid
┌────────────┐
│ E2E │ ~30 scenarios, 10%
├────────────┤
│ Contract │ Event + HTTP, 15%
├────────────┤
│ Integration│ DB + NATS + outbox, 25%
├────────────┤
│ Unit │ Pure domain, 50%
└────────────┘
Coverage gate: line 80%, branch 75% — fails CI below threshold.
2. Unit Tests
Scope: domain logic, value objects, RRULE engine, state machines, target resolver, escalation evaluator, reminder planner.
| Module | Key cases |
|---|---|
| `RRULEEngine` | YEARLY/MONTHLY/WEEKLY frequencies, BYMONTHDAY, BYDAY, EXDATE, TZ (including DST transitions), COUNT cap, UNTIL cap |
| `AssignmentAggregate` | all state transitions + invalid transitions; invariant enforcement; draft vs active immutable fields |
| `ComplianceWindowAggregate` | state transitions including overdue→completed (late flag), overdue→closed_missed |
| `EscalationEvaluator` | steps fire in order; idempotency via (windowId, level) |
| `ReminderPlanner` | relative_to_due, on_due, relative_to_overdue; suppression-when-in-progress |
| `TargetResolver` | user, org_unit+descendants, dynamic_group; empty set; de-duplication |
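
The `EscalationEvaluator` cases (ordered firing, idempotency via `(windowId, level)`) can be pictured with a minimal sketch. The `EscalationStep` shape, the `afterHours` field, and the in-memory `fired` set are assumptions made for illustration, not the service's actual types; the sketch also assumes thresholds do not decrease as the level rises.

```typescript
// Hypothetical step shape: a step is due once the window has been overdue for `afterHours`.
interface EscalationStep {
  level: number;
  afterHours: number;
}

class EscalationEvaluator {
  // Idempotency keys of the form "windowId:level". Kept in memory for the sketch;
  // a real service would presumably persist them.
  private readonly fired = new Set<string>();
  private readonly steps: EscalationStep[];

  constructor(steps: EscalationStep[]) {
    // Sort by level so steps are always considered in ascending order.
    this.steps = [...steps].sort((a, b) => a.level - b.level);
  }

  /** Returns the levels that newly fire for this window, in ascending order. */
  evaluate(windowId: string, hoursOverdue: number): number[] {
    const newlyFired: number[] = [];
    for (const step of this.steps) {
      if (hoursOverdue < step.afterHours) break; // this and all later steps are not due yet
      const key = `${windowId}:${step.level}`;
      if (this.fired.has(key)) continue; // already fired once, do not repeat
      this.fired.add(key);
      newlyFired.push(step.level);
    }
    return newlyFired;
  }
}

const evaluator = new EscalationEvaluator([
  { level: 2, afterHours: 72 },
  { level: 1, afterHours: 24 },
]);
const firstPass = evaluator.evaluate("window-1", 30); // [1]
const replayPass = evaluator.evaluate("window-1", 30); // [] (same input replayed)
const laterPass = evaluator.evaluate("window-1", 80); // [2]
```

A unit test would assert on exactly these three results: the first pass fires level 1, the replay fires nothing, and the later pass fires only level 2.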
Framework: Vitest + fast-check for property-based tests on RRULE and state-machine transitions.
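
As a rough illustration of a state-machine property, the sketch below applies random transition attempts and checks two invariants: terminal states never change, and the late flag only appears on completed windows. The state names beyond those mentioned above (`open`), the transition table, and the `late` field are assumptions. A hand-rolled seeded loop stands in for fast-check here so the example stays self-contained; fast-check would add input generation and shrinking.

```typescript
type WindowState = "open" | "in_progress" | "overdue" | "completed" | "closed_missed";

// Assumed transition table; the real aggregate may allow a different set.
const ALLOWED: Record<WindowState, readonly WindowState[]> = {
  open: ["in_progress", "overdue"],
  in_progress: ["completed", "overdue"],
  overdue: ["completed", "closed_missed"],
  completed: [],
  closed_missed: [],
};

interface ComplianceWindow {
  state: WindowState;
  late: boolean;
}

function transition(w: ComplianceWindow, to: WindowState): ComplianceWindow {
  if (!ALLOWED[w.state].includes(to)) {
    throw new Error(`invalid transition ${w.state} -> ${to}`);
  }
  // The late flag is set only when an overdue window is completed.
  return { state: to, late: w.state === "overdue" && to === "completed" };
}

// Small seeded PRNG (mulberry32-style) so every run is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const ALL_STATES = Object.keys(ALLOWED) as WindowState[];

/** Runs random transition sequences; throws on an invariant violation, returns applied count. */
function checkProperties(runs: number, seed: number): number {
  const rand = mulberry32(seed);
  let applied = 0;
  for (let i = 0; i < runs; i++) {
    let w: ComplianceWindow = { state: "open", late: false };
    for (let step = 0; step < 6; step++) {
      const to = ALL_STATES[Math.floor(rand() * ALL_STATES.length)];
      const before = w;
      try {
        w = transition(w, to);
        applied++;
      } catch {
        // Rejected attempt: `w` keeps its previous value.
      }
      if (ALLOWED[before.state].length === 0 && w !== before) {
        throw new Error("terminal state changed");
      }
      if (w.late && w.state !== "completed") {
        throw new Error("late flag set outside completed");
      }
    }
  }
  return applied;
}
```

Calling `checkProperties(500, 42)` exercises 3,000 transition attempts deterministically; changing the seed explores a different set of sequences.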
3. Integration Tests
Run against a real Postgres instance (Testcontainers) and an ephemeral NATS JetStream server.
| Scenario | Verifies |
|---|---|
| Create→activate→materialize | Rows + outbox + schema |
| Enrollment event → in_progress | Consumer handler + idempotency |
| Progress completion → completed | Same |
| Overdue sweeper | Partial-index scan; produces correct events |
| RLS isolation | Session A cannot read B's tenant |
| Migration up | All migrations apply cleanly |
| Partition add for new tenant | Provisioner creates partition + RLS policy |
Testcontainers boot in under 8 s on CI (cached image).
4. Contract Tests
4.1 Event Schema Contracts
- Every event emitted in a test is validated against `@ghasi/event-schemas/assignment/*.v1.json`.
- Provider-side Pact: assignment-service publishes for `enrollment-service`, `notification-service`, `analytics-service`. CI fails if we break an event consumer's expectation without a coordinated version bump.
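
To make "breaking a consumer's expectation" concrete, here is a minimal sketch of a backward-compatibility check between two versions of an event schema. The simplified `EventSchema` shape, the compatibility rules (no removed fields, no type changes, no required field becoming optional), and the field names are assumptions for illustration; they are not the rules Pact or the actual compat check applies.

```typescript
// Simplified stand-in for a JSON Schema document (assumption).
interface EventSchema {
  required: string[];
  properties: Record<string, { type: string }>;
}

/** Lists changes in `next` that could break a consumer written against `prev`. */
function findBreakingChanges(prev: EventSchema, next: EventSchema): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(prev.properties)) {
    const before = prev.properties[field];
    const after = next.properties[field];
    if (after === undefined) {
      problems.push(`removed field: ${field}`);
    } else if (after.type !== before.type) {
      problems.push(`type changed: ${field} (${before.type} -> ${after.type})`);
    }
  }
  // Consumers may rely on required fields always being present.
  for (const field of prev.required) {
    if (!next.required.includes(field)) {
      problems.push(`no longer required: ${field}`);
    }
  }
  return problems;
}

// Hypothetical v1 schema and two candidate changes.
const schemaV1: EventSchema = {
  required: ["windowId", "dueAt"],
  properties: { windowId: { type: "string" }, dueAt: { type: "string" } },
};
const additiveChange: EventSchema = {
  required: ["windowId", "dueAt"],
  properties: { ...schemaV1.properties, lateFlag: { type: "boolean" } },
};
const breakingChange: EventSchema = {
  required: ["dueAt"],
  properties: { dueAt: { type: "number" } },
};
```

Under these assumed rules, `additiveChange` yields no problems, while `breakingChange` yields three and would need a coordinated version bump.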
4.2 HTTP OpenAPI Contracts
- `dredd`/`prism` runs every OpenAPI example through a live service instance.
- Schema drift CI gate: generated OpenAPI from code ≡ committed spec.
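
One way to picture the drift gate is a comparison of the two specs after canonicalization, so that key ordering alone never counts as drift. This is a sketch under that assumption; how the real gate compares specs is not stated here, and the function names are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted; array order is preserved because it is significant.
function canonicalize(value: Json): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

/** True when the spec generated from code differs from the committed spec. */
function hasDrift(generated: Json, committed: Json): boolean {
  return canonicalize(generated) !== canonicalize(committed);
}

const sameSpecReordered = hasDrift(
  { openapi: "3.0.3", paths: { "/windows": { get: {} } } },
  { paths: { "/windows": { get: {} } }, openapi: "3.0.3" },
); // false

const changedSpec = hasDrift(
  { openapi: "3.0.3", paths: { "/windows": { get: {} } } },
  { openapi: "3.0.3", paths: { "/windows": { get: {}, post: {} } } },
); // true
```

A CI job built on this idea would exit non-zero when `hasDrift` returns `true`.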
5. E2E Tests
End-to-end through the platform using shared harness (@ghasi/e2e):
| Flow | Services involved |
|---|---|
| Admin creates recurring assignment, learner completes | assignment, tenant, enrollment, progress, notification |
| Admin creates one-shot assignment, learner misses → closed_missed | assignment, enrollment, progress, notification, analytics |
| AI suggests assignment, admin accepts, activation succeeds | assignment, ai-gateway, tenant, catalog |
| Tenant dynamic group changes mid-flight → rebind | tenant, assignment |
| GDPR erasure for learner with active windows | identity, assignment |
E2E uses Playwright for UI flows + direct HTTP for admin and saga inspection.
6. Load / Performance
Tool: k6. Scenarios:
| Scenario | Target |
|---|---|
| Create-activate cycle | 100 rps, p95 < 400 ms |
| List windows (100k rows) | 200 rps, p95 < 200 ms |
| Compliance report 100k windows | p95 < 1.5 s |
| Materializer stress: 10k users × 100 occurrences | finishes < 60 s, memory < 1 GB |
| Overdue sweep on 5M windows | completes within 15 min |
Runs nightly on staging against a synthetic tenant.
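
For readers unfamiliar with how a target such as "p95 < 400 ms" is evaluated, the following sketch computes a percentile with the nearest-rank method and compares it with a budget. k6 computes percentiles itself and may use a different interpolation; the function names here are hypothetical.

```typescript
/** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function meetsP95Budget(samplesMs: number[], budgetMs: number): boolean {
  return percentile(samplesMs, 95) < budgetMs;
}

// 20 requests: one slow outlier still passes a 400 ms budget, two do not.
const oneOutlier = [...Array(19).fill(100), 900];
const twoOutliers = [...Array(18).fill(100), 900, 900];
const passes = meetsP95Budget(oneOutlier, 400); // true
const fails = meetsP95Budget(twoOutliers, 400); // false
```

The example shows why p95 targets tolerate rare outliers but fail once slow requests exceed 5% of the sample.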
7. Chaos / Failure Injection
Using LitmusChaos or Chaos Mesh:
- Kill one pod during materialization — expect resume from cursor, zero duplicate windows.
- Partition NATS — expect outbox to buffer; no data loss on reconnect.
- Force Postgres failover — expect reads degrade, writes retry; SLO breach alert fires.
- Inject 500ms DB latency — p95 budget breach tested; no runaway timeouts.
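
The first expectation above (resume from cursor, zero duplicate windows) relies on two properties: the cursor advances only after a user's windows are written, and window creation is keyed so a replay cannot insert twice. The sketch below simulates a pod kill in memory; the `MaterializerStore` shape and the `userId:occurrence` key are assumptions, standing in for a persisted cursor and a unique constraint.

```typescript
// In-memory stand-in for persisted state (assumption).
interface MaterializerStore {
  cursor: number; // index of the next user to process
  windows: Set<string>; // natural keys "userId:occurrence", acting like a unique constraint
}

function materialize(
  store: MaterializerStore,
  userIds: string[],
  occurrences: string[],
  killAfterWrites: number = Infinity, // fault injection for the test only
): void {
  let writes = 0;
  while (store.cursor < userIds.length) {
    const userId = userIds[store.cursor];
    for (const occurrence of occurrences) {
      if (writes >= killAfterWrites) throw new Error("simulated pod kill");
      // Adding an existing key is a no-op, so replaying a half-finished user is safe.
      store.windows.add(`${userId}:${occurrence}`);
      writes += 1;
    }
    store.cursor += 1; // advance only after all of this user's windows exist
  }
}

const store: MaterializerStore = { cursor: 0, windows: new Set<string>() };
const users = ["u1", "u2", "u3"];
const occurrences = ["2025-01", "2025-02"];

let killed = false;
try {
  materialize(store, users, occurrences, 3); // dies halfway through u2
} catch {
  killed = true;
}
const cursorAfterKill = store.cursor; // 1: u2 was not marked done
materialize(store, users, occurrences); // resumes at u2 and finishes
```

After the resume, the store holds six windows (three users times two occurrences) even though the first `u2` window was attempted twice.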
8. Security Tests
- Semgrep rules for RLS usage (every query must run with tenant context).
- Burp Suite / ZAP against staging every release.
- Fuzzing: RRULE parser fuzzed via `go-fuzz`-style harness on rrule inputs.
- Tenant-leak red-team test in E2E: forge `X-Tenant-Id` and expect 403.
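
The tenant-leak test boils down to one rule: a tenant claimed in a header must never override the tenant bound to the authenticated caller. A minimal sketch of such a guard follows, assuming the verified token carries a tenant id; the `RequestContext` shape and `authorizeTenant` name are hypothetical.

```typescript
// Hypothetical request context: tenant from the verified token plus raw headers.
interface RequestContext {
  tokenTenantId: string;
  headers: Record<string, string | undefined>; // header names lower-cased
}

/** Returns the HTTP status the guard would produce. */
function authorizeTenant(ctx: RequestContext): 200 | 403 {
  const claimed = ctx.headers["x-tenant-id"];
  // A header that disagrees with the token's tenant is treated as a forgery.
  if (claimed !== undefined && claimed !== ctx.tokenTenantId) return 403;
  return 200;
}

const forged = authorizeTenant({
  tokenTenantId: "tenant-a",
  headers: { "x-tenant-id": "tenant-b" },
}); // 403

const honest = authorizeTenant({
  tokenTenantId: "tenant-a",
  headers: { "x-tenant-id": "tenant-a" },
}); // 200
```

The E2E version of this check sends the forged header over real HTTP and asserts on the 403 response.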
9. AI Eval Harness
See AI_INTEGRATION §9:
- 30+ labelled `(context, expected proposal)` pairs.
- Runs in CI on changes to prompt or client.
- Regression report attached to PR.
10. Offline / Sync Tests
Scripted flows in @ghasi/sync-harness:
- Learner device offline for 5 days; window states change on the server; reconcile on reconnect → the device converges to exactly the server state.
- Admin device edits draft offline; server edits same draft; reconnect → LWW + conflicts list.
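
The draft scenario (LWW plus a conflicts list) can be sketched as a per-field merge: the later write wins, and any field changed on both sides to different values is also reported. The `FieldVersion` shape, per-field timestamps, and the server-wins tie-break are assumptions; the sketch also assumes all three copies share the same set of fields.

```typescript
interface FieldVersion {
  value: string;
  updatedAt: number; // epoch milliseconds
}
type Draft = Record<string, FieldVersion>;

interface MergeResult {
  merged: Draft;
  conflicts: string[]; // fields edited on both sides to different values
}

function mergeLww(base: Draft, local: Draft, remote: Draft): MergeResult {
  const merged: Draft = {};
  const conflicts: string[] = [];
  for (const field of Object.keys(remote)) {
    const b = base[field];
    const l = local[field];
    const r = remote[field];
    const localChanged = l.value !== b.value;
    const remoteChanged = r.value !== b.value;
    if (localChanged && remoteChanged && l.value !== r.value) {
      conflicts.push(field);
    }
    // Last writer wins; on a timestamp tie the server copy is kept.
    merged[field] = l.updatedAt > r.updatedAt ? l : r;
  }
  return { merged, conflicts };
}

const baseDraft: Draft = {
  title: { value: "Fire safety", updatedAt: 1 },
  dueDays: { value: "30", updatedAt: 1 },
};
const offlineEdit: Draft = {
  title: { value: "Fire safety 2025", updatedAt: 5 },
  dueDays: { value: "30", updatedAt: 1 },
};
const serverEdit: Draft = {
  title: { value: "Fire & safety", updatedAt: 7 },
  dueDays: { value: "45", updatedAt: 6 },
};
const result = mergeLww(baseDraft, offlineEdit, serverEdit);
// result.merged.title.value === "Fire & safety", result.conflicts is ["title"]
```

Here the server's later title wins, but `title` still lands in the conflicts list so the admin can see that an offline edit was overridden.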
11. Mutation Testing
Stryker runs weekly on the domain package. Mutation score goal: ≥ 70% on packages/domain.
12. Test Data
- Factories in `test/fixtures/` producing realistic Assignments and Windows with deterministic ULIDs.
- `seed-tenant.ts` script provisions a demo tenant with 10k users, 5 assignments, 100k windows for perf dashboards.
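
Deterministic ULIDs can be produced by fixing both inputs of the ULID layout: the 48-bit millisecond timestamp (10 Crockford base32 characters) and the 80 bits of randomness (16 characters). The sketch below uses a seeded PRNG and a fake clock that advances 1 ms per id; `makeUlidFactory` is a hypothetical helper, not the one in the fixtures directory.

```typescript
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a millisecond timestamp as the 10-character ULID time component.
function encodeTime(ms: number): string {
  let out = "";
  let rest = ms;
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD[rest % 32] + out;
    rest = Math.floor(rest / 32);
  }
  return out;
}

// Small seeded PRNG (mulberry32-style) replacing real randomness in tests.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

/** Returns a generator of reproducible, creation-ordered ULIDs. */
function makeUlidFactory(seed: number, startMs: number): () => string {
  const rand = mulberry32(seed);
  let now = startMs;
  return () => {
    let id = encodeTime(now);
    now += 1; // fake clock: each id is 1 ms later, so ids sort by creation order
    for (let i = 0; i < 16; i++) {
      id += CROCKFORD[Math.floor(rand() * 32)];
    }
    return id;
  };
}

const nextId = makeUlidFactory(1, 1700000000000);
const idA = nextId();
const idB = nextId();
```

Two factories created with the same seed and start time return the same sequence, which keeps snapshot tests and fixture references stable across runs.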
13. Branches / PR Gates
| Gate | Required |
|---|---|
| Unit + integration pass | ✅ |
| Coverage ≥ 80% line | ✅ |
| OpenAPI drift check | ✅ |
| Event schema compat check | ✅ |
| Security lint (semgrep) | ✅ |
| k6 smoke (5 min) | ✅ |
| Pact provider verification | ✅ |
| E2E happy path | on release-candidate branch |
14. Staging Soak
Every release spends 48 h in staging with full synthetic traffic before production.
15. Rollback Verification
Automated: every deploy runs a rollback-preview job that executes the prior image against the new DB schema and confirms compatibility (only backward-compatible schema changes are allowed).