Testing
:::info Source
Sourced from services/assessment-service/TESTING_STRATEGY.md in the documentation repo.
:::
1. Coverage Targets
Domain 95% line / 98% branch / 80% mutation. Integration 80% line.
2. Unit Tests
GradingRuleevaluation: pass/fail thresholds, weighted scoring.QuizBankinvariants: unique question IDs, at least one correct option per MCQ.BranchingScenarioinvariants: DAG (no cycles), max depth 50, reachable terminal nodes.AttemptResultintegrity hash computation; roundtrip verification.RubricCriterionAI vs human agreement threshold.- Randomization seed determinism: same
(attemptId, poolId)→ same order.
3. Integration Tests
- Postgres + NATS + mock AI gateway.
- Flow: create quiz → serve randomized → submit responses → score → emit event.
- Flow: AI-generate question → review → accept → publish.
- Flow: AI rubric grade → below confidence → routes to human reviewer.
- Flow: branching scenario → 5-level deep navigation → reach terminal → score computed.
4. Contract Tests
- assessment → progress:
assessment.attempt_result.scored.v1Pact. - assessment ← authoring: consumes
authoring.block.added.v1for QuizBlockRef. - OpenAPI diff in CI.
5. E2E
- J-08: author creates quiz + AI-assisted → publishes → learner takes → scored → progress event.
- J-09: branching scenario with 5 paths → each path exercised.
- J-10: AI rubric grading → human override → learner appeal → resolution.
6. Load Tests
- 10k concurrent quiz-serve requests → p95 < 200ms.
- 1k/sec scoring throughput.
- AI rubric grading: 100 concurrent; queue stable.
7. Chaos
- Fail AI gateway → local model fallback for question generation; degraded UX for rubric grading.
- Corrupt answer key blob → detected on decrypt; attempt fails with diagnostic.
- Kill pod mid-scoring → resume via idempotency.
8. Offline Tests
- Bundle with encrypted quiz + answer key mounts offline.
- Learner takes quiz offline; scoring local; result queued.
- Reconnect → result synced; server validates integrity hash.
- Tampered bundle → cannot decrypt; scoring fails offline; event queued.
9. AI Safety Tests
- Red-team corpus: prompt injection in lesson content → generator refuses.
- PII in response → redacted before grading.
- Bias eval: grade demographic-parity on sample corpus.
- Regression: every prompt version bumps require eval pass.
10. Security Tests
- Cross-tenant quiz access attempt → 403.
- Answer key in response payload → CI grep catches.
- Replay submission → idempotent.
- Attempt integrity hash tampered → rejected.
11. Branching DAG Tests
- Random-scenario property tests: any generated scenario must be a valid DAG.
- Cycle detection test fixtures.
12. CI Gates
- Unit + integration green.
- AI prompt eval pass.
- Two-tenant iso green.
- Mutation ≥ 80% on domain.
- OpenAPI diff + Pact verified.