Observability

:::info Source
Sourced from services/assessment-service/OBSERVABILITY.md in the documentation repo.
:::

1. Logs

Events: assessment.quiz.served, .response.submitted, .attempt.scored, .ai.question.generated, .ai.rubric.graded, .grade.overridden, .appeal.submitted, .scenario.navigated.

Attrs: quiz_bank_id, attempt_id, question_id, scaled_score, passed, scoring_duration_ms. Redact response content before it is logged.
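One way to apply the redaction rule is to mask response fields at the point where the log line is built. The following is a minimal sketch, not the service's actual logger: the `build_log_event` helper, the `REDACTED_KEYS` set, and the sample IDs are all hypothetical.

```python
import json
from typing import Any

# Hypothetical set of attribute keys that may carry response content.
REDACTED_KEYS = {"response", "responses", "response_text"}


def build_log_event(event: str, **attrs: Any) -> str:
    """Serialize one structured log event as JSON, masking response content."""
    safe = {
        key: "[REDACTED]" if key in REDACTED_KEYS else value
        for key, value in attrs.items()
    }
    return json.dumps({"event": event, **safe}, sort_keys=True)


# The caller passes a response by mistake; it is masked before serialization.
line = build_log_event(
    "assessment.quiz.served",
    quiz_bank_id="qb-42",
    attempt_id="att-123",
    question_id="q-7",
    response="free-text answer that must not be logged",
)
print(line)
```

Masking by key name only catches fields the helper knows about; an allowlist of the attributes listed above would be a stricter alternative.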

2. Metrics

RED

  • assessment_api_requests_total{endpoint,status} — counter
  • assessment_api_duration_seconds{endpoint} — histogram

Domain

  • assessment_attempts_scored_total{outcome} — counter
  • assessment_scoring_duration_seconds — histogram (target p95 < 500ms)
  • assessment_ai_grade_confidence — histogram
  • assessment_ai_grade_override_rate{instructor_id_hash} — gauge
  • assessment_appeal_rate — gauge
  • assessment_questions_generated_total{accepted:bool} — counter
  • assessment_answer_key_decrypt_latency_seconds — histogram
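The p95 target on assessment_scoring_duration_seconds has to be estimated from histogram buckets rather than read directly. As a rough sketch of that estimate, the function below interpolates linearly inside the bucket that contains the target rank, similar in spirit to what a Prometheus-style quantile query does. The bucket bounds and counts are made up for illustration, and the `estimate_quantile` helper is hypothetical.

```python
from typing import Sequence, Tuple


def estimate_quantile(q: float, buckets: Sequence[Tuple[float, int]]) -> float:
    """Estimate a quantile from cumulative (upper_bound_seconds, count) buckets.

    Buckets must be sorted by upper bound, with cumulative counts.
    """
    total = buckets[-1][1]
    if total == 0:
        raise ValueError("no observations")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            # Linear interpolation within the bucket that holds the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return buckets[-1][0]


# Illustrative cumulative counts, not real measurements.
buckets = [(0.1, 600), (0.25, 900), (0.5, 980), (1.0, 1000)]
p95 = estimate_quantile(0.95, buckets)
meets_target = p95 < 0.5  # target from the metric definition: p95 < 500ms
print(p95, meets_target)
```

The accuracy of the estimate depends on bucket boundaries, so placing a bucket edge at 0.5s makes the comparison against the target more reliable.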

3. Traces

Spans: assessment.score_attempt, assessment.grade_rubric_with_ai, assessment.generate_question, assessment.branch.navigate.

4. Dashboards

  • Scoring throughput + latency.
  • AI grading: confidence distribution, override rate, appeal rate.
  • Quiz authoring: AI accept-rate per author.
  • Branching scenario: completion rate by scenario, path distribution.

5. Alerts

| Alert | Threshold | Severity |
| --- | --- | --- |
| scoring-latency-high | p95 > 2s | P2 |
| ai-override-rate-spike | > 30% override for a prompt version | P2 |
| appeal-rate-spike | > 5% appeals for a quiz | P3 |
| answer-key-decrypt-failure | > 1/min | P1 |
| dlq-non-empty | any | P2 |
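To make the thresholds above concrete, here is a small sketch that evaluates them against a snapshot of current values. It is illustrative only: the snapshot keys (such as `scoring_p95_seconds`) are hypothetical names, and the per-prompt-version and per-quiz dimensions are collapsed into a single value each for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AlertRule:
    name: str
    severity: str
    metric: str  # hypothetical key in the metrics snapshot
    fires: Callable[[float], bool]


RULES = [
    AlertRule("scoring-latency-high", "P2", "scoring_p95_seconds", lambda v: v > 2.0),
    AlertRule("ai-override-rate-spike", "P2", "ai_override_rate", lambda v: v > 0.30),
    AlertRule("appeal-rate-spike", "P3", "appeal_rate", lambda v: v > 0.05),
    AlertRule("answer-key-decrypt-failure", "P1", "decrypt_failures_per_min", lambda v: v > 1),
    AlertRule("dlq-non-empty", "P2", "dlq_depth", lambda v: v > 0),
]


def firing_alerts(snapshot: Dict[str, float]) -> List[str]:
    """Return firing alerts as 'severity name', most severe first."""
    fired = [r for r in RULES if r.metric in snapshot and r.fires(snapshot[r.metric])]
    fired.sort(key=lambda r: r.severity)  # "P1" sorts before "P2" and "P3"
    return [f"{r.severity} {r.name}" for r in fired]


# Illustrative values, not real measurements.
snapshot = {
    "scoring_p95_seconds": 0.8,
    "ai_override_rate": 0.35,
    "appeal_rate": 0.02,
    "decrypt_failures_per_min": 3,
    "dlq_depth": 0,
}
alerts = firing_alerts(snapshot)
print(alerts)
```

In practice these rules would more likely live in the alerting system's own rule format; the sketch only shows how each threshold maps to a comparison.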

6. SLOs

| SLI | Target |
| --- | --- |
| Quiz serve p95 | < 200ms |
| Scoring p95 | < 500ms |
| AI grade p95 | < 3s |
| Scoring success rate | 99.99% |
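The 99.99% scoring success target translates into an error budget of one failed scoring per 10,000 attempts. The sketch below works through that arithmetic with exact fractions to avoid floating-point rounding at the boundary. The function names and the sample volumes are hypothetical, and the measurement window is left to the caller.

```python
from fractions import Fraction

SCORING_SUCCESS_TARGET = Fraction("0.9999")


def allowed_failures(total_attempts: int,
                     target: Fraction = SCORING_SUCCESS_TARGET) -> int:
    """Largest number of failed scorings that still meets the target."""
    return int(total_attempts * (1 - target))


def budget_remaining(total_attempts: int, failed_attempts: int,
                     target: Fraction = SCORING_SUCCESS_TARGET) -> Fraction:
    """Fraction of the error budget left; negative once the budget is overspent."""
    allowed = total_attempts * (1 - target)
    if allowed == 0:
        return Fraction(0)
    return 1 - Fraction(failed_attempts) / allowed


# Illustrative volumes, not real traffic.
budget = allowed_failures(1_000_000)
remaining = budget_remaining(1_000_000, 25)
print(budget, remaining)
```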

7. RUM

  • Quiz page LCP < 1.5s online; < 600ms offline.
  • Response submission INP < 200ms.