Radiology Service — Observability
Status: populated; Owner: TBD; Last updated: 2026-04-18; Companion: Service Template · 03 platform-services · 02 DDD
1. SLIs and SLOs
| SLI | Target SLO | Measurement |
|---|---|---|
| Study list retrieval p95 latency | < 2 s | http_request_duration_seconds p95 for GET /v1/radiology/studies |
| Viewer launch p95 latency | < 3 s | http_request_duration_seconds p95 for POST /studies/:id/viewer-launch |
| Report sign success rate | ≥ 99.5% | rad_report_sign_total{outcome="success"} / total |
| Critical finding event publish latency | < 10 s | rad_critical_publish_duration_seconds |
| PACS QIDO-RS availability | ≥ 95% | rad_pacs_qido_success_rate |
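
As a rough illustration of how the latency and success-rate SLIs above could be evaluated, the sketch below estimates a p95 from cumulative histogram buckets by linear interpolation (the same general approach PromQL's `histogram_quantile` takes) and computes a success ratio from two counter values. The function names, bucket boundaries, and counts are hypothetical and chosen only for illustration; the actual bucket layout of `http_request_duration_seconds` is not specified in this document.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Interpolates linearly inside the bucket that contains the target rank.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the highest bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the overflow bucket: report the last finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


def success_ratio(success_count, total_count):
    """Return success/total, or None when there were no attempts."""
    if total_count == 0:
        return None
    return success_count / total_count


# Hypothetical cumulative bucket counts: (upper bound in seconds, count).
study_list_buckets = [(0.5, 60), (1.0, 85), (2.0, 97), (5.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, study_list_buckets)
print(f"study list p95 ~ {p95:.2f} s, meets < 2 s SLO: {p95 < 2.0}")

ratio = success_ratio(1995, 2000)  # hypothetical report sign attempts
print(f"report sign success rate {ratio:.4f}, meets >= 99.5% SLO: {ratio >= 0.995}")
```

Because the estimate is interpolated within a bucket, its accuracy depends on how closely the bucket boundaries bracket the SLO thresholds (2 s and 3 s here).
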
2. Key Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| rad_studies_registered_total | Counter | tenant_id, modality | Studies registered |
| rad_reports_signed_total | Counter | tenant_id | Final reports signed |
| rad_viewer_launches_total | Counter | tenant_id, viewer_type | Viewer launches |
| rad_critical_findings_total | Counter | tenant_id, modality | Critical findings |
| rad_pacs_qido_duration_seconds | Histogram | tenant_id, endpoint_id | PACS query latency |
| rad_outbox_unpublished_count | Gauge | tenant_id | Queued events |
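
To make the label sets in the table concrete, here is a minimal standard-library sketch of a labeled counter that renders its series in the Prometheus text exposition style. The `LabeledCounter` class and the tenant and modality values are hypothetical; a real service would more likely rely on an established metrics client library than on hand-rolled code like this.

```python
from collections import defaultdict


class LabeledCounter:
    """A toy counter keyed by a fixed set of label names."""

    def __init__(self, name, label_names):
        self.name = name
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)

    def inc(self, amount=1.0, **labels):
        # Require exactly the declared labels so series stay consistent.
        if set(labels) != set(self.label_names):
            raise ValueError(f"expected labels {self.label_names}, got {tuple(labels)}")
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] += amount

    def render(self):
        """Render all series, one line per label combination."""
        lines = [f"# TYPE {self.name} counter"]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(
                f'{name}="{val}"' for name, val in zip(self.label_names, key)
            )
            lines.append(f"{self.name}{{{label_str}}} {value:g}")
        return "\n".join(lines)


# Hypothetical tenant and modality values.
studies = LabeledCounter("rad_studies_registered_total", ["tenant_id", "modality"])
studies.inc(tenant_id="tenant-a", modality="CT")
studies.inc(tenant_id="tenant-a", modality="CT")
studies.inc(tenant_id="tenant-a", modality="MR")
print(studies.render())
```

Each distinct combination of `tenant_id` and `modality` becomes its own time series, so label values should stay low in cardinality.
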
3. Dashboards
| Dashboard | Key panels |
|---|---|
| Radiology Operations | Studies/day by modality, report TAT, viewer launches/hr |
| PACS Integration | QIDO-RS latency, success/failure rate per endpoint |
| Critical Findings | Count, time-to-notification distribution |
| Outbox Health | Unpublished events, relay failures |
4. Alerts
| Alert | Condition | Severity | Runbook |
|---|---|---|---|
| PACS endpoint unreachable | QIDO-RS failures > 50% over 5 min | P2 | runbooks/rad-pacs-unavailable.md |
| Critical finding unnotified | Critical finding recorded (rad_critical_findings_total incremented) with no downstream ack within 15 min | P1 | runbooks/rad-critical-finding.md |
| Outbox lag | rad_outbox_unpublished_count > 50 for 5 min | P2 | runbooks/rad-outbox-lag.md |
| Viewer launch latency | p95 > 5 s | P3 | runbooks/rad-viewer-latency.md |
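
Conditions of the form "above a threshold for N minutes", such as the outbox lag alert, can be sketched as a check that every recent sample has stayed above the threshold for at least the stated duration. The function name `sustained_above`, the 60-second sampling interval, and the sample values below are assumptions for illustration, not the actual alerting implementation.

```python
def sustained_above(samples, threshold, duration_s):
    """Return True if the most recent samples have all exceeded `threshold`
    for at least `duration_s` seconds.

    `samples` is a list of (timestamp_seconds, value) pairs in ascending
    timestamp order.
    """
    if not samples:
        return False
    streak_start = None
    # Walk backwards from the newest sample until the breach streak ends.
    for timestamp, value in reversed(samples):
        if value <= threshold:
            break
        streak_start = timestamp
    if streak_start is None:
        return False  # the newest sample is not above the threshold
    newest_timestamp = samples[-1][0]
    return newest_timestamp - streak_start >= duration_s


# Hypothetical rad_outbox_unpublished_count samples, one every 60 seconds.
outbox_samples = [(0, 10), (60, 60), (120, 70), (180, 80), (240, 90), (300, 55), (360, 65)]
print(sustained_above(outbox_samples, threshold=50, duration_s=300))

# A dip below the threshold resets the streak, so this one does not fire.
recovered_samples = [(0, 60), (60, 70), (120, 40), (180, 80), (240, 90)]
print(sustained_above(recovered_samples, threshold=50, duration_s=300))
```

Requiring the breach to be sustained avoids paging on a single short spike, at the cost of delaying the alert by the duration window.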