Developer Portal Service — Observability

Version: 1.0
Status: Draft
Owner: Product + Developer Relations (DevRel)
Last Updated: 2026-04-20


1. Intent

This document defines the SLIs, SLOs, dashboards, alerts, and runbook references for the Developer Portal. The SLOs are aligned with user impact: docs must render fast, the sandbox must respond, key revocation must propagate within budget, and Verify must succeed.

2. Service Level Indicators (SLIs)

  • Docs LCP (p95) — TBD
  • Sandbox availability — TBD
  • Verify start success rate — TBD
  • Verify check success rate — TBD
  • Key revocation propagation latency (p95 / p99) — TBD
  • Snippet generator latency (p95) — TBD
  • SDK release pipeline run-time — TBD
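
Several of the SLIs above are percentile-based (p95 / p99). As a minimal sketch of what such a figure means, the function below computes a nearest-rank percentile from raw samples. The function name and the sample values are illustrative assumptions; the real SLI definitions are still TBD, and in practice the metrics backend would likely compute percentiles from histograms rather than raw samples.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile of `samples` (hypothetical helper).

    The nearest-rank method picks the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Illustrative latencies in milliseconds (made-up values, not measurements).
latencies_ms = [12.0, 48.0, 7.5, 61.0, 30.0]
p95 = percentile_nearest_rank(latencies_ms, 95)
```

With only a handful of samples the p95 is simply the largest value, which is one reason percentile SLIs need enough traffic (or probe volume) per window to be meaningful.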

3. Service Level Objectives (SLOs)

SLO                              Target                             Window
Docs LCP p95                     ≤ 1.5 s                            30 d
Sandbox availability             99.9%                              30 d
Verify start success             ≥ 99.5% (excl. recipient errors)   30 d
Key revocation propagation p95   ≤ 60 s                             30 d
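
To make the targets above concrete, here is a rough sketch of two calculations they imply: the downtime a 99.9% availability target tolerates over 30 d (about 43.2 minutes), and a success rate that leaves recipient errors out of the denominator. The function names, and the assumption that recipient errors are simply subtracted from the attempt count, are illustrative; the actual SLI definitions are TBD.

```python
from datetime import timedelta


def downtime_budget(target: float, window: timedelta) -> timedelta:
    """Total unavailability an availability target tolerates over a window."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    return window * (1.0 - target)


def success_rate(attempts: int, successes: int, recipient_errors: int = 0) -> float:
    """Success ratio with recipient errors removed from the denominator.

    How an attempt is classified as a recipient error is an assumption here;
    the source only says such errors are excluded.
    """
    eligible = attempts - recipient_errors
    if eligible <= 0:
        return 1.0  # no eligible traffic: treat the objective as met
    return successes / eligible


# Sandbox availability: 99.9% over 30 d leaves roughly 43.2 minutes of budget.
budget_minutes = downtime_budget(0.999, timedelta(days=30)).total_seconds() / 60

# Verify start success with made-up counts: 9960 of 9980 eligible attempts.
rate = success_rate(attempts=10_000, successes=9_960, recipient_errors=20)
meets_target = rate >= 0.995
```

Excluding recipient errors from the denominator keeps failures outside the service's control from consuming the error budget.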

4. Dashboards

  • TBD (Grafana: portal overview, Verify funnel, sandbox health)

5. Alerts

  • TBD (PagerDuty rules: revocation lag, Verify error spike, sandbox 5xx, SDK release pipeline failure)
  • Runbook references: TBD (runbooks/devportal-key-revocation-lag.md, runbooks/devportal-verify-error-spike.md)

7. Synthetic Probes

  • Key revocation prober (1 Hz). TBD
  • Sandbox POST /v1/sms/send prober (1/min). TBD
  • Verify happy-path prober (1/5min). TBD
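
The key revocation prober is the least obvious of the three, so here is one possible shape for its core measurement: revoke a key, then poll until the key is rejected and record the elapsed time. This is a sketch under stated assumptions, not the planned implementation. The `revoke_key` and `key_is_rejected` callables are hypothetical stand-ins for whatever the portal exposes, and the timeout and poll interval defaults are illustrative.

```python
import time
from typing import Callable, Optional


def measure_revocation_propagation(
    revoke_key: Callable[[], None],        # hypothetical: revokes a disposable key
    key_is_rejected: Callable[[], bool],   # hypothetical: True once the key is refused
    timeout_s: float = 120.0,              # assumed cutoff, not from the spec
    poll_interval_s: float = 0.5,          # assumed polling cadence
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds from revocation until rejection, or None on timeout."""
    start = clock()
    revoke_key()
    while True:
        if key_is_rejected():
            return clock() - start
        if clock() - start >= timeout_s:
            return None  # propagation did not complete within the cutoff
        sleep(poll_interval_s)


# Usage with a fake clock, so the example runs without waiting or network calls.
_now = [0.0]


def _fake_sleep(seconds: float) -> None:
    _now[0] += seconds


latency_s = measure_revocation_propagation(
    revoke_key=lambda: None,
    key_is_rejected=lambda: _now[0] >= 3.0,  # pretend propagation takes 3 s
    clock=lambda: _now[0],
    sleep=_fake_sleep,
)
```

Each returned latency would feed the p95 / p99 propagation SLI, and a `None` result could be counted as a breach of the 60 s target. Injecting the clock and sleep functions keeps the measurement logic testable in isolation.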