Readiness
:::info Source
Sourced from `services/ai-gateway-service/SERVICE_READINESS.md` in the documentation repo.
:::
1. Level per Milestone
| Milestone | Level | Scope |
|---|---|---|
| M0 | L3 | Gateway + safety + prompt registry + AIClient port |
| M1 | L3 | AI tutor + local inference |
| M2 | L3 | AI co-author prompts |
| M3 | L4 | Full model routing + cost controls + audit UI |
| M4 | L4 | Advanced eval harness |
| M5 | L4 | AI insights v2 / predictive / on-device |
2. Gates
G1 Domain
- Prompt, Model, AICompletion, Embedding, AIBudget, SafetyPolicy, SafetyVerdict, AIAuditEntry.
- Invariants (provenance, safety mandatory).
- 95%/80% mutation.
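
The "provenance, safety mandatory" invariants could be enforced at construction time, so that no `AICompletion` exists without both. The following is a minimal sketch under stated assumptions: the entity names come from the list above, but every field name and the `Provenance` type's shape are hypothetical and not taken from the service's actual domain model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    # Hypothetical fields: which model and prompt version produced the output.
    model: str
    prompt_id: str
    prompt_version: int


@dataclass(frozen=True)
class SafetyVerdict:
    # Hypothetical fields: outcome of the safety pipeline for one completion.
    allowed: bool
    policy_id: str


@dataclass(frozen=True)
class AICompletion:
    text: str
    provenance: Optional[Provenance]
    safety_verdict: Optional[SafetyVerdict]

    def __post_init__(self) -> None:
        # Invariant 1: every completion carries provenance.
        if self.provenance is None:
            raise ValueError("AICompletion requires provenance")
        # Invariant 2: every completion has passed through the safety pipeline.
        if self.safety_verdict is None:
            raise ValueError("AICompletion requires a safety verdict")
```

Rejecting invalid objects in `__post_init__` keeps the check in one place, so callers cannot skip it.
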
G2 API
- AIClient port defined (F09 frozen).
- OpenAPI + SSE documented.
- Pact with every consumer service.
G3 Events
- Subjects registered.
- Audit events to regulated firehose.
G4 Sync
- Prompts replicable for offline (subset flagged offline-safe).
- Local AI config replicable.
G5 AI (self)
- Prompts versioned + eval-gated.
- Safety pipeline mandatory.
- Provenance VO frozen (F04).
- Bias monitoring quarterly.
- HITL on high-risk prompts.
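
One way to read the eval-gate and HITL items together is as a single activation check on each prompt version. The sketch below assumes a hypothetical `PromptVersion` record and a hypothetical `can_activate` helper; neither name comes from the prompt registry itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    # All fields are illustrative assumptions, not the registry's schema.
    prompt_id: str
    version: int
    eval_passed: bool
    high_risk: bool = False
    human_approved: bool = False


def can_activate(prompt: PromptVersion) -> bool:
    """Return True only if the version may be promoted to active."""
    # Eval gate: a version that failed (or never ran) its eval suite is blocked.
    if not prompt.eval_passed:
        return False
    # HITL gate: high-risk prompts additionally need explicit human sign-off.
    if prompt.high_risk and not prompt.human_approved:
        return False
    return True
```

With this shape, a low-risk prompt needs only a passing eval, while a high-risk prompt needs both a passing eval and human approval.
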
G6 Observability
- Cost + safety + cache + provider health dashboards.
- Bias scorecard visible.
G7 Performance
- First-token p95 < 600ms.
- Embed p95 < 300ms.
- 1k concurrent completions / region.
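
The two latency thresholds above can be checked mechanically from load-test samples. This is a rough sketch that assumes a nearest-rank p95 and latencies recorded in milliseconds; the function names and the `measurements` layout are hypothetical, and the real gate may compute percentiles differently (for example from histogram buckets).

```python
import math

# Thresholds from the G7 gate, in milliseconds.
G7_LIMITS_MS = {"first_token": 600, "embed": 300}


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def g7_latency_gate(measurements):
    """Return {metric: (p95_ms, passed)} for each metric in G7_LIMITS_MS."""
    results = {}
    for metric, limit_ms in G7_LIMITS_MS.items():
        p95 = percentile(measurements[metric], 95)
        # The gate is a strict inequality: p95 must be below the limit.
        results[metric] = (p95, p95 < limit_ms)
    return results
```

The concurrency target (1k concurrent completions per region) is not covered here; it would need a load generator rather than a post-hoc check.
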
G8 Security
- Tenant isolation on embeddings + prompts.
- `noTrain` enforced.
- PII redaction mandatory.
- HIPAA allowlist tested.
- Audit Merkle-anchored.
- Two-tenant isolation test green.
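
"Merkle-anchored" usually means that a batch of audit entries is hashed into a single Merkle root, and that root is then published somewhere tamper-evident. The source does not specify the tree construction, so the sketch below is an assumption: a plain binary SHA-256 tree with distinct leaf and node prefixes, duplicating the last node on odd-sized levels. The function name `merkle_root` is hypothetical.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries):
    """Return the hex Merkle root over a non-empty list of serialized audit entries (bytes)."""
    if not entries:
        raise ValueError("at least one audit entry is required")
    # Leaf hashes use a 0x00 prefix so a leaf can never be confused with an inner node.
    level = [_sha256(b"\x00" + entry) for entry in entries]
    while len(level) > 1:
        # Assumption: on an odd-sized level, the last node is paired with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Inner nodes use a 0x01 prefix over the concatenated child hashes.
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Changing any single entry changes the root, which is what lets an externally stored root detect later edits to the audit log.
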
3. SLOs
| SLI | Target |
|---|---|
| First-token p95 | < 600ms |
| Availability (degraded OK) | 99.9% |
| Safety pipeline availability | 99.99% |
| Budget enforcement | 100% correctness |
| Prompt eval pass rate | 100% of active prompts |
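
To make the availability targets concrete, they can be converted into a downtime budget. The sketch below assumes a 30-day measurement window, which the table does not state; the function name is hypothetical.

```python
def downtime_budget_minutes(availability_pct: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given availability target."""
    window_minutes = window_days * 24 * 60  # 43,200 minutes in 30 days
    return window_minutes * (100.0 - availability_pct) / 100.0


gateway_budget = downtime_budget_minutes(99.9)    # gateway availability target
safety_budget = downtime_budget_minutes(99.99)    # safety pipeline target
```

Under the 30-day assumption, 99.9% allows roughly 43.2 minutes of unavailability and 99.99% allows roughly 4.32 minutes, so the safety pipeline has a tenth of the gateway's budget.
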
4. DoD
- Tests green (unit + integration + safety + Pact).
- Prompt eval pass.
- Bias eval for high-risk changes.
- OpenAPI + AIClient contract verified.
- Two-tenant isolation test green.
5. Release Checklists
S0 (M0 — Foundation)
- AIClient port shipped + Pact contracts.
- 10 system prompts with eval suites.
- Safety pipeline GA.
- Provenance invariant enforced.
- Prompt registry + admin UI.
S1 (M1 — Tutor + Local)
- Tutor prompts ≥ 50% accept-rate.
- Local inference for offline tutor.
- Red-team corpus passes.
- Local-cloud parity eval.
S2 (M2 — Co-Author)
- Co-Author prompts ≥ 50% accept-rate.
- Provenance badge in UI.
S4 (M3 — Full Routing)
- Model routing policy (F32 frozen).
- Cost dashboards GA.
- Per-tenant budget UI.
- EU AI Act high-risk documentation.
S6 (M5)
- AI insights v2 prompts.
- On-device models in bundles.
- HIPAA provider allowlist enforcement.
- Bias scorecard quarterly report.