
AI Gateway Service — Testing Strategy

Status: populated
Owner: TBD
Last updated: 2026-04-17
Companion: Service Template · standards/TESTING_STANDARDS.md

1. Coverage targets

| Layer | Target |
| --- | --- |
| Unit | ≥ 85% statements, ≥ 80% branches |
| Integration | ≥ 80% critical paths |
| Contract | 100% of published events + OpenAPI |
| E2E | All P1 scenarios |

2. Unit tests

| Unit | Coverage |
| --- | --- |
| AIDecision state machine | valid + invalid transitions |
| ProviderRouter | rule matching, residency filtering, fallback |
| QuotaService | window rollover, concurrent consume |
| ModerationPipeline | threshold logic, short-circuit on block |
| ProvenanceFactory | all required fields stamped |
| Prompt template resolver | semver match, locale fallback |
| PHI redactor | no raw text leaks to log formatter |
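As an illustration of the "valid + invalid transitions" row, a minimal table-driven sketch follows. The states draft, under_review, and accepted are taken from the HITL flow in section 3; the rejected state, the transition table, and all function names are assumptions for illustration, not the service's actual AIDecision implementation.

```typescript
// Hypothetical state set: "rejected" is an assumption, the rest appear in hitl-flow.spec.
type DecisionState = "draft" | "under_review" | "accepted" | "rejected";

const ALL_STATES: DecisionState[] = ["draft", "under_review", "accepted", "rejected"];

// Allowed transitions (assumed); terminal states map to an empty list.
const TRANSITIONS: Record<DecisionState, DecisionState[]> = {
  draft: ["under_review"],
  under_review: ["accepted", "rejected"],
  accepted: [],
  rejected: [],
};

function canTransition(from: DecisionState, to: DecisionState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new state, or throws if the transition is not in the table.
function transition(from: DecisionState, to: DecisionState): DecisionState {
  if (!canTransition(from, to)) {
    throw new Error("Invalid transition: " + from + " -> " + to);
  }
  return to;
}

// Enumerates every (from, to) pair that must be rejected, so a unit test
// can cover the invalid side exhaustively rather than by hand-picked cases.
function invalidPairs(): Array<[DecisionState, DecisionState]> {
  const pairs: Array<[DecisionState, DecisionState]> = [];
  for (const from of ALL_STATES) {
    for (const to of ALL_STATES) {
      if (!canTransition(from, to)) {
        pairs.push([from, to]);
      }
    }
  }
  return pairs;
}
```

A unit test can then loop over `invalidPairs()` and assert that `transition` throws for each pair, alongside one positive case per entry in the transition table.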

3. Integration tests (mandatory)

| Spec | What |
| --- | --- |
| tenant-isolation.spec | Cross-tenant reads/writes forbidden by RLS and app guard |
| outbox.spec | Assist commits decision + outbox row in same tx; relay publishes exactly-once |
| inbox.spec | Consumed config.* and tenant.* events deduped |
| policy-timeout.spec | AI_POLICY_DENY when access-policy times out |
| quota-exceeded.spec | 429 + ai_gateway.quota.exceeded.v1 |
| provider-fallback.spec | Primary errors → fallback provider; provider.degraded emitted |
| moderation-block.spec | 422 path + flagged event |
| hitl-flow.spec | draft → under_review → accepted; accepted event consumed by owner |
| phi-logging.spec | Assert raw instructions and draftText never appear in default event payloads |
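The phi-logging.spec assertion could be backed by a helper along these lines. This is a sketch under assumptions: the field names instructions and draftText come from the table above, while the helper names and the payload shapes in the usage note are hypothetical.

```typescript
// Field names from the phi-logging.spec row.
const FORBIDDEN_KEYS = ["instructions", "draftText"];

// Recursively collects JSON-path-like locations of forbidden keys in a payload.
function findForbiddenPaths(value: unknown, path: string = "$"): string[] {
  const found: string[] = [];
  if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i++) {
      for (const p of findForbiddenPaths(value[i], path + "[" + i + "]")) {
        found.push(p);
      }
    }
  } else if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const childPath = path + "." + key;
      if (FORBIDDEN_KEYS.indexOf(key) !== -1) {
        found.push(childPath);
      }
      for (const p of findForbiddenPaths(record[key], childPath)) {
        found.push(p);
      }
    }
  }
  return found;
}

// Checks whether a known raw string appears anywhere in the serialized payload,
// which catches leaks under unexpected key names. The needle is JSON-escaped
// so that quotes and newlines in the raw text still match.
function leaksRawText(payload: unknown, rawText: string): boolean {
  if (rawText.length === 0) {
    return false;
  }
  const needle = JSON.stringify(rawText).slice(1, -1);
  return JSON.stringify(payload).indexOf(needle) !== -1;
}
```

In a spec, each captured event payload would be expected to yield an empty array from `findForbiddenPaths` and `false` from `leaksRawText` for the raw text submitted in that test.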

4. Contract tests

| Type | Tool |
| --- | --- |
| OpenAPI (REST) | openapi diff + Dredd |
| Event schema | Ajv against @ghasi/event-envelope |
| Pact consumer | patient-chart, medication, portal expectations |

5. E2E

  • Playwright suite: reviewer dashboard accept/reject flow; portal triage end-to-end; cross-service accept in patient-chart.
  • Load: k6 script, 100 RPS/assist sustained 10 min, ramp to 500 RPS 2 min.
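The load profile above might map onto k6 scenario options roughly as follows. This sketch assumes the ramp from 100 to 500 RPS takes 2 minutes and starts once the steady phase ends; the scenario names and VU counts are placeholders. In an actual k6 script the object would be exported as `options`, next to a default function that issues the assist request (omitted here).

```typescript
// Load profile expressed with k6's arrival-rate executors, which drive a
// target request rate rather than a fixed number of virtual users.
const options = {
  scenarios: {
    // Phase 1: hold 100 iterations (requests) per second for 10 minutes.
    steady: {
      executor: "constant-arrival-rate",
      rate: 100,
      timeUnit: "1s",
      duration: "10m",
      preAllocatedVUs: 100, // placeholder; size to expected latency
      maxVUs: 500,          // placeholder
    },
    // Phase 2 (assumed interpretation): ramp from 100 to 500 per second over 2 minutes.
    ramp: {
      executor: "ramping-arrival-rate",
      startTime: "10m", // begin when the steady phase ends
      startRate: 100,
      timeUnit: "1s",
      stages: [{ target: 500, duration: "2m" }],
      preAllocatedVUs: 200, // placeholder
      maxVUs: 2000,         // placeholder
    },
  },
};
```

Arrival-rate executors are used because the target is stated in RPS; VU-based executors would let throughput drift with response time.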

6. Safety & safety-adjacent tests

| Scenario | Assertion |
| --- | --- |
| Prompt-injection sample set | Block rate ≥ 95% on curated corpus |
| PHI-sniff sample set | Block rate ≥ 98% on synthetic PHI prompts |
| Red-team triage prompts | No medical advice without disclaimer + triage escalation paths |
| Regression vs previous prompt template version | Delta report reviewed by clinical SME |
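The block-rate thresholds could be checked with a small helper like the one below. It is a sketch only: the `SampleResult` shape and function names are hypothetical, and how each sample is actually run through the moderation pipeline is out of scope here.

```typescript
// Hypothetical per-sample outcome produced by running a corpus through moderation.
interface SampleResult {
  id: string;
  blocked: boolean;
}

// Fraction of samples that were blocked, in the range [0, 1].
function blockRate(results: SampleResult[]): number {
  if (results.length === 0) {
    throw new Error("Cannot compute block rate for an empty corpus");
  }
  const blocked = results.filter((r) => r.blocked).length;
  return blocked / results.length;
}

// True when the observed block rate meets or exceeds the threshold,
// e.g. 0.95 for the prompt-injection set or 0.98 for the PHI-sniff set.
function meetsThreshold(results: SampleResult[], threshold: number): boolean {
  return blockRate(results) >= threshold;
}

// Lists the samples that slipped through, for triage in the test report.
function missedSamples(results: SampleResult[]): string[] {
  return results.filter((r) => !r.blocked).map((r) => r.id);
}
```

Reporting `missedSamples` alongside the pass/fail result makes a threshold failure actionable, since reviewers can see which prompts were not blocked.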

7. Non-functional tests

  • Chaos: provider kill, NATS partition, Redis outage — verify fail-closed behaviour and event delivery recovery.
  • Security: static + dynamic analysis, dependency scan, secret scan.
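The fail-closed behaviour that the chaos tests verify can be sketched as a gate that denies whenever the dependency errors or exceeds a deadline. The AI_POLICY_DENY code is taken from policy-timeout.spec; reusing it for explicit denials and errors, and all function and type names, are assumptions for illustration.

```typescript
type PolicyDecision = "allow" | "deny";

interface GateResult {
  allowed: boolean;
  code?: string;
}

// Settles with the work's outcome, or rejects if it takes longer than timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("timeout")), timeoutMs);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Fail-closed gate: only an explicit "allow" received in time lets the request through.
async function gate(
  checkPolicy: () => Promise<PolicyDecision>,
  timeoutMs: number,
): Promise<GateResult> {
  try {
    const decision = await withTimeout(checkPolicy(), timeoutMs);
    if (decision === "allow") {
      return { allowed: true };
    }
    return { allowed: false, code: "AI_POLICY_DENY" };
  } catch (err) {
    // Timeout, network error, or a thrown exception: deny rather than allow.
    return { allowed: false, code: "AI_POLICY_DENY" };
  }
}
```

A chaos test would inject a hanging or failing `checkPolicy` (for example, by partitioning the dependency) and assert that `gate` never returns `allowed: true` under those conditions.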