fraud-intel-service — Service Readiness
Version: 1.0
Status: Draft
Owner: Trust and Safety + ML Ops + SRE
Last Updated: 2026-04-21
References: SERVICE_OVERVIEW.md, _report.md, AI_INTEGRATION.md, FAILURE_MODES.md
Readiness criteria for production deployment. Because fraud-intel is fail-open (informational), the bar focuses on model quality (precision, recall, F1 per category), drift monitoring, fairness, adversarial robustness, and downstream integration contracts with firewall, sender-id-registry, and compliance-engine.
1. Code Readiness
| Criterion | Status | Notes |
|---|---|---|
| gRPC FraudIntelService.v1 (Score, GetSignals, BulkScore) | ☐ | |
| REST admin: signal browsing, model lifecycle, feed admin, retroactive-scan triggers, dashboard query | ☐ | |
| Triton integration for 3 production models (AIT, SIM-box, OTP-harvest) | ☐ | |
| Rule-based fallback on Triton circuit-breaker open | ☐ | |
| Feature-store integration (Redis + Postgres) with in-process LRU | ☐ | |
| PII anonymisation (MSISDN SHA-256 hash) before any inference | ☐ | Mandatory per AI_INTEGRATION |
| Budget enforcement: Score 100 ms P99 cap, fallback at 80 ms | ☐ | |
| Circuit breaker on Triton with 30 s half-open | ☐ | |
| NATS consumers: sms.dlr.inbound, sms.mo.inbound, compliance.audit.v1, firewall.audit.v1, sender.id.suspended.v1 | ☐ | |
| MISP feed sync worker (STIX 2.1) with pluggable source adapter | ☐ | |
| Model registry (Postgres + S3 artifacts) with immutable versioning | ☐ | |
| Training pipeline (Python + Airflow) with quarterly cadence | ☐ | |
| Model deployment pipeline with checksum verification + A/B shadow | ☐ | |
| Drift detection job (weekly F1 vs. baseline) | ☐ | |
| Feedback API (T&S correction loop) with weight-capping | ☐ | |
| Per-signal audit row with model version | ☐ | |
| mTLS gRPC + SPIRE SVID | ☐ | |
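The circuit-breaker and rule-based-fallback criteria above can be illustrated with a minimal sketch. It assumes a consecutive-failure threshold (the value 5 is an assumption; the table fixes only the 30 s half-open interval), and the names `CircuitBreaker`, `score_with_fallback`, `model_score`, and `rule_score` are hypothetical, not taken from the service code.

```python
import time
from typing import Any, Callable, Optional, Tuple


class CircuitBreaker:
    """CLOSED -> OPEN after N consecutive failures; HALF_OPEN after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 5,        # assumed value, not from the checklist
        half_open_after_s: float = 30.0,   # 30 s half-open, per the criterion above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.half_open_after_s = half_open_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.half_open_after_s:
            return "HALF_OPEN"
        return "OPEN"

    def record_success(self) -> None:
        # A successful call (including a half-open probe) closes the breaker.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed half-open probe re-opens immediately and restarts the timer.
        if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def score_with_fallback(
    breaker: CircuitBreaker,
    model_score: Callable[[Any], float],   # hypothetical Triton-backed scorer
    rule_score: Callable[[Any], float],    # hypothetical rule-based scorer
    features: Any,
) -> Tuple[float, str]:
    """Return (score, source); source is 'model' or 'rules'."""
    if breaker.state == "OPEN":
        return rule_score(features), "rules"
    try:
        result = model_score(features)
    except Exception:
        breaker.record_failure()
        return rule_score(features), "rules"
    breaker.record_success()
    return result, "model"
```

Injecting `clock` keeps the 30 s transition testable without sleeping; a production breaker would likely also limit the number of concurrent half-open probes.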
2. Testing Readiness
| Criterion | Target | Status |
|---|---|---|
| Unit coverage | ≥ 90% line (domain) / ≥ 80% branch | ☐ |
| Unit tests for feature extractors | ≥ 30 | ☐ |
| Unit tests for score normalisers + category mapping | ≥ 20 | ☐ |
| Property-based tests (fast-check): feature determinism | ≥ 10 | ☐ |
| Model evaluation: held-out test set per model | 10 k labelled per category | ☐ |
| Model evaluation targets (AIT) | precision ≥ 0.92, recall ≥ 0.80 | ☐ |
| Model evaluation targets (SIM-box) | precision ≥ 0.88, recall ≥ 0.75 | ☐ |
| Model evaluation targets (OTP-harvest) | precision ≥ 0.90, recall ≥ 0.70 | ☐ |
| Adversarial test corpus per model | ≥ 500 crafted examples | ☐ |
| Fairness audit per model (per-MNO disparate recall ≤ 15%) | Passed | ☐ |
| Integration: Score @ 1 000 RPS P99 ≤ 100 ms | Passed | ☐ |
| Integration: signal NATS consumers sustain 10 000 events/min | Passed | ☐ |
| Integration: MISP feed sync with mock endpoint | Passed | ☐ |
| Integration: model deploy + rollback via registry | Passed | ☐ |
| Contract test with firewall consumer | Passed | ☐ |
| Contract test with sender-id-registry consumer | Passed | ☐ |
| Contract test with compliance-engine tenant scoring feed | Passed | ☐ |
| Chaos: Triton unavailable → rule-based fallback | Passed | ☐ |
| Chaos: Postgres unavailable → read-only degraded | Passed | ☐ |
| Chaos: NATS lag → consumer scaling + stale-signal handling | Passed | ☐ |
| Chaos: feature store partial outage → LRU fallback | Passed | ☐ |
| Security: feedback-loop poisoning resistance (synthetic attack) | Passed | ☐ |
| Security: model-artifact tamper detection via checksum | Passed | ☐ |
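The model evaluation targets and the fairness criterion above could be checked with a gate along these lines. This is a sketch under stated assumptions: the per-MNO "disparate recall ≤ 15%" is read here as an absolute gap between the best and worst per-MNO recall, which the checklist does not spell out, and the names `Counts`, `meets_targets`, and `fairness_ok` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one model on its held-out test set."""
    tp: int
    fp: int
    fn: int


def precision(c: Counts) -> float:
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Counts) -> float:
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def f1(c: Counts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# (min precision, min recall) per model, copied from the table above.
TARGETS = {
    "AIT": (0.92, 0.80),
    "SIM-box": (0.88, 0.75),
    "OTP-harvest": (0.90, 0.70),
}

# Assumption: 15% means an absolute recall gap of 0.15 across MNOs.
MAX_RECALL_GAP = 0.15


def meets_targets(model: str, c: Counts) -> bool:
    min_p, min_r = TARGETS[model]
    return precision(c) >= min_p and recall(c) >= min_r


def recall_gap(per_mno: Dict[str, Counts]) -> float:
    recalls = [recall(c) for c in per_mno.values()]
    return max(recalls) - min(recalls)


def fairness_ok(per_mno: Dict[str, Counts]) -> bool:
    return recall_gap(per_mno) <= MAX_RECALL_GAP
```

If the fairness audit instead defines disparity as a ratio between MNO recalls, only `recall_gap` would need to change.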
3. Observability Readiness
| Criterion | Status |
|---|---|
| All Prometheus metrics emitting (OBSERVABILITY.md §1) | ☐ |
| Grafana dashboard fraud-intel-service.json deployed | ☐ |
| All alerts configured with runbooks | ☐ |
| Structured logs with MSISDN hashing | ☐ |
| OTel tracing across Score calls verified | ☐ |
| Loki parsing validated | ☐ |
| SIEM forwarding of fraud.detected.* verified | ☐ |
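For the structured-logs criterion, one way to keep raw MSISDNs out of log lines is to hash at the point where the record is built. The sketch below assumes MSISDNs are normalised to digits only before hashing so that formatting variants map to the same hash; the field name `msisdn_hash` and the helpers `hash_msisdn` and `log_record` are hypothetical.

```python
import hashlib
import json
import re
from typing import Any

_NON_DIGITS = re.compile(r"[^0-9]")


def hash_msisdn(msisdn: str) -> str:
    """SHA-256 hex digest of the digits-only MSISDN (normalisation is an assumption)."""
    digits = _NON_DIGITS.sub("", msisdn)
    return hashlib.sha256(digits.encode("ascii")).hexdigest()


def log_record(event: str, msisdn: str, **fields: Any) -> str:
    """Build a JSON log line that carries only the hashed MSISDN."""
    record = {"event": event, "msisdn_hash": hash_msisdn(msisdn), **fields}
    return json.dumps(record, sort_keys=True)
```

The checklist specifies SHA-256 only. Because the MSISDN space is small enough to enumerate, a keyed hash may be worth evaluating, but that would be a design change beyond this checklist.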
4. Security Readiness
| Criterion | Status |
|---|---|
| mTLS on gRPC + SPIRE SVIDs | ☐ |
| NetworkPolicy restricting gRPC ingress to firewall, sender-id-registry, compliance, NOC | ☐ |
| Kong JWT on REST admin endpoints | ☐ |
| MSISDN hashing verified on every model call path | ☐ |
| Model artifact S3 bucket immutable + versioned + object-locked | ☐ |
| Model deploy dual-control + checksum verified | ☐ |
| Feedback API role-restricted (T&S only; auditable identity) | ☐ |
| No cloud-LLM / external-API call with PII | ☐ |
| Training-data PII scrubbing verified | ☐ |
| Pen test against REST admin + gRPC | ☐ |
| Security team sign-off | ☐ |
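The checksum part of the model-deploy criterion can be sketched as follows. It assumes the registry stores a SHA-256 hex digest for each artifact, which the checklist implies but does not state; `read_chunks`, `sha256_chunks`, and `verify_artifact` are hypothetical names.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Iterable, Iterator


def read_chunks(path: Path, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file in fixed-size chunks so large artifacts are not loaded whole."""
    with path.open("rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return
            yield chunk


def sha256_chunks(chunks: Iterable[bytes]) -> str:
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(chunks: Iterable[bytes], expected_sha256: str) -> bool:
    """True only if the artifact digest matches the registry's recorded digest."""
    actual = sha256_chunks(chunks)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A deploy step would call `verify_artifact(read_chunks(path), expected)` and refuse to load the model on `False`.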
5. Operational Readiness
| Criterion | Status |
|---|---|
| K8s Deployment (3–10 replicas) + HPA on RPS | ☐ |
| Triton Deployment (3 replicas, GPU) on np-data node pool | ☐ |
| Training Airflow setup on separate node pool (CPU + optional GPU on-demand) | ☐ |
| PDB minAvailable: 2 per region | ☐ |
| Rolling update: no dropped Score calls under steady 500 RPS | ☐ |
| Graceful shutdown (15 s SIGTERM) | ☐ |
| Postgres conn pool sized | ☐ |
| Redis conn pool sized | ☐ |
| Model deployment runbook drafted | ☐ |
| Model drift incident runbook drafted | ☐ |
| Feed sync failure runbook drafted | ☐ |
| Feedback poisoning runbook drafted | ☐ |
| On-call: T&S primary, ML Ops secondary, SRE tertiary | ☐ |
6. Documentation Readiness
All 16 SERVICE_TEMPLATE docs are at "Complete", along with the runbooks, model cards (AI_INTEGRATION §12 reference), feedback-labeller handbook, and drift-response playbook.
7. Compliance / Regulatory Readiness
| Criterion | Status |
|---|---|
| DPIA authored for signal processing and ML inference | ☐ |
| Fairness audit signed off by Trust & Safety lead | ☐ |
| Model cards published per model (accuracy, fairness, training data lineage, known limitations) | ☐ |
| MISP reciprocal-sharing terms agreed with external parties (if any) | ☐ |
| SIEM forwarding of fraud events approved by regulator-portal team | ☐ |
| Audit retention policy configured (90 d hot, 7 y cold for detections) | ☐ |
8. Go/No-Go Criteria Summary
Production deployment is GO when:
9. Post-Launch Review
Within 30 days:
10. Phased Rollout
| Phase | Duration | Behaviour | Exit criteria |
|---|---|---|---|
| P1 — Signals emitted, not enforced | 14 d | fraud.detected.* published; downstream consumers log but don't act on them | FP projection < 2% per category |
| P2 — Enforcement: sender-id reputation only | 7 d | Sender-ID registry honours fraud signals (reputation updates); firewall still observes | No unexpected auto-suspension cluster |
| P3 — Full Enforcement | Ongoing | Firewall + compliance engine honour fraud signals; NOC dashboards live; feedback loop active | Steady state |
Rollback flags: FRAUD_SIGNAL_EMISSION_ENABLED, FRAUD_ML_ENABLED, FRAUD_FEEDBACK_API_ENABLED.
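A minimal sketch of how the three rollback flags might be read at startup. It assumes they are environment variables holding boolean-like strings and that an unset flag defaults to off; neither assumption is stated in this document. The `RollbackFlags` class and the accepted truthy values are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}  # assumed accepted spellings


def _flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class RollbackFlags:
    signal_emission: bool
    ml: bool
    feedback_api: bool

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RollbackFlags":
        return cls(
            signal_emission=_flag(env, "FRAUD_SIGNAL_EMISSION_ENABLED"),
            ml=_flag(env, "FRAUD_ML_ENABLED"),
            feedback_api=_flag(env, "FRAUD_FEEDBACK_API_ENABLED"),
        )
```

Passing a plain dict to `from_env` makes each rollback combination easy to exercise in tests without touching the process environment.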