
fraud-intel-service — Service Risk Register

Version: 1.0
Status: Draft
Owner: Trust and Safety + ML Ops + Security + SRE
Last Updated: 2026-04-21
References: FAILURE_MODES.md, AI_INTEGRATION.md, SECURITY_MODEL.md

Known risks with owners, mitigations, and residual classification. Each risk is scored as Likelihood (1–5) × Impact (1–5); the residual rating must be Medium or lower for GA.


1. Risk Summary

| ID | Risk | Category | Likelihood | Impact | Pre-mitigation | Residual | Owner |
|---|---|---|---|---|---|---|---|
| FR-RISK-01 | Model false positives flood firewall with block-triggering signals | Correctness / ML | 3 | 4 | High | Medium | T&S + ML Ops |
| FR-RISK-02 | Model drift silently degrades detection | ML quality | 4 | 4 | High | Medium | ML Ops |
| FR-RISK-03 | Feedback-loop poisoning by insider or compromised account | Security | 2 | 4 | Medium | Low | Security |
| FR-RISK-04 | ML model bias against a specific MNO or tenant | ML fairness | 3 | 4 | High | Medium | T&S + ML Ops |
| FR-RISK-05 | Training-data PII leak via model weights | Privacy | 1 | 5 | Medium | Low | Security + Legal |
| FR-RISK-06 | Adversarial evasion: attacker crafts signals to bypass detection | Adversarial | 3 | 3 | Medium | Medium | ML Ops |
| FR-RISK-07 | Fraud feed source (MISP) poisoning | Dependency | 2 | 3 | Medium | Low | Security |
| FR-RISK-08 | Signal storm biases training corpus | Adversarial | 2 | 3 | Medium | Low | ML Ops |
| FR-RISK-09 | Model-artifact tampering in registry | Security | 1 | 5 | Medium | Low | Security |
| FR-RISK-10 | Triton GPU scarcity under burst load | Ops | 2 | 3 | Medium | Low | SRE |
| FR-RISK-11 | GDPR / subject-access on signal history | Legal | 2 | 3 | Medium | Low | Legal |
| FR-RISK-12 | Fail-open posture during outage delays critical detection | Availability | 3 | 3 | Medium | Low | SRE |
| FR-RISK-13 | Downstream coupling: firewall over-reacts to a single model update | Integration | 2 | 4 | Medium | Medium | T&S + SRE |
| FR-RISK-14 | Cross-platform fraud-feed partner disputes published IOCs | Political | 2 | 2 | Low | Low | Regulator Liaison |
| FR-RISK-15 | Stale features cause short-window detection miss (e.g., AIT spike within a minute) | Latency | 3 | 3 | Medium | Medium | ML Ops |
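
The register does not state the score bands behind the Pre-mitigation column. The sketch below uses cut-offs inferred from the rows above (a score of 4 is Low, 5 to 9 is Medium, 12 and 16 are High); the exact boundaries, and in particular where a score of 10 falls, are assumptions rather than documented policy.

```python
def classify(likelihood: int, impact: int) -> str:
    """Map a Likelihood x Impact score to a band.

    Band boundaries are inferred from the summary table, not from a
    documented policy: <= 4 Low, 5..11 Medium, >= 12 High (assumed).
    """
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must each be in 1..5")
    score = likelihood * impact
    if score >= 12:
        return "High"
    if score >= 5:
        return "Medium"
    return "Low"
```

For example, FR-RISK-01 (3 × 4 = 12) lands in High and FR-RISK-14 (2 × 2 = 4) in Low, matching the table.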

2. Risk Details

FR-RISK-01 — Model false positives flood firewall

A high-confidence false positive from the model triggers firewall BLOCKs at scale.

Mitigation. Shadow-mode rollout for any new model (14 d); automatic rollback if FP rate > baseline + 50% for 10 min; per-tenant whitelist for design-partner banks; human-in-loop on highest-confidence tier; confidence-threshold tuning per category.
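
The automatic-rollback rule can be sketched as a small state machine. This is an illustration only: it reads "baseline + 50%" as a relative increase (1.5 × baseline), which is an assumption, and `RollbackMonitor` is a hypothetical name, not part of the service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RollbackMonitor:
    baseline_fp_rate: float
    tolerance: float = 0.50      # +50 % relative to baseline (assumed reading)
    window_s: float = 600.0      # breach must persist for 10 minutes
    _breach_started: Optional[float] = None

    def observe(self, fp_rate: float, now_s: float) -> bool:
        """Record one FP-rate sample; return True when rollback should fire."""
        limit = self.baseline_fp_rate * (1.0 + self.tolerance)
        if fp_rate <= limit:
            # Back under the limit: the sustained-breach clock resets.
            self._breach_started = None
            return False
        if self._breach_started is None:
            self._breach_started = now_s
        return now_s - self._breach_started >= self.window_s
```

A single dip back under the limit resets the clock, so only a continuous 10-minute breach triggers rollback. Whether that or a "most samples in the window" rule is intended is not stated in the register.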

Residual. Medium.


FR-RISK-02 — Model drift

Attacker tactics evolve; model silently loses recall without obvious incident.

Mitigation. Weekly F1 vs. baseline drift monitoring; alert on > 5% drop; quarterly retraining + on-demand retraining on drift alert; A/B shadow of candidate models before switchover; model cards with limitations documented.
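
A minimal sketch of the weekly drift check, assuming the 5% threshold is a relative drop against the baseline F1 (the register does not say whether it is relative or absolute). Function names are illustrative.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from confusion counts: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        raise ValueError("F1 is undefined when TP, FP and FN are all zero")
    return 2 * tp / denom


def f1_drift_alert(baseline_f1: float, current_f1: float,
                   max_relative_drop: float = 0.05) -> bool:
    """True when F1 fell by more than max_relative_drop versus baseline."""
    if baseline_f1 <= 0:
        raise ValueError("baseline F1 must be positive")
    relative_drop = (baseline_f1 - current_f1) / baseline_f1
    return relative_drop > max_relative_drop
```

With a baseline of 0.90, a weekly F1 of 0.84 (a 6.7% relative drop) would raise the alert, while 0.87 (3.3%) would not.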

Residual. Medium.


FR-RISK-03 — Feedback-loop poisoning

Insider / compromised account labels fraudulent traffic as legitimate.

Mitigation. Feedback API role-restricted to T&S staff with auditable identity; feedback weight in training lower than automatic labelling; training pipeline rejects any single account or IP that contributes > 5% of labels in a week; weekly human review of label-distribution trends.
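
The 5% concentration check could look roughly like the following. It is a sketch under the assumption that each label in the weekly batch carries one contributor key (account ID or IP); how the pipeline actually keys contributors is not specified here.

```python
from collections import Counter
from typing import Iterable, Set


def over_concentrated(contributors: Iterable[str],
                      max_share: float = 0.05) -> Set[str]:
    """Return contributors whose share of this week's labels exceeds max_share.

    `contributors` holds one entry per label (hypothetical input shape).
    """
    counts = Counter(contributors)
    total = sum(counts.values())
    if total == 0:
        return set()
    return {who for who, n in counts.items() if n / total > max_share}
```

The returned set can then be used to drop or quarantine those labels before training; which of the two the pipeline does is a policy choice the register leaves open.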

Residual. Low.


FR-RISK-04 — ML fairness (MNO / tenant bias)

Model trained on corpus under-representing legitimate traffic from one MNO.

Mitigation. Balanced training corpus (per-MNO sampling floor); fairness audit in CI (fail if per-MNO disparate recall > 15%); post-launch per-MNO block-rate monitoring; model-card documentation.
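
One way to express the CI fairness gate, assuming "disparate recall" means the gap between the best and worst per-MNO recall measured in absolute percentage points. That definition is an assumption; a ratio-based definition would need a different formula.

```python
from typing import Dict, Tuple


def recall_gap(per_mno: Dict[str, Tuple[int, int]]) -> float:
    """per_mno maps MNO name -> (true_positives, false_negatives)."""
    if not per_mno:
        raise ValueError("need at least one MNO")
    recalls = []
    for mno, (tp, fn) in per_mno.items():
        if tp + fn == 0:
            raise ValueError("no positive examples for " + mno)
        recalls.append(tp / (tp + fn))
    return max(recalls) - min(recalls)


def fairness_gate(per_mno: Dict[str, Tuple[int, int]],
                  max_gap: float = 0.15) -> bool:
    """True when the build passes (gap within the allowed bound)."""
    return recall_gap(per_mno) <= max_gap
```

An MNO with no positive examples raises instead of silently passing, since an empty evaluation slice is itself a sign of the under-representation this risk describes.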

Residual. Medium.


FR-RISK-05 — Training-data PII leakage via model weights

Extraction-style attacks on the model weights could reconstruct MSISDNs.

Mitigation. MSISDN hashed before any training data use; differential-privacy-style noise in feature engineering where practical; model weights not exposed outside trusted inference infrastructure (Triton inside mesh); model registry access is mTLS + role-restricted.

Residual. Low.


FR-RISK-06 — Adversarial evasion

Attacker crafts input variations that slip past the model.

Mitigation. Adversarial corpus (500+ crafted examples per category) tested in CI; quarterly red-team exercise adding new adversarial patterns; defence-in-depth with rule-based fallback that catches simpler patterns the ML might miss.

Residual. Medium — adversarial ML is an arms race.


FR-RISK-07 — MISP feed poisoning

External feed delivers malicious IOCs (e.g., legitimate IP ranges labelled as SIM-box).

Mitigation. Feed source on whitelist; feed-import applies rate-limit + anomaly detection; dual-source corroboration for high-impact entries; ability to roll back a feed import.

Residual. Low.


FR-RISK-08 — Signal storm

Adversarial traffic floods signals to bias training.

Mitigation. Per-source / per-tenant rate-limit on signal ingest; training outlier removal (> 3σ); weekly human review of high-volume sources; signal deduplication by content hash.
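
Two of these controls, content-hash deduplication and 3σ outlier removal, are easy to illustrate. The sketch assumes signals are available as raw bytes and that the outlier filter is applied to a numeric per-source quantity such as daily signal volume; both are assumptions about the pipeline.

```python
import hashlib
from statistics import mean, pstdev
from typing import List, Sequence


def dedupe(signals: Sequence[bytes]) -> List[bytes]:
    """Keep the first occurrence of each distinct payload (SHA-256 of content)."""
    seen = set()
    unique = []
    for payload in signals:
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(payload)
    return unique


def drop_outliers(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Drop values more than k population standard deviations from the mean."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= k * sigma]
```

Note that a mean/standard-deviation filter is itself pulled towards large outliers, so with small samples a single extreme value may survive. A median-based rule is a common alternative if that matters in practice.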

Residual. Low.


FR-RISK-09 — Model-artifact tampering

Compromised registry replaces legitimate artifact with a malicious one.

Mitigation. S3 object-lock + versioning on model bucket; checksum verified at upload and at deploy; model deploy requires dual-control; cross-region replicated bucket; quarterly backup-integrity drill.
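
The checksum step at upload and deploy can be sketched as below. SHA-256 is assumed as the digest, since the register does not name the algorithm, and `verify_artifact` is a hypothetical helper.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_hex: str) -> bool:
    """Compare the artifact's digest with the recorded one.

    hmac.compare_digest gives a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```

The check is only as trustworthy as the place the expected digest is stored; keeping it outside the model bucket (for example in the deploy approval record) means a compromised registry cannot rewrite both.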

Residual. Low.


FR-RISK-10 — Triton GPU scarcity

Burst fraud-detection load exceeds GPU capacity.

Mitigation. HPA based on inference-queue depth; CPU fallback for low-confidence tier (batched); capacity plan with 50% headroom; GPU fleet multi-region.
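
For intuition, the Kubernetes HPA computes desired replicas as ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies that formula to queue depth and reads "50% headroom" as provisioning 1.5 × expected peak; the per-replica averaging of queue depth and the headroom interpretation are assumptions, and the HPA's tolerance and stabilisation behaviour is omitted.

```python
import math


def desired_replicas(current_replicas: int,
                     avg_queue_depth: float,
                     target_queue_depth: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """HPA-style scaling: ceil(current * metric / target), clamped to bounds."""
    if target_queue_depth <= 0:
        raise ValueError("target queue depth must be positive")
    raw = math.ceil(current_replicas * avg_queue_depth / target_queue_depth)
    return max(min_replicas, min(max_replicas, raw))


def provisioned_capacity(expected_peak: float, headroom: float = 0.50) -> float:
    """Capacity to plan for, given expected peak load and fractional headroom."""
    return expected_peak * (1.0 + headroom)
```

With 4 replicas and an average queue depth 50% above target, the formula asks for 6 replicas; the `max_replicas` clamp is where GPU scarcity shows up, which is what the CPU fallback is meant to absorb.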

Residual. Low.


FR-RISK-11 — GDPR subject-access

Citizen asks what data fraud-intel holds about them.

Mitigation. Only MSISDN hash retained (pre-computed); response format drafted by Legal; 30-d SLA; deterministic re-hash allows response without retaining raw MSISDN.
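
A sketch of how deterministic re-hashing supports a subject-access lookup. It assumes a keyed hash (HMAC-SHA256 with a secret key); the register does not specify the hash construction, but an unkeyed hash over a number space as small as MSISDNs could be reversed by enumeration. The store shape and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List


def normalise_msisdn(raw: str) -> str:
    """Strip everything except ASCII digits (simplified normalisation)."""
    digits = "".join(ch for ch in raw if ch in "0123456789")
    if not digits:
        raise ValueError("no digits in MSISDN")
    return digits


def msisdn_hash(raw: str, key: bytes) -> str:
    """Deterministic keyed hash: same number and key always give the same value."""
    msg = normalise_msisdn(raw).encode("ascii")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def subject_access_lookup(raw: str, key: bytes,
                          store: Dict[str, List[str]]) -> List[str]:
    """Re-hash the requester's number and return signal IDs held under it."""
    return store.get(msisdn_hash(raw, key), [])
```

The normalisation here does not reconcile national and international formats (for example a leading 0 versus a country code); a real implementation would need to canonicalise those before hashing, or lookups will miss.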

Residual. Low.


FR-RISK-12 — Fail-open delays detection

A service outage delays fraud detection for up to the duration of the outage.

Mitigation. Fail-open is intentional (service is informational); downstream consumers (firewall) have rule-based fallback that catches common patterns; re-scan on recovery replays backlog signals.

Residual. Low — accepted posture.


FR-RISK-13 — Downstream coupling

A model update shifts signal patterns in a way the firewall interprets as a surge of fraud, escalating the BLOCK rate.

Mitigation. Shadow mode + gradual enablement; downstream consumers subscribe to fraud.model.deployed.v1 event and apply conservative thresholds for 24 h post-deploy; observation-mode dashboards track per-model signal emission vs. downstream action.
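
On the consumer side, the 24 h conservative window might be applied as follows. The threshold values (0.90 normal, 0.97 conservative) are made-up placeholders, and the sketch assumes the consumer records the timestamp of the most recent `fraud.model.deployed.v1` event; the event payload itself is not described in this register.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def block_threshold(now: datetime,
                    last_model_deploy: Optional[datetime],
                    normal: float = 0.90,          # placeholder value
                    conservative: float = 0.97,    # placeholder value
                    hold: timedelta = timedelta(hours=24)) -> float:
    """Confidence needed to BLOCK; stricter for `hold` after a model deploy."""
    if last_model_deploy is not None:
        age = now - last_model_deploy
        if timedelta(0) <= age < hold:
            return conservative
    return normal
```

Raising the bar rather than suppressing BLOCKs entirely keeps high-confidence detections flowing while a new model's signal distribution is still being observed.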

Residual. Medium.


FR-RISK-14 — Feed-partner dispute

Cross-platform partner disputes an IOC Ghasi shared.

Mitigation. Per-IOC provenance (who contributed, confidence, source signals); conflict-resolution via Regulator Liaison / partner liaison; removal workflow with audit.

Residual. Low.


FR-RISK-15 — Stale features

Fast-moving attack signatures (e.g., AIT spike in 60 s) aren't caught because features are batched hourly.

Mitigation. Real-time streaming feature computation for high-signal categories (AIT): features updated every 10 s via NATS stream processing; tunable latency/accuracy trade-off.
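
To illustrate the 10 s update granularity, here is a minimal sliding-window counter built from 10-second buckets. It is independent of NATS and assumes events arrive in timestamp order; the 60 s window and the class name are illustrative choices, not the service's actual feature definition.

```python
from collections import deque


class SlidingCounter:
    """Event count over the last `window_s` seconds, in `bucket_s` buckets."""

    def __init__(self, bucket_s: int = 10, window_s: int = 60) -> None:
        self.bucket_s = bucket_s
        self.n_buckets = window_s // bucket_s
        self.buckets = deque()  # items are [bucket_index, count]

    def _evict(self, current_idx: int) -> None:
        # Drop buckets that have fallen out of the window.
        while self.buckets and self.buckets[0][0] <= current_idx - self.n_buckets:
            self.buckets.popleft()

    def add(self, ts: float, n: int = 1) -> None:
        """Record n events at time ts (seconds); assumes in-order arrival."""
        idx = int(ts // self.bucket_s)
        if self.buckets and self.buckets[-1][0] == idx:
            self.buckets[-1][1] += n
        else:
            self.buckets.append([idx, n])
        self._evict(idx)

    def count(self, now: float) -> int:
        """Events in the window ending at the bucket that contains `now`."""
        self._evict(int(now // self.bucket_s))
        return sum(c for _, c in self.buckets)
```

Smaller buckets tighten detection latency at the cost of more state and more frequent updates, which is the latency/accuracy trade-off the mitigation refers to.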

Residual. Medium — real-time features have cost / complexity implications.


3. Residual-Risk Summary

| Residual | Count | Acceptance |
|---|---|---|
| Low | 9 | Accepted for GA |
| Medium | 6 | Accepted with mitigation commitments and named owners |
| High | 0 | |

4. Risk Review Cadence

  • Weekly during development.
  • Monthly post-GA (T&S + ML Ops + Security + SRE).
  • Quarterly (model cards, fairness audit, regulator liaison).