Fraud Intelligence Service — Jira-Ready Epics & User Stories

Status: populated
Owner: Trust & Safety
Last updated: 2026-04-20
Service prefix: FRAUD
Scope: Telecom fraud detection — AIT, SIM-box, OTP harvesting/grinding, grey-route arbitrage, MISP-compatible feed export/import; ClickHouse-backed feature store + ML pipelines + synchronous Score gRPC.
Source of truth: _sources/fraud-intel-service/user_stories.md


Epic Summary

Epic ID | Title | Stories | Points
EP-FRAUD-01 | AIT Detection — Graph + ML | US-FRAUD-001 – US-FRAUD-006 | 39
EP-FRAUD-02 | SIM-Box / Grey-Route Detection on Inbound MO Patterns | US-FRAUD-007 – US-FRAUD-011 | 28
EP-FRAUD-03 | OTP Harvesting & OTP-Grinding Detection | US-FRAUD-012 – US-FRAUD-015 | 23
EP-FRAUD-04 | Fraud Feed (MISP-compatible) Export and Import + Case Management | US-FRAUD-016 – US-FRAUD-019 | 26
Total | | 19 stories | 116

EP-FRAUD-01 · AIT Detection — Graph + ML

Context: Per ADR-0004 §3, AIT (Artificially Inflated Traffic) is the most common revenue-extracting fraud against the platform. Detection runs in 5-minute windows on a ClickHouse feature store with a per-tenant XGBoost model plus a graph feature for cross-tenant cohorts.

US-FRAUD-001 · Stream Ingestion into ClickHouse Feature Store

Type: Feature | Points: 8

Description: As a detection pipeline, I need NATS streams (firewall.audit.v1, sms.events.status.v1, sms.dlr.inbound.v1, cdr.generated.v1) ingested into ClickHouse fraud_features.events with bounded lag.

Acceptance Criteria:

  • Normalised row schema: (event_ts, source_stream, tenant_id?, src_msisdn?, dst_msisdn?, sender_id?, mno_id?, peer_asn?, verdict?, dlr_status?, attempt_count, payload_hash)
  • Sustained 10K eps → P95 ingestion lag ≤ 30 s
  • Schema-fail rows → fraud_features.events_dlq with reject_reason; upstream NATS ACK'd
  • ClickHouse unavailable → on-disk WAL buffer (1 h × 10K eps); resume on recovery
  • Durable consumer fraud-ingestor with AckExplicit resumes from last ACK
  • Metrics: fraud_ingestion_lag_seconds{source_stream}, fraud_ingestion_rows_total{source_stream}, fraud_ingestion_dlq_total

US-FRAUD-002 · AIT Feature Engineering Pipeline

Type: Feature | Points: 5

Description: As the AIT detection model, I need 5-minute feature aggregations per (tenant_id, dst_mno, sender_id, window) populated into fraud_features.ait_window_features.

Acceptance Criteria:

  • Cron */5 * * * *; features: submit_count, dlr_delivered_count, dlr_failed_count, dlr_success_rate, unique_dst_msisdns, mean_segments_per_msg, entropy_of_dst_prefix, unique_sender_ids, repeated_body_ratio, peer_asn_diversity
  • Job runtime < 90 s for 1000 active tenants × 5 MNOs
  • Null imputation per fraud.feature_imputation_policy
  • Synthetic AIT campaign unit test → expected feature signature
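
Two of the listed features can be sketched in a few lines. This is a minimal illustration, not the pipeline's implementation: the 5-character prefix length, the function names, and the choice of `delivered + failed` as the success-rate denominator are assumptions not stated in this story.

```python
import math
from collections import Counter
from typing import Iterable, Optional


def dst_prefix_entropy(dst_msisdns: Iterable[str], prefix_len: int = 5) -> float:
    """Shannon entropy (in bits) of the destination-prefix distribution.

    prefix_len=5 is an assumed value; the story does not define the prefix.
    """
    prefixes = [m[:prefix_len] for m in dst_msisdns]
    if not prefixes:
        return 0.0
    total = len(prefixes)
    counts = Counter(prefixes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def dlr_success_rate(delivered: int, failed: int) -> Optional[float]:
    """Delivered share of resolved DLRs; None when there is nothing to divide by.

    Returning None leaves the gap to be filled by the imputation policy.
    """
    total = delivered + failed
    return delivered / total if total else None
```

Traffic concentrated on a single prefix yields an entropy of 0 bits, while traffic spread evenly over two prefixes yields 1 bit.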

US-FRAUD-003 · XGBoost AIT Detection Model Inference

Type: Feature | Points: 8

Description: As the detection runtime, I need to run the active XGBoost AIT model against every new feature window and emit fraud.detected.ait.v1 for high-confidence predictions.

Acceptance Criteria:

  • Active model from fraud.models WHERE class='AIT' AND active=TRUE
  • Predictions written to fraud_features.ait_predictions with (window_start, tenant_id, dst_mno, sender_id, score, model_id, model_version, shap_top3 jsonb)
  • score >= 0.85 → fraud.detected.ait.v1 with aiProvenance and evidence
  • 0.6 <= score < 0.85 → fraud.case.opened.v1 + fraud.cases row status='PENDING_REVIEW'
  • score < 0.6 → log only
  • Frozen-test-set AUC ≥ 0.92, FPR ≤ 0.5% at 0.85 threshold
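
The three score bands above map onto a small routing function. The sketch below assumes the thresholds are fixed constants and uses a hypothetical `log_only` marker for the lowest band; in practice the thresholds might be configurable.

```python
AUTO_DETECT_THRESHOLD = 0.85  # from the acceptance criteria
REVIEW_THRESHOLD = 0.6        # from the acceptance criteria


def route_ait_score(score: float) -> str:
    """Return the subject to publish for a prediction, or 'log_only'."""
    if score >= AUTO_DETECT_THRESHOLD:
        return "fraud.detected.ait.v1"
    if score >= REVIEW_THRESHOLD:
        # The caller would also insert a fraud.cases row with status='PENDING_REVIEW'.
        return "fraud.case.opened.v1"
    return "log_only"  # hypothetical marker, not an event subject
```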

US-FRAUD-004 · AIT Graph Feature: Cross-Tenant Recipient Cohort

Type: Feature | Points: 8

Description: As the AIT model, I need a graph feature that detects when a recipient MSISDN cohort is targeted by multiple tenants in a tight window (cross-tenant AIT ring signature).

Acceptance Criteria:

  • Cohort job 1-h rolling window populates fraud_features.ait_cohorts with (cohort_hash, dst_msisdn_count, contributing_tenants, contributing_sender_ids, first_seen_ts, last_seen_ts)
  • cohort_hash = sha256(sorted_unique(dst_msisdns)) for cohorts ≥ 5
  • ≥ 3 distinct tenants AND ≥ 50 dst MSISDNs → join cohort_anomaly_score into ait_window_features
  • dst_msisdn_count >= 100 AND contributing_tenants >= 5 → fraud.detected.ait_ring.v1 independent of per-tenant model
  • Job runtime < 5 min for 1 M unique recipients
  • Cohort allowlist fraud.cohort_exclusions excludes legitimate cross-tenant alerts
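
A possible reading of the cohort hash and the ring rule is sketched below. Two points are assumptions: the "≥ 5" floor is read as at least five distinct MSISDNs, and the sorted set is serialised with a newline separator before hashing, since the story does not fix a serialisation.

```python
import hashlib
from typing import Iterable, Optional

MIN_COHORT_SIZE = 5  # assumed to mean distinct MSISDNs in the cohort


def cohort_hash(dst_msisdns: Iterable[str]) -> Optional[str]:
    """sha256 over the sorted, de-duplicated MSISDNs; None for small cohorts."""
    members = sorted(set(dst_msisdns))
    if len(members) < MIN_COHORT_SIZE:
        return None
    # Newline-joined serialisation is an assumption made for this sketch.
    return hashlib.sha256("\n".join(members).encode("utf-8")).hexdigest()


def is_ait_ring(dst_msisdn_count: int, contributing_tenants: int) -> bool:
    """Rule that emits fraud.detected.ait_ring.v1 regardless of the per-tenant model."""
    return dst_msisdn_count >= 100 and contributing_tenants >= 5
```

Sorting and de-duplicating before hashing makes the hash independent of arrival order and of repeated sends to the same recipient.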

US-FRAUD-005 · Synchronous Score gRPC

Type: Feature | Points: 5

Description: As compliance-engine, I need to call fraud-intel-service.Score(scope, id) over gRPC and receive the current fraud score for use in tenant-tier evaluation.

Acceptance Criteria:

  • Score(scope='TENANT'|'SENDER_ID'|'MSISDN', id) returns FraudScore { score, tier, contributingFactors[], modelId, modelVersion, computedAt, traceId } within P95 ≤ 50 ms
  • Redis cache fraud:score:{scope}:{id} TTL 15 m; cache-hit P99 ≤ 10 ms
  • Cache miss → read from fraud_features.entity_scores materialised view + populate cache
  • Unscored entity (no events 30 d) → tier='PROBATION', score=0.5
  • Non-allowlisted SPIFFE ID → PERMISSION_DENIED
  • Caller treats unavailability as tier='PROBATION'
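
The cache-aside read path can be illustrated without Redis or ClickHouse. In this sketch `TtlCache` is an in-memory stand-in for Redis, `lookup` is a hypothetical callable standing in for the fraud_features.entity_scores read, and caching the PROBATION default for unscored entities is an assumption.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 15 * 60  # TTL 15 m from the acceptance criteria


class TtlCache:
    """In-memory stand-in for Redis, with an injectable clock for testing."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[dict, float]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: dict, ttl: float) -> None:
        self._data[key] = (value, self._clock() + ttl)


def get_score(cache: TtlCache,
              lookup: Callable[[str, str], Optional[dict]],
              scope: str, entity_id: str) -> dict:
    """Cache-aside read: try the cache, fall back to the store, then populate."""
    key = f"fraud:score:{scope}:{entity_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = lookup(scope, entity_id)
    if row is None:
        # Unscored entity: default tier and score from the acceptance criteria.
        row = {"tier": "PROBATION", "score": 0.5}
    cache.set(key, row, CACHE_TTL_SECONDS)
    return row
```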

US-FRAUD-006 · Per-Tenant Score Hourly Publication

Type: Feature | Points: 5

Description: As compliance-engine, I need fraud.tenant_score.updated.v1 events on tier-boundary crossings.

Acceptance Criteria:

  • Hourly recompute job emits event when tier changes (SAFE↔WATCH↔RISKY↔HIGH_RISK); deduplicated per (tenantId, tier)
  • Event payload: { tenantId, previousTier, newTier, score, contributingFactors[], modelVersion, computedAt }
  • Compliance consumer compliance-engine-fraud-score updates compliance.tenant_tiers
  • Recompute failure preserves prior tier + fraud_score_recompute_failed_total increments

EP-FRAUD-02 · SIM-Box / Grey-Route Detection on Inbound MO Patterns

US-FRAUD-007 · SIM-Box Detection on Inbound MO Patterns

Type: Feature | Points: 8

Description: As the national perimeter, I need SIM-box detection from inbound MO patterns (sequential MSISDN ranges, identical body templates, low DLR-success, unusual IMSI-MSISDN binding).

Acceptance Criteria:

  • 30-min rolling window features per (msisdn_block /28): msisdn_range_density, body_template_hash_concentration, hlr_mismatch_rate, imsi_unique_count, mno_bind_concentration
  • Threshold breach (density>0.6 AND template>0.4 AND hlr_mismatch>0.3) → fraud.detected.simbox.v1
  • Firewall promotes implicated MSISDN block to firewall.peer_quarantine with 7-day expiry
  • 24 h cluster of detections sharing ASN → fraud.detected.simbox_network.v1
  • HITL dismiss → negative label feeds nightly retrain

US-FRAUD-008 · Grey-Route Arbitrage Detection (24 h Window)

Type: Feature | Points: 5

Description: As Trust & Safety, I need long-running grey-route arbitrage detection by peer aggregators routing high volumes to non-peered MNOs.

Acceptance Criteria:

  • 24 h hourly job per peer features: total_mt, mt_to_peered_mno, mt_to_non_peered_mno, mt_to_non_peered_ratio, hlr_mismatch_rate, dlr_success_rate_anomaly
  • mt_to_non_peered_ratio > 0.3 AND total_mt > 1000 → fraud.detected.greyroute.v1
  • Confidence ≥ 0.85 → auto-promote to firewall.peer_quarantine
  • Confidence 0.6–0.85 → fraud.case.opened.v1
  • Trust & Safety allowlist suppresses + audit-logs

US-FRAUD-009 · SIM-Box Confidence Calibration (HITL Feedback Loop)

Type: Feature | Points: 5

Description: As a data scientist, I need to capture HITL decisions (CONFIRM_FRAUD / DISMISS / REFINE_FEATURES) and feed them into nightly model retraining.

Acceptance Criteria:

  • POST /v1/admin/fraud/cases/{caseId}/decide persists (caseId, operatorId, decision, reason, decidedAt) in fraud.case_decisions
  • Nightly retrain 02:00 Asia/Kabul reads last 30 d decisions → updated training set
  • New version with AUC > current → 24 h shadow-evaluation
  • Shadow pass → atomic promotion via POST /v1/admin/fraud/models/{id}/promote
  • Shadow fail → auto-reject + alert

US-FRAUD-010 · Inbound MO Body-Template Clustering

Type: Feature | Points: 5

Description: As the SIM-box detector, I need normalised body-template clustering to detect bulk-template SIM-box traffic.

Acceptance Criteria:

  • Body normalisation: digits→#, whitespace collapse, case-fold; template_hash = sha256(normalised) stored alongside event
  • 1-h cluster job populates fraud_features.template_clusters with (template_hash, occurrence_count, distinct_src_msisdns, distinct_dst_msisdns)
  • occurrence_count > 1000 AND distinct_src_msisdns > 50 → flagged + joined into simbox_features
  • fraud.template_allowlist excludes carrier-system templates
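
The normalisation rule can be sketched as follows. The order of the three steps, the trimming of leading and trailing whitespace, and the per-digit (rather than per-run) replacement are assumptions; the story only names the three transformations.

```python
import hashlib
import re


def normalise_body(body: str) -> str:
    """digits -> '#', collapse whitespace runs, then case-fold."""
    text = re.sub(r"\d", "#", body)           # \d also matches non-ASCII digits
    text = re.sub(r"\s+", " ", text).strip()  # strip() is an assumption
    return text.casefold()


def template_hash(body: str) -> str:
    """sha256 of the normalised body, hex-encoded."""
    return hashlib.sha256(normalise_body(body).encode("utf-8")).hexdigest()
```

Under these assumptions, two messages that differ only in digits, letter case, or spacing share the same template_hash. Messages whose digit runs differ in length do not.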

US-FRAUD-011 · DLR-Success-Rate Anomaly Per Terminating MNO

Type: Feature | Points: 5

Description: As the detection runtime, I need per-tenant per-terminating-MNO DLR success-rate baselines and 3σ-deviation alerting.

Acceptance Criteria:

  • Rolling 7-d baseline per (tenantId, dstMno); hourly anomaly check against current 1-h window
  • Deviation > 3σ → fraud_features.dlr_anomalies row + fraud.detected.dlr_anomaly.v1 (feature event, not terminal)
  • Insufficient baseline (< 7 d activity) → check skipped + tier='PROBATION'
  • Daily 03:00 Asia/Kabul refresh of materialised view fraud_features.dlr_baseline_7d
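
The 3σ check might look like the sketch below. It assumes the 7-d baseline is held as one success-rate sample per hour (so 168 samples), that the deviation is two-sided, and that the sample standard deviation is used; none of these is fixed by the story.

```python
import math
import statistics
from typing import Optional, Sequence

MIN_BASELINE_SAMPLES = 7 * 24  # assumption: hourly samples over 7 days


def dlr_anomaly(baseline_rates: Sequence[float], current_rate: float,
                sigmas: float = 3.0) -> Optional[bool]:
    """True/False for the anomaly check, or None when the check is skipped."""
    if len(baseline_rates) < MIN_BASELINE_SAMPLES:
        return None  # insufficient baseline: caller skips and applies PROBATION
    mean = statistics.fmean(baseline_rates)
    std = statistics.stdev(baseline_rates)
    if std == 0.0:
        # Flat baseline: any real departure from the mean counts as a deviation.
        return not math.isclose(current_rate, mean)
    return abs(current_rate - mean) > sigmas * std
```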

EP-FRAUD-03 · OTP Harvesting & OTP-Grinding Detection

US-FRAUD-012 · OTP-Keyword Detection in Outbound Messages

Type: Feature | Points: 5

Description: As the OTP-harvesting detector, I need outbound messages tagged is_otp_likely based on a versioned multilingual regex set.

Acceptance Criteria:

  • Regex set in fraud.otp_patterns (versioned); covers EN/Pashto/Dari OTP keywords
  • Body matched with case-fold + Unicode-NFC; events.is_otp_likely = TRUE on match
  • Pattern-set update → 7-d rolling backfill (background)
  • Tagged with otp_destination_class ∈ {GENERIC, BANK, GOV, OPERATOR_INTERNAL} from sender-id-registry lookup
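
The matching step (Unicode NFC plus case-fold, then regex search) can be sketched as below. The patterns shown are hypothetical English placeholders only; the real, versioned set lives in fraud.otp_patterns and also covers Pashto and Dari keywords, which are not reproduced here.

```python
import re
import unicodedata
from typing import Sequence

# Hypothetical placeholder patterns, written in lowercase because the
# body is case-folded before matching.
OTP_PATTERNS = (
    r"\botp\b",
    r"verification code",
    r"one[- ]time (password|passcode|pin)",
)


def is_otp_likely(body: str, patterns: Sequence[str] = OTP_PATTERNS) -> bool:
    """Normalise to NFC, case-fold, then test each pattern in turn."""
    text = unicodedata.normalize("NFC", body).casefold()
    return any(re.search(p, text) for p in patterns)
```

Normalising before matching means that composed and decomposed forms of the same character compare equal, which matters for keywords written with combining marks.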

US-FRAUD-013 · OTP Harvesting: Cohort + Recipient-Yield Heuristic

Type: Feature | Points: 8

Description: As the detection runtime, I need to detect OTP harvesting by joining OTP-likely outbound with consent.revoked.v1 from the same recipient cohort.

Acceptance Criteria:

  • 6-h join: events.is_otp_likely AND consent.revoked.v1 within 1 h → fraud_features.otp_harvesting_signals (tenantId, dstMsisdnCohortHash, otpCount, revocationCount, revocationRate)
  • revocationRate > 0.05 AND cohort >= 100 → fraud.detected.otp_harvesting.v1
  • Confidence ≥ 0.85 → fraud case opened with suggestedAction = SUSPEND_SENDER_ID (NOC executes only)
  • Bank/Gov allowlist → downgrade to flag only + audit

US-FRAUD-014 · OTP Grinding: Per-MSISDN Burst Detection

Type: Feature | Points: 5

Description: As the subscriber-protection brain, I need per-MSISDN OTP-attempt burst detection (≥ 10 OTP-class messages in ≤ 60 s) and auto-throttling.

Acceptance Criteria:

  • Streaming aggregator computes per-dstMsisdn rolling 60 s OTP count
  • Burst (count > 10) → fraud.detected.otp_grinding.v1 with { dstMsisdn (hashed), srcTenants[], srcSenderIds[], windowStart, windowEnd }
  • compliance-engine consumer throttles further OTP-class messages to that MSISDN to 1/60 s for 6 h
  • State in Redis fraud:otp:dst:{msisdn}:60s survives pod restarts
  • Throttle window expires → normal traffic resumes
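
A rolling 60 s counter per destination MSISDN can be sketched with a deque. This in-memory version is a stand-in for the Redis-backed state and assumes events arrive in timestamp order. It follows the acceptance criterion's comparison (count > 10), with the threshold left as a parameter.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class OtpBurstDetector:
    """Sliding-window counter of OTP-class messages per destination MSISDN."""

    def __init__(self, window_s: float = 60.0, threshold: int = 10) -> None:
        self.window_s = window_s
        self.threshold = threshold
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, dst_msisdn: str, ts: float) -> bool:
        """Record one OTP-class message; return True when the burst rule fires."""
        q = self._events[dst_msisdn]
        q.append(ts)
        # Drop timestamps that have fallen out of the rolling window.
        while q and q[0] <= ts - self.window_s:
            q.popleft()
        return len(q) > self.threshold
```

With the defaults, the eleventh message to the same MSISDN inside 60 s is the first to return True.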

US-FRAUD-015 · OTP Round-Trip Delivery Anomaly Detection

Type: Feature | Points: 5

Description: As the OTP-harvesting detector, I need to track OTP submit→DLR-DELIVERED round-trip distributions per tenant and flag artificial uniformity.

Acceptance Criteria:

  • Per-tenant 1-h distribution: fraud_features.otp_dlr_distribution (tenantId, p50_ms, p95_ms, p99_ms, stddev_ms, sample_count)
  • stddev_ms < 100 AND sample_count > 500 → fraud.detected.dlr_uniformity.v1
  • Metric exposed for fraud-analyst dashboards
  • Repeated NOC dismiss → auto-tune via feedback loop

EP-FRAUD-04 · Fraud Feed (MISP-compatible) Export and Import + Case Management

US-FRAUD-016 · MISP-Compatible Fraud-Feed Export (Signed)

Type: Feature | Points: 8

Description: As regulator + peer MNOs, I need a daily HSM-signed MISP 2.4.x / STIX 2.1 feed of confirmed fraud indicators.

Acceptance Criteria:

  • Cron 04:00 Asia/Kabul → MISP event with fraud.cases WHERE status='CONFIRMED' AND confirmed_at > now()-24h; attributes (msisdn, sender-id, asn, template-hash, msisdn-block); STIX 2.1 indicators
  • HSM PKCS#11 signature; uploaded to MinIO fraud-feed-out/{yyyymmdd}.misp.json.sig and .stix.json.sig
  • fraud.feed.exported.v1 published with SHA-256, signature, presigned URLs (24 h)
  • Zero-diff days → fraud.feed.heartbeat.v1
  • Regulator SFTP mirror within 5 min of upload

US-FRAUD-017 · MISP Feed Import from Peer MNOs / Regulator

Type: Feature | Points: 5

Description: As the detection runtime, I need to consume MISP feeds from peers and regulator and merge indicators with source attribution and decay.

Acceptance Criteria:

  • POST /v1/internal/fraud/feed/import (mTLS) → verify signature against regulator HSM public key
  • Upsert into fraud.misp_indicators (source, sourceUuid, type, value, confidence, decay_factor, importedAt)
  • decay_factor halves indicator weight every 30 d
  • type='msisdn' indicators join events.imported_indicator_msisdn = TRUE on matching events
  • Invalid signature → fraud.alert.feed.signature.invalid.v1 (PagerDuty)
  • Idempotent on (source, sourceUuid)
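
The 30-day halving rule is an exponential decay with a 30-day half-life. The sketch below assumes the decay is applied continuously to the imported confidence; a stepwise halving every 30 days would be an equally valid reading of the criterion.

```python
def decayed_weight(confidence: float, age_days: float,
                   half_life_days: float = 30.0) -> float:
    """Indicator weight after age_days, halving every half_life_days."""
    return confidence * 0.5 ** (age_days / half_life_days)
```

For example, an indicator imported with confidence 0.8 would carry weight 0.4 after 30 days and 0.2 after 60 days.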

US-FRAUD-018 · Fraud Case Management REST

Type: Feature | Points: 8

Description: As a fraud analyst, I need to manage the case queue: list, view evidence, decide, assign — with audit trail and separation of duties.

Acceptance Criteria:

  • GET /v1/admin/fraud/cases?status=PENDING_REVIEW&page=1&pageSize=50 paginated list (role tns-fraud-analyst)
  • GET /v1/admin/fraud/cases/{caseId} returns full case with feature vector, SHAP values, source events, MISP joins
  • POST /v1/admin/fraud/cases/{caseId}/decide { decision, reason, executeAction? } persists + fraud.case.decided.v1
  • executeAction=true + decision=CONFIRM_FRAUD → dispatch suggested action (NATS to firewall / sender-id-registry); each dispatch audit-logged with token
  • Separation of duties: case.opened_by != decided_by; otherwise 403
  • Stale (>30 d) auto-closes as STALE + fraud.case.auto_stale.v1

US-FRAUD-019 · Model Catalog + Promote / Rollback REST

Type: Feature | Points: 5

Description: As a data scientist, I need to register, shadow, promote, and roll back ML model versions via authenticated REST.

Acceptance Criteria:

  • POST /v1/admin/fraud/models { class, modelArtifactUrl, trainingSetHash, evaluationMetrics } creates row status='REGISTERED'
  • POST /v1/admin/fraud/models/{id}/shadow enables shadow predictions in fraud_features.shadow_predictions; no events emitted
  • POST /v1/admin/fraud/models/{id}/promote atomic swap to active + fraud.model.promoted.v1
  • POST /v1/admin/fraud/models/{id}/rollback restores prior active + fraud.model.rolled_back.v1
  • Promote without ≥ 24 h shadow → 412 SHADOW_EVAL_INSUFFICIENT
  • Artifact SHA-256 mismatch → 422 MODEL_ARTIFACT_INTEGRITY_FAIL
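
The two rejection rules can be combined into a single pre-promotion check. This is a sketch under assumptions: the story does not say at which endpoint the artifact hash is verified, so checking it at promote time, the order of the two checks, and the `(200, "PROMOTED")` success value are all hypothetical.

```python
import hashlib
from datetime import datetime, timedelta
from typing import Optional, Tuple

MIN_SHADOW = timedelta(hours=24)  # from the acceptance criteria


def check_promotion(shadow_started_at: Optional[datetime], now: datetime,
                    artifact_bytes: bytes, expected_sha256: str) -> Tuple[int, str]:
    """Return (http_status, code) for a promote request."""
    if hashlib.sha256(artifact_bytes).hexdigest() != expected_sha256:
        return 422, "MODEL_ARTIFACT_INTEGRITY_FAIL"
    if shadow_started_at is None or now - shadow_started_at < MIN_SHADOW:
        return 412, "SHADOW_EVAL_INSUFFICIENT"
    return 200, "PROMOTED"  # hypothetical success value
```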