Fraud Intelligence Service — Domain Model

Version: 1.0
Status: Draft
Owner: Trust and Safety
Last Updated: 2026-04-21
Companion: SERVICE_OVERVIEW · APPLICATION_LOGIC · DATA_MODEL · EVENT_SCHEMAS · API_CONTRACTS · AI_INTEGRATION
Related ADR: ADR-0004 National-Backbone Resilience §3
Reference taxonomy: GSMA FF.21 (A2P SMS Fraud Reference) · MEF MEF-W63 (Inter-Carrier Fraud) · MITRE ATT&CK Mobile T1660 (SMS Control)


1. Bounded Context

Trust & Safety / Fraud Intelligence. The fraud-intel-service owns the inferential detection plane: cross-message, cross-tenant, cross-MNO patterns that no single rule can express. It is a peer to compliance-engine (per-message policy) and sms-firewall-service (perimeter rules). It is not the enforcement plane: every detection is published as a NATS event, and the services that consume the event carry out the enforcement action.

The context boundary is drawn such that:

  • Inside the boundary: raw fraud signals, engineered features, fraud detections (ML and graph), fraud cases, model registry and versioned artifacts, MISP/STIX indicators imported and exported, per-entity fraud scores, HITL feedback decisions, training-set provenance.
  • Outside the boundary: message storage (sms-orchestrator), DLR correlation primitives (dlr-processor), per-message verdicts (compliance-engine), perimeter rule enforcement (sms-firewall-service), sender-ID lifecycle (sender-id-registry-service), billing CDRs (cdr-mediation-service), tenant onboarding (tenant-service), regulator reporting (regulator-portal-service).

The service is asynchronous-by-design with one synchronous gRPC entry-point (Score) used by compliance-engine and routing-engine.


2. Aggregates

FraudSignal

A normalised, append-only record of a single piece of evidence projected from one of the upstream NATS streams (firewall.audit.v1, sms.events.status.v1, sms.dlr.inbound.v1, cdr.generated.v1, consent.revoked.v1). One signal is one row in fraud.signals and one row in the ClickHouse fraud_features.events columnar projection.

| Field | Type | Notes |
| --- | --- | --- |
| signalId | UUIDv4 | Identity (prefix fs_) |
| eventTs | Instant | Event time as published |
| sourceStream | enum SourceStream | FIREWALL_AUDIT · SMS_STATUS · SMS_DLR · CDR · CONSENT_REVOKED |
| tenantId | UUIDv4 \| null | Owning tenant (null for some MO/inbound) |
| srcMsisdn | E.164 \| null | Originating MSISDN (hashed in cold paths) |
| dstMsisdn | E.164 \| null | Destination MSISDN |
| senderId | string \| null | Sender ID for outbound |
| mnoId | string \| null | Terminating MNO code (AWCC, MTN, ROSHAN, ETISALAT, SALAAM) |
| peerAsn | int \| null | BGP ASN of inbound peer |
| verdict | string \| null | Firewall verdict if sourceStream = FIREWALL_AUDIT |
| dlrStatus | string \| null | SMPP status if sourceStream = SMS_DLR |
| templateHash | sha256 \| null | Normalised body template hash (digit-folded, NFC) |
| payloadHash | sha256 | sha256(canonicalised event JSON); used for dedup |
| attemptCount | int | Per-message retry count |
| isOtpLikely | boolean | Set by OTP-pattern matcher |
| otpDestinationClass | enum \| null | GENERIC · BANK · GOV · OPERATOR_INTERNAL |
| ingestedAt | Instant | Time entered fraud-intel-service |
| traceId | string | W3C trace context |

Invariants

  • Append-only at the database level (Postgres rule rejects UPDATE/DELETE; ClickHouse table is MergeTree with TTL).
  • A payloadHash collision within the 5-minute dedup window suppresses the duplicate ingestion.
  • A signal whose sourceStream does not match its declared shape is routed to fraud_features.events_dlq with rejectReason and the upstream NATS message is ACK'd to prevent replay storms.
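
The dedup invariant above can be illustrated with a minimal sketch. The canonical form used here (sorted keys, compact separators, UTF-8) and the in-memory `DedupWindow` class are assumptions made for illustration; the source only states that payloadHash is a SHA-256 over the canonicalised event JSON and that the window is 5 minutes.

```python
import hashlib
import json
from typing import Dict


def payload_hash(event: dict) -> str:
    """SHA-256 over an assumed canonical JSON form (sorted keys, compact separators)."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupWindow:
    """Hypothetical in-memory dedup store: remembers first-seen time per hash."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, digest: str, now: float) -> bool:
        # Drop hashes first seen at or before the start of the window.
        cutoff = now - self.window_seconds
        self._seen = {h: ts for h, ts in self._seen.items() if ts > cutoff}
        if digest in self._seen:
            return True
        self._seen[digest] = now
        return False
```

Because keys are sorted before hashing, two events that differ only in key order produce the same hash and the second is suppressed while the first is still inside the window.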

FraudDetection

A high-confidence finding that some FraudCategory is occurring with confidence ≥ 0.85, produced by a model run or rule-based pattern matcher. Each detection is persisted, emitted as a NATS event, and tracked through its enforcement lifecycle.

| Field | Type | Notes |
| --- | --- | --- |
| detectionId | UUIDv4 | Identity (prefix fd_) |
| category | FraudCategory VO | AIT / SIMBOX / OTP_HARVEST / OTP_GRINDING / GREY_ROUTE / SENDER_ID_ABUSE / DLR_UNIFORMITY / AIT_RING |
| subjectScope | enum | TENANT · SENDER_ID · MSISDN · MSISDN_BLOCK · PEER_ASN · MSISDN_COHORT |
| subjectId | string | Identifier within scope |
| score | FraudScore VO | 0.0 .. 1.0 |
| confidenceTier | enum | LOW (<0.6) · MEDIUM (0.6-0.85) · HIGH (≥0.85) |
| evidence | JSONB | Window bounds, feature vector ref, contributing event IDs, SHAP top-3 |
| aiProvenance | AiProvenance VO | { modelId, modelVersion, trainingSetHash, featureSetHash, runtimeMs } |
| windowStart, windowEnd | Instant | Detection window |
| sourceModelId | UUIDv4 | FK → MlModel |
| sourcePipeline | enum | XGBOOST_AIT · GRAPHSAGE_COHORT · IFOREST_OUTLIER · RULE_PATTERN · STREAMING_BURST |
| enforcementStatus | enum | EMITTED · CONSUMED · ENFORCED · SUPPRESSED · EXPIRED |
| suppressionReason | string \| null | If subject is on platform allowlist |
| createdAt, expiresAt | Instant | Detection valid for expiresAt - createdAt |

Invariants

  • confidence ≥ 0.85 is required for an aggregate to exist as a FraudDetection. Lower-confidence findings live as FraudCase (see below) or as raw model predictions in ClickHouse.
  • Every FraudDetection carries aiProvenance so a regulator dispute can be defended against the exact model artifact that produced it.
  • enforcementStatus is a one-way state machine: EMITTED → CONSUMED → ENFORCED (or SUPPRESSED / EXPIRED). No back-transitions.
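
The confidence threshold and the one-way lifecycle above can be sketched as follows. The source only rules out back-transitions; exactly which states may move to SUPPRESSED or EXPIRED is not specified, so the transition table here is an assumption, as are the function names.

```python
from enum import Enum


class EnforcementStatus(Enum):
    EMITTED = "EMITTED"
    CONSUMED = "CONSUMED"
    ENFORCED = "ENFORCED"
    SUPPRESSED = "SUPPRESSED"
    EXPIRED = "EXPIRED"


S = EnforcementStatus

# Assumed forward-only transition table; terminal states map to an empty set.
_ALLOWED = {
    S.EMITTED: {S.CONSUMED, S.SUPPRESSED, S.EXPIRED},
    S.CONSUMED: {S.ENFORCED, S.SUPPRESSED, S.EXPIRED},
    S.ENFORCED: set(),
    S.SUPPRESSED: set(),
    S.EXPIRED: set(),
}


def confidence_tier(score: float) -> str:
    """Map a score to LOW (<0.6), MEDIUM (0.6 to <0.85) or HIGH (>=0.85)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    if score >= 0.85:
        return "HIGH"
    if score >= 0.6:
        return "MEDIUM"
    return "LOW"


def advance(current: EnforcementStatus, target: EnforcementStatus) -> EnforcementStatus:
    """Return the new status, or raise if the move is not a permitted forward step."""
    if target not in _ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Only findings for which `confidence_tier` returns HIGH would be persisted as a FraudDetection; MEDIUM findings correspond to the FraudCase range.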

FraudCase

A medium-confidence finding (0.6 ≤ score < 0.85) requiring human-in-the-loop adjudication by a Trust & Safety analyst. Cases are the workflow object behind the fraud-analyst REST surface.

| Field | Type | Notes |
| --- | --- | --- |
| caseId | UUIDv4 | Identity (prefix fc_) |
| category | FraudCategory VO | Same enum as detections |
| subjectScope, subjectId | (as above) | |
| score | FraudScore VO | 0.6 ≤ score < 0.85 |
| evidence | JSONB | Same shape as detection |
| aiProvenance | AiProvenance VO | |
| suggestedAction | enum | BLOCKLIST_MSISDN · QUARANTINE_MSISDN_BLOCK · SUSPEND_SENDER_ID · DEPEER_PEER_ASN · THROTTLE_TENANT · NO_ACTION |
| status | enum | PENDING_REVIEW · IN_REVIEW · CONFIRMED · DISMISSED · REFINE_FEATURES · STALE |
| assignedTo | UUIDv4 \| null | userId of the assigned analyst |
| openedAt | Instant | |
| openedBy | enum/UUID | system:auto or userId |
| decidedAt | Instant \| null | |
| decidedBy | UUIDv4 \| null | userId |
| reason | string \| null | Analyst-supplied rationale |
| actionExecuted | boolean | Whether the analyst dispatched the suggested action |

Invariants

  • case.openedBy != case.decidedBy (separation of duties), enforced by the API; the same actor may not decide their own auto-opened case via the UI or via self-assignment.
  • A case with openedAt < now() - 30d AND status IN ('PENDING_REVIEW','IN_REVIEW') auto-closes as STALE and emits fraud.case.auto_stale.v1.
  • Status transitions: PENDING_REVIEW → IN_REVIEW → {CONFIRMED, DISMISSED, REFINE_FEATURES} or → STALE from PENDING/IN_REVIEW.
  • A case marked CONFIRMED with actionExecuted = true MUST have produced a corresponding fraud.case.action_dispatched.v1 event with the issued NATS subject and downstream service ack.
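
The case invariants above can be expressed as small guard functions. This is a sketch under the stated rules only; the function names and the string representation of statuses and actors are illustrative, not taken from the service.

```python
from datetime import datetime, timedelta, timezone

OPEN_STATUSES = {"PENDING_REVIEW", "IN_REVIEW"}

# Transitions as listed in the invariants; decided and STALE cases are terminal.
TRANSITIONS = {
    "PENDING_REVIEW": {"IN_REVIEW", "STALE"},
    "IN_REVIEW": {"CONFIRMED", "DISMISSED", "REFINE_FEATURES", "STALE"},
}

STALE_AFTER = timedelta(days=30)


def can_transition(current: str, target: str) -> bool:
    """True if the status change is one of the listed transitions."""
    return target in TRANSITIONS.get(current, set())


def is_stale(status: str, opened_at: datetime, now: datetime) -> bool:
    """True if an undecided case was opened more than 30 days before `now`."""
    return status in OPEN_STATUSES and opened_at < now - STALE_AFTER


def may_decide(opened_by: str, actor: str) -> bool:
    """Separation of duties: the opener may not decide the case."""
    return opened_by != actor
```

A scheduled job could apply `is_stale` to open cases and, for each hit, move the case to STALE and emit fraud.case.auto_stale.v1.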

FraudFeed

The cross-MNO and regulator threat-intel exchange surface. One FraudFeed is one logical feed source (e.g. regulator-atra, peer-mnt-roshan, peer-awcc, internal-export). Indicators within a feed are versioned and decay.

| Field | Type | Notes |
| --- | --- | --- |
| feedId | UUIDv4 | Identity (prefix ff_) |
| feedName | string | e.g. regulator-atra |
| direction | enum | IMPORT · EXPORT |
| format | enum | MISP_2_4 · STIX_2_1 |
| transport | enum | HTTP_PUSH · SFTP_MIRROR · S3_BUCKET |
| signaturePolicy | enum | HSM_REQUIRED · HSM_OPTIONAL · NONE_INTERNAL_ONLY |
| publicKeyRef | string \| null | Vault path to the peer's signing public key (import only) |
| decayProfile | DecayProfile VO | { halfLifeDays, floorWeight } |
| lastSyncAt | Instant \| null | Last successful import/export |
| nextRunAt | Instant | Cron next-run |
| isActive | boolean | |

FeedIndicator is the contained entity:

| Field | Type | Notes |
| --- | --- | --- |
| indicatorId | UUIDv4 | Identity |
| feedId | UUIDv4 | FK |
| sourceUuid | string | Upstream MISP/STIX UUID (idempotency) |
| type | enum IndicatorType | MSISDN · SENDER_ID · ASN · MSISDN_BLOCK · TEMPLATE_HASH · URL · IP_CIDR |
| value | text | Indicator value |
| confidence | float (0..1) | As reported by source |
| decayFactor | float (0..1) | Computed as 0.5 ^ ((now - importedAt) / halfLifeDays) |
| tags | text[] | Free-form (e.g. simbox, aws-aggregator) |
| importedAt | Instant | |
| expiredAt | Instant \| null | |

Invariants

  • (feedId, sourceUuid) is unique — re-import is idempotent.
  • A feed with signaturePolicy = HSM_REQUIRED MUST have publicKeyRef populated; import is rejected otherwise.
  • An indicator with decayFactor < decayProfile.floorWeight is excluded from feature-store joins (effectively expired).
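
The decay rule can be worked through in a few lines. This sketch follows the formula given for decayFactor, 0.5 ^ ((now - importedAt) / halfLifeDays), and the floor-weight exclusion; the function names are illustrative, and measuring age in fractional days is an assumption.

```python
from datetime import datetime, timedelta, timezone


def decay_factor(imported_at: datetime, now: datetime, half_life_days: float) -> float:
    """Exponential half-life decay: 1.0 at import, 0.5 after one half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    age_days = max((now - imported_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def is_joinable(factor: float, floor_weight: float) -> bool:
    """Indicators below the floor weight are excluded from feature-store joins."""
    return factor >= floor_weight
```

With the default profile { 30, 0.05 }, an indicator keeps half its weight after 30 days and a quarter after 60. It drops below the 0.05 floor once its age exceeds 30 × log2(20) ≈ 129.7 days.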

FraudPattern

A rule-based (non-ML) pattern definition for cases where ML would be over-engineering or where a regulator mandates a deterministic rule (e.g. an explicit MSISDN block). Patterns produce FraudDetection rows the same way ML pipelines do, but with sourcePipeline = RULE_PATTERN and aiProvenance.modelId = 'rule:<patternId>'.

| Field | Type | Notes |
| --- | --- | --- |
| patternId | UUIDv4 | Identity (prefix fp_) |
| name | string | Human-readable |
| category | FraudCategory VO | |
| predicate | JSONB | Discriminated per category (e.g. { kind: 'MSISDN_BLOCK_LIST', value: '+9379000XXXX' }) |
| confidence | float | Static confidence assigned to all matches |
| isActive | boolean | |
| version | int | Bumped on edit |
| createdBy, updatedBy | UUIDv4 | |

MlModel and ModelVersion

The model registry. MlModel is the logical model (one per (category, pipeline)); ModelVersion is the immutable artifact-pointer with provenance.

MlModel:

| Field | Type | Notes |
| --- | --- | --- |
| modelId | UUIDv4 | Identity (prefix ml_) |
| category | FraudCategory VO | AIT · SIMBOX · OTP_HARVEST · GREY_ROUTE |
| pipeline | enum | XGBOOST · GRAPHSAGE · IFOREST · TRANSFORMER_TEXT |
| description | string | |
| activeVersionId | UUIDv4 \| null | FK → ModelVersion; exactly one active per (category, pipeline) |
| shadowVersionId | UUIDv4 \| null | FK → ModelVersion; at most one shadow at a time |
| createdAt, updatedAt | Instant | |

ModelVersion:

| Field | Type | Notes |
| --- | --- | --- |
| versionId | UUIDv4 | Identity (prefix mv_) |
| modelId | UUIDv4 | FK |
| version | semver | e.g. 2.1.4 |
| artifactUri | string | MinIO s3://fraud-models/<modelId>/<version>.tar.gz |
| artifactSha256 | sha256 | Integrity check at load time |
| cosignSignature | string \| null | Optional Sigstore/cosign chain |
| trainingSetHash | sha256 | Provenance: exact training rows, computed pre-fit |
| featureSetHash | sha256 | Feature contract hash (column names + encoders) |
| evaluationMetrics | JSONB | { auc, f1, precision, recall, fprAtThreshold, calibration: {...} } |
| modelCardUri | string | MinIO YAML model card |
| status | enum | REGISTERED · SHADOW · ACTIVE · RETIRED · REJECTED |
| registeredAt | Instant | |
| promotedAt | Instant \| null | |
| retiredAt | Instant \| null | |
| registeredBy, promotedBy | UUIDv4 | |

Invariants

  • For each (MlModel.category, MlModel.pipeline) exactly one ModelVersion.status = ACTIVE. Enforced by partial unique index.
  • A ModelVersion cannot be deleted; RETIRED is the only end state.
  • Promotion requires status = SHADOW for ≥ 24 h with evaluationMetrics.auc > activeVersion.evaluationMetrics.auc AND evaluationMetrics.calibration.brier ≤ activeVersion.evaluationMetrics.calibration.brier × 1.05. Otherwise the promotion request fails with 412 SHADOW_EVAL_INSUFFICIENT.
  • Artifact load verifies sha256(downloaded) == artifactSha256; on mismatch, the service refuses to activate the version and emits fraud.model.artifact.tamper.v1 to SOC.
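
The promotion gate and the artifact integrity check can be sketched as two pure functions. The parameter names are illustrative, and comparing lower-cased hex digests is an assumption about how artifactSha256 is stored.

```python
import hashlib
from datetime import datetime, timedelta, timezone

MIN_SHADOW_PERIOD = timedelta(hours=24)
BRIER_TOLERANCE = 1.05  # candidate Brier score may be at most 5% worse


def artifact_matches(data: bytes, registered_sha256: str) -> bool:
    """Compare the SHA-256 of the downloaded bytes with the registered hex digest."""
    return hashlib.sha256(data).hexdigest() == registered_sha256.lower()


def promotion_allowed(
    shadow_since: datetime,
    now: datetime,
    candidate_auc: float,
    candidate_brier: float,
    active_auc: float,
    active_brier: float,
) -> bool:
    """All three conditions must hold; otherwise the caller would answer 412."""
    return (
        now - shadow_since >= MIN_SHADOW_PERIOD
        and candidate_auc > active_auc
        and candidate_brier <= active_brier * BRIER_TOLERANCE
    )
```

Keeping the gate free of I/O makes the rule easy to unit-test separately from the registry and the HTTP layer.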

FeedbackDecision

The HITL signal — a Trust & Safety analyst's decision on a FraudCase. Forms the labelled training corpus for the next retrain cycle.

| Field | Type | Notes |
| --- | --- | --- |
| decisionId | UUIDv4 | Identity |
| caseId | UUIDv4 | FK |
| decision | enum | CONFIRM_FRAUD · DISMISS · REFINE_FEATURES |
| reason | string | Analyst-supplied (mandatory ≥ 20 chars) |
| decidedAt | Instant | |
| decidedBy | UUIDv4 | |
| featureCorrections | JSONB \| null | If REFINE_FEATURES: corrected feature values for re-engineering |

3. Value Objects

| VO | Shape | Invariants |
| --- | --- | --- |
| FraudCategory | enum: AIT · AIT_RING · SIMBOX · SIMBOX_NETWORK · OTP_HARVEST · OTP_GRINDING · GREY_ROUTE · SENDER_ID_ABUSE · DLR_UNIFORMITY · PHISHING · SPAM | Aligned with GSMA FF.21 fraud taxonomy |
| FraudScore | float in [0.0, 1.0] | Calibrated via Platt scaling per model; comparable across categories |
| FraudTier | SAFE (≥0.85 in safe direction) · WATCH · RISKY · HIGH_RISK · PROBATION (no data) | Used by Score gRPC consumers |
| Confidence | float in [0.0, 1.0] | Model-reported posterior probability |
| AiProvenance | { modelId, modelVersion, trainingSetHash, featureSetHash, shapTop3, runtimeMs } | All fields mandatory for ML-derived detections |
| DecayProfile | { halfLifeDays: int, floorWeight: float } | Default { 30, 0.05 } |
| EnforcementSubject | (scope, id) tuple | scope ∈ {TENANT, SENDER_ID, MSISDN, MSISDN_BLOCK, PEER_ASN, MSISDN_COHORT} |
| WindowSpec | { size: Duration, slide: Duration } | e.g. {5m, 5m} for AIT, {30m, 5m} for SIM-box |
| IndicatorType | enum (see FeedIndicator) | Maps directly to MISP attribute types and STIX 2.1 indicator pattern objects |
| MsisdnHash | sha256(msisdn + tenantSalt) | Used in events to avoid leaking subscriber MSISDNs cross-context |
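
As an illustration of the MsisdnHash value object, the sketch below computes sha256(msisdn + tenantSalt) exactly as written in the table. UTF-8 encoding and a lowercase hex digest are assumptions; the source does not state the encoding or the output format.

```python
import hashlib


def msisdn_hash(msisdn: str, tenant_salt: str) -> str:
    """sha256(msisdn + tenantSalt), returned as hex (encoding and format assumed)."""
    return hashlib.sha256((msisdn + tenant_salt).encode("utf-8")).hexdigest()
```

Because the salt is per tenant, the same subscriber number yields different hashes under different tenants.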

4. Domain Events (produced)

Detailed schemas in EVENT_SCHEMAS.md.

| Event | Trigger |
| --- | --- |
| fraud.detected.ait.v1 | XGBoost AIT pipeline emits a FraudDetection (score ≥ 0.85) |
| fraud.detected.ait_ring.v1 | Cohort job detects cross-tenant ring (≥ 5 tenants, ≥ 100 MSISDNs) |
| fraud.detected.simbox.v1 | SIM-box pattern detector emits detection |
| fraud.detected.simbox_network.v1 | 24h cluster of SIM-box detections share an ASN |
| fraud.detected.otp_harvesting.v1 | OTP-harvest cohort + revocation join breaches threshold |
| fraud.detected.otp_grinding.v1 | Per-MSISDN burst (>10 OTPs in ≤60s) |
| fraud.detected.greyroute.v1 | Grey-route arbitrage detector (24h window) |
| fraud.detected.dlr_anomaly.v1 | DLR success-rate >3σ from 7d baseline (feature event) |
| fraud.detected.dlr_uniformity.v1 | OTP DLR distribution stddev<100ms (artificial uniformity) |
| fraud.case.opened.v1 | Medium-confidence finding opens HITL case |
| fraud.case.decided.v1 | Analyst decides a case |
| fraud.case.auto_stale.v1 | Case >30d without decision |
| fraud.tenant_score.updated.v1 | Tenant tier crossed boundary in hourly recompute |
| fraud.model.promoted.v1 | New ModelVersion.status = ACTIVE |
| fraud.model.rolled_back.v1 | Active reverted to previous version |
| fraud.model.artifact.tamper.v1 | Artifact SHA-256 mismatch on load |
| fraud.feed.exported.v1 | Daily MISP export uploaded to MinIO and SFTP-mirrored |
| fraud.feed.heartbeat.v1 | Zero-diff export day; absence-of-evidence signal |
| fraud.feed.imported.v1 | MISP feed imported successfully |
| fraud.alert.feed.signature.invalid.v1 | Import rejected due to signature failure (PagerDuty) |
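
To make one of the triggers concrete, the sketch below shows a per-MSISDN sliding-window counter for the fraud.detected.otp_grinding.v1 condition (more than 10 OTPs within 60 seconds). The class is hypothetical and assumes timestamps arrive in non-decreasing order per key; the actual streaming pipeline is not described in this document.

```python
from collections import deque
from typing import Deque, Dict


class OtpBurstDetector:
    """Hypothetical sliding-window counter keyed by (hashed) MSISDN."""

    def __init__(self, threshold: int = 10, window_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = {}

    def observe(self, key: str, ts: float) -> bool:
        """Record one OTP send; return True when the burst condition is met."""
        q = self._events.setdefault(key, deque())
        q.append(ts)
        # Keep only sends that fall within the last `window_seconds`.
        while ts - q[0] > self.window_seconds:
            q.popleft()
        return len(q) > self.threshold
```

Eleven OTPs to the same key at one-second intervals trip the detector on the eleventh send, while one OTP every ten seconds never does.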

5. Tenant Fraud Scoring Formula

The synchronous Score(scope=TENANT, id=tenantId) returns a FraudScore derived from the last 30 days of detections and signals for the tenant. The score is recomputed hourly; on-demand recompute is available via REST.

ait_component = clip01(0.40 × max_score(detections WHERE category='AIT' AND age <= 30d))
ring_component = clip01(0.20 × max_score(detections WHERE category='AIT_RING' AND age <= 30d))
otp_component = clip01(0.20 × max_score(detections WHERE category IN ('OTP_HARVEST','OTP_GRINDING') AND age <= 30d))
greyroute_component = clip01(0.10 × max_score(detections WHERE category='GREY_ROUTE' AND age <= 30d))
imported_component = clip01(0.10 × max_indicator_match_score(tenantId, age <= 30d))

raw_score = ait_component + ring_component + otp_component + greyroute_component + imported_component
decay_score = raw_score × exp(-days_since_last_detection / 30)
fraud_score = clip01(decay_score)

tier:
fraud_score < 0.20 → SAFE
0.20 <= fraud_score < 0.50 → WATCH
0.50 <= fraud_score < 0.80 → RISKY
fraud_score >= 0.80 → HIGH_RISK
no signals in last 30d → PROBATION

The tier boundaries are deliberately offset from compliance-engine's risk tiers — compliance-engine consumes fraud.tenant_score.updated.v1 and maps the fraud tier into its contentScore/tenureScore inputs, rather than using fraud tier as a direct verdict.
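
The formula above translates directly into code. In this sketch the per-component maxima are passed in as a mapping, with illustrative key names, instead of being queried from the detection store. A `has_signals` flag stands in for the "no signals in last 30d" condition.

```python
import math
from typing import Mapping

# Component weights from the formula; the key names are illustrative.
WEIGHTS = {"ait": 0.40, "ring": 0.20, "otp": 0.20, "greyroute": 0.10, "imported": 0.10}


def clip01(x: float) -> float:
    return max(0.0, min(1.0, x))


def tenant_fraud_score(max_scores: Mapping[str, float], days_since_last_detection: float) -> float:
    """Weighted sum of per-component maxima, decayed with a 30-day time constant."""
    raw = sum(clip01(w * max_scores.get(name, 0.0)) for name, w in WEIGHTS.items())
    return clip01(raw * math.exp(-days_since_last_detection / 30.0))


def tier(score: float, has_signals: bool = True) -> str:
    """Map a fraud score to its tier; PROBATION when there are no recent signals."""
    if not has_signals:
        return "PROBATION"
    if score < 0.20:
        return "SAFE"
    if score < 0.50:
        return "WATCH"
    if score < 0.80:
        return "RISKY"
    return "HIGH_RISK"
```

For example, a tenant whose only finding is an AIT detection with score 0.9 made today scores 0.40 × 0.9 = 0.36 (WATCH). If that detection is 30 days old, the score decays to 0.36 × e⁻¹ ≈ 0.13 (SAFE).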


6. Global Invariants

  • Fail-soft for detection. A missed 5-minute window is acceptable; the missed window is backfilled on the next run. Detection latency budget is 15 minutes for high-confidence AIT, not seconds.
  • Fail-closed-with-default for Score gRPC. If fraud-intel-service is unreachable, the caller (compliance-engine, routing-engine) treats the result as tier = PROBATION (neutral, not malicious). This avoids a fraud-intel outage cascading into a compliance freeze.
  • Append-only signals and detections. fraud.signals, fraud.detections, fraud.audit_log, and ClickHouse partitions are immutable. Retention is enforced by partition pruning, not by DELETE.
  • Provenance on every ML-derived event. aiProvenance.modelId, modelVersion, trainingSetHash, and featureSetHash are mandatory on every detection. Regulator dispute resolution requires the exact reproducible training set.
  • Separation of duties on cases. The actor that opened a case (whether system:auto or a human operator) cannot be the one to mark it CONFIRMED. Enforced by API guard.
  • MSISDN minimisation in events. Cross-context fraud events carry msisdnHash = sha256(msisdn + tenantSalt) rather than raw MSISDNs. Raw MSISDNs are restricted to the subject's own tenant scope.
  • Allowlist always wins. A subject on fraud.allowlists is never subject to enforcement; the detection is suppressed with enforcementStatus = SUPPRESSED and an audit-log entry is written. Allowlists are versioned and reviewed quarterly.
  • No cloud LLM for PII. Per AI_INTEGRATION §2, text-classification models run on-cluster via Triton/vLLM. SMS body content is never sent off-cluster. PII anonymisation runs pre-inference even on local models as defence-in-depth.
  • Model integrity on every load. A ModelVersion artifact whose downloaded SHA-256 does not match the registered hash is refused; the inference pod refuses to enter Ready state and emits a tamper event.
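
The default-on-failure behaviour described for the Score gRPC call can be sketched from the caller's side. The wrapper below is hypothetical: it takes any zero-argument callable that returns a tier, and it assumes transport failures surface as ConnectionError or TimeoutError. A real gRPC client would raise its own error type, which would need to be caught instead.

```python
from typing import Callable

DEFAULT_TIER = "PROBATION"  # neutral default named in the invariant


def score_tier_with_default(call_score: Callable[[], str], default: str = DEFAULT_TIER) -> str:
    """Return the remote tier, or the neutral default if the service is unreachable."""
    try:
        return call_score()
    except (ConnectionError, TimeoutError):
        # Assumed failure types; map the outage to the neutral tier.
        return default
```

Only transport-level failures are mapped to the default, so an unexpected programming error in the caller still propagates instead of being silently treated as PROBATION.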