Fraud Intelligence Service — Domain Model
Version: 1.0
Status: Draft
Owner: Trust and Safety
Last Updated: 2026-04-21
Companion: SERVICE_OVERVIEW · APPLICATION_LOGIC · DATA_MODEL · EVENT_SCHEMAS · API_CONTRACTS · AI_INTEGRATION
Related ADR: ADR-0004 National-Backbone Resilience §3
Reference taxonomy: GSMA FF.21 (A2P SMS Fraud Reference) · MEF MEF-W63 (Inter-Carrier Fraud) · MITRE ATT&CK Mobile T1660 (SMS Control)
1. Bounded Context
Trust & Safety / Fraud Intelligence. The fraud-intel-service owns the inferential detection plane: cross-message, cross-tenant, cross-MNO patterns that no single rule can express. It is a peer to compliance-engine (per-message policy) and sms-firewall-service (perimeter rules). It is not the enforcement plane: every detection is published as a NATS event, and other services consume that event and act on it.
The context boundary is drawn such that:
- Inside the boundary: raw fraud signals, engineered features, fraud detections (ML and graph), fraud cases, model registry and versioned artifacts, MISP/STIX indicators imported and exported, per-entity fraud scores, HITL feedback decisions, training-set provenance.
- Outside the boundary: message storage (sms-orchestrator), DLR correlation primitives (dlr-processor), per-message verdicts (compliance-engine), perimeter rule enforcement (sms-firewall-service), sender-ID lifecycle (sender-id-registry-service), billing CDRs (cdr-mediation-service), tenant onboarding (tenant-service), regulator reporting (regulator-portal-service).
The service is asynchronous-by-design with one synchronous gRPC entry-point (Score) used by compliance-engine and routing-engine.
2. Aggregates
FraudSignal
A normalised, append-only record of a single piece of evidence projected from one of the upstream NATS streams (firewall.audit.v1, sms.events.status.v1, sms.dlr.inbound.v1, cdr.generated.v1, consent.revoked.v1). One signal is one row in fraud.signals and one row in the ClickHouse fraud_features.events columnar projection.
| Field | Type | Notes |
|---|---|---|
| signalId | UUIDv4 | Identity (prefix fs_) |
| eventTs | Instant | Event time as published |
| sourceStream | enum SourceStream | FIREWALL_AUDIT · SMS_STATUS · SMS_DLR · CDR · CONSENT_REVOKED |
| tenantId | UUIDv4 \| null | Owning tenant (null for some MO/inbound) |
| srcMsisdn | E.164 \| null | Originating MSISDN (hashed in cold paths) |
| dstMsisdn | E.164 \| null | Destination MSISDN |
| senderId | string \| null | Sender ID for outbound |
| mnoId | string \| null | Terminating MNO code (AWCC, MTN, ROSHAN, ETISALAT, SALAAM) |
| peerAsn | int \| null | BGP ASN of inbound peer |
| verdict | string \| null | Firewall verdict if sourceStream = FIREWALL_AUDIT |
| dlrStatus | string \| null | SMPP status if sourceStream = SMS_DLR |
| templateHash | sha256 \| null | Normalised body template hash (digit-folded, NFC) |
| payloadHash | sha256 | sha256(canonicalised event JSON); used for dedup |
| attemptCount | int | Per-message retry count |
| isOtpLikely | boolean | Set by OTP-pattern matcher |
| otpDestinationClass | enum \| null | GENERIC · BANK · GOV · OPERATOR_INTERNAL |
| ingestedAt | Instant | Time entered fraud-intel-service |
| traceId | string | W3C trace context |
Invariants
- Append-only at the database level (Postgres rule rejects UPDATE/DELETE; ClickHouse table is MergeTree with TTL).
- A payloadHash collision within the 5-minute dedup window suppresses duplicate ingestion.
- A signal whose sourceStream does not match its declared shape is routed to fraud_features.events_dlq with rejectReason, and the upstream NATS message is ACK'd to prevent replay storms.
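
The dedup invariant can be sketched as follows. This is a minimal illustration, not the service's implementation: the canonicalisation rule (sorted keys, compact separators), the `DedupWindow` class, and the in-memory store are all assumptions made for the example.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

DEDUP_WINDOW = timedelta(minutes=5)  # window length stated in the invariant


def payload_hash(event: dict) -> str:
    """sha256 over a canonical JSON form (sorted keys, compact separators: an assumption)."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupWindow:
    """Hypothetical in-memory stand-in for the dedup store."""

    def __init__(self) -> None:
        # Maps payload hash to the time it was last accepted.
        self._seen: dict[str, datetime] = {}

    def should_ingest(self, event: dict, now: datetime) -> bool:
        digest = payload_hash(event)
        last = self._seen.get(digest)
        if last is not None and now - last < DEDUP_WINDOW:
            return False  # same payload inside the window: suppress
        self._seen[digest] = now
        return True
```

Because keys are sorted before hashing, two events that differ only in key order produce the same hash and are treated as duplicates.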
FraudDetection
A high-confidence finding that some FraudCategory is occurring with confidence ≥ 0.85, produced by a model run or rule-based pattern matcher. Each detection is persisted, emitted as a NATS event, and tracked through its enforcement lifecycle.
| Field | Type | Notes |
|---|---|---|
| detectionId | UUIDv4 | Identity (prefix fd_) |
| category | FraudCategory VO | AIT / SIMBOX / OTP_HARVEST / OTP_GRINDING / GREY_ROUTE / SENDER_ID_ABUSE / DLR_UNIFORMITY / AIT_RING |
| subjectScope | enum | TENANT · SENDER_ID · MSISDN · MSISDN_BLOCK · PEER_ASN · MSISDN_COHORT |
| subjectId | string | Identifier within scope |
| score | FraudScore VO | 0.0 .. 1.0 |
| confidenceTier | enum | LOW (<0.6) · MEDIUM (0.6-0.85) · HIGH (≥0.85) |
| evidence | JSONB | Window bounds, feature vector ref, contributing event IDs, SHAP top-3 |
| aiProvenance | AiProvenance VO | { modelId, modelVersion, trainingSetHash, featureSetHash, runtimeMs } |
| windowStart, windowEnd | Instant | Detection window |
| sourceModelId | UUIDv4 | FK → MlModel |
| sourcePipeline | enum | XGBOOST_AIT · GRAPHSAGE_COHORT · IFOREST_OUTLIER · RULE_PATTERN · STREAMING_BURST |
| enforcementStatus | enum | EMITTED · CONSUMED · ENFORCED · SUPPRESSED · EXPIRED |
| suppressionReason | string \| null | If subject is on platform allowlist |
| createdAt, expiresAt | Instant | Detection valid for expiresAt - createdAt |
Invariants
- confidence ≥ 0.85 is required for an aggregate to exist as a FraudDetection. Lower-confidence findings live as FraudCase (see below) or as raw model predictions in ClickHouse.
- Every FraudDetection carries aiProvenance so a regulator dispute can be defended against the exact model artifact that produced it.
- enforcementStatus is a one-way state machine: EMITTED → CONSUMED → ENFORCED (or SUPPRESSED / EXPIRED). No back-transitions.
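
A minimal sketch of the two rules above: routing a finding by score, and the one-way enforcementStatus machine. The function names, the return labels of `route_finding`, and the reading that SUPPRESSED and EXPIRED are reachable from either non-terminal state are assumptions for illustration.

```python
from enum import Enum


class EnforcementStatus(Enum):
    EMITTED = "EMITTED"
    CONSUMED = "CONSUMED"
    ENFORCED = "ENFORCED"
    SUPPRESSED = "SUPPRESSED"
    EXPIRED = "EXPIRED"


S = EnforcementStatus

# Assumed reading: SUPPRESSED / EXPIRED can be entered from EMITTED or CONSUMED;
# ENFORCED, SUPPRESSED and EXPIRED are terminal.
_ALLOWED = {
    S.EMITTED: {S.CONSUMED, S.SUPPRESSED, S.EXPIRED},
    S.CONSUMED: {S.ENFORCED, S.SUPPRESSED, S.EXPIRED},
    S.ENFORCED: set(),
    S.SUPPRESSED: set(),
    S.EXPIRED: set(),
}


def transition(current: EnforcementStatus, target: EnforcementStatus) -> EnforcementStatus:
    """Return the new status, or raise if the move would be a back-transition."""
    if target not in _ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def route_finding(score: float) -> str:
    """Map a calibrated score to where the finding lives (labels are illustrative)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0.0, 1.0]")
    if score >= 0.85:
        return "FRAUD_DETECTION"
    if score >= 0.6:
        return "FRAUD_CASE"
    return "RAW_PREDICTION"
```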
FraudCase
A medium-confidence finding (0.6 ≤ score < 0.85) requiring human-in-the-loop adjudication by a Trust & Safety analyst. Cases are the workflow object behind the fraud-analyst REST surface.
| Field | Type | Notes |
|---|---|---|
| caseId | UUIDv4 | Identity (prefix fc_) |
| category | FraudCategory VO | Same enum as detections |
| subjectScope, subjectId | (as above) | |
| score | FraudScore VO | 0.6 ≤ score < 0.85 |
| evidence | JSONB | Same shape as detection |
| aiProvenance | AiProvenance VO | |
| suggestedAction | enum | BLOCKLIST_MSISDN · QUARANTINE_MSISDN_BLOCK · SUSPEND_SENDER_ID · DEPEER_PEER_ASN · THROTTLE_TENANT · NO_ACTION |
| status | enum | PENDING_REVIEW · IN_REVIEW · CONFIRMED · DISMISSED · REFINE_FEATURES · STALE |
| assignedTo | UUIDv4 \| null | userId of the assigned analyst |
| openedAt | Instant | |
| openedBy | enum/UUID | system:auto or userId |
| decidedAt | Instant \| null | |
| decidedBy | UUIDv4 \| null | userId |
| reason | string \| null | Analyst-supplied rationale |
| actionExecuted | boolean | Whether the analyst dispatched the suggested action |
Invariants
- case.openedBy != case.decidedBy (separation of duties), enforced by API; the same actor may not decide their own auto-opened case via UI or self-assignment.
- A case with openedAt < now() - 30d AND status IN ('PENDING_REVIEW','IN_REVIEW') auto-closes as STALE and emits fraud.case.auto_stale.v1.
- Status transitions: PENDING_REVIEW → IN_REVIEW → {CONFIRMED, DISMISSED, REFINE_FEATURES}, or → STALE from PENDING_REVIEW / IN_REVIEW.
- A case marked CONFIRMED with actionExecuted = true MUST have produced a corresponding fraud.case.action_dispatched.v1 event with the issued NATS subject and downstream service ack.
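
The case lifecycle rules above can be expressed as three small predicates. This is a sketch under stated assumptions: the function names are hypothetical, statuses are plain strings, and actors are compared as opaque identifiers.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)
OPEN_STATUSES = {"PENDING_REVIEW", "IN_REVIEW"}

# Allowed moves, taken from the status-transition invariant.
TRANSITIONS = {
    "PENDING_REVIEW": {"IN_REVIEW", "STALE"},
    "IN_REVIEW": {"CONFIRMED", "DISMISSED", "REFINE_FEATURES", "STALE"},
}


def can_transition(current: str, target: str) -> bool:
    """True if the move is listed; decided and STALE cases have no outgoing moves."""
    return target in TRANSITIONS.get(current, set())


def is_stale(status: str, opened_at: datetime, now: datetime) -> bool:
    """Mirrors: openedAt < now() - 30d AND status IN ('PENDING_REVIEW','IN_REVIEW')."""
    return status in OPEN_STATUSES and opened_at < now - STALE_AFTER


def may_decide(opened_by: str, decided_by: str) -> bool:
    """Separation of duties: the opener may not be the decider."""
    return opened_by != decided_by
```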
FraudFeed
The cross-MNO and regulator threat-intel exchange surface. One FraudFeed is one logical feed source (e.g. regulator-atra, peer-mnt-roshan, peer-awcc, internal-export). Indicators within a feed are versioned and decay.
| Field | Type | Notes |
|---|---|---|
| feedId | UUIDv4 | Identity (prefix ff_) |
| feedName | string | e.g. regulator-atra |
| direction | enum | IMPORT · EXPORT |
| format | enum | MISP_2_4 · STIX_2_1 |
| transport | enum | HTTP_PUSH · SFTP_MIRROR · S3_BUCKET |
| signaturePolicy | enum | HSM_REQUIRED · HSM_OPTIONAL · NONE_INTERNAL_ONLY |
| publicKeyRef | string \| null | Vault path to the peer's signing public key (import only) |
| decayProfile | DecayProfile VO | { halfLifeDays, floorWeight } |
| lastSyncAt | Instant \| null | Last successful import/export |
| nextRunAt | Instant | Cron next-run |
| isActive | boolean | |
FeedIndicator is the contained entity:
| Field | Type | Notes |
|---|---|---|
| indicatorId | UUIDv4 | Identity |
| feedId | UUIDv4 | FK |
| sourceUuid | string | Upstream MISP/STIX UUID (idempotency) |
| type | enum IndicatorType | MSISDN · SENDER_ID · ASN · MSISDN_BLOCK · TEMPLATE_HASH · URL · IP_CIDR |
| value | text | Indicator value |
| confidence | float (0..1) | As reported by source |
| decayFactor | float (0..1) | Computed as 0.5 ^ ((now - importedAt) / halfLifeDays), with the age measured in days |
| tags | text[] | Free-form (e.g. simbox, aws-aggregator) |
| importedAt | Instant | |
| expiredAt | Instant \| null | |
Invariants
- (feedId, sourceUuid) is unique, so re-import is idempotent.
- A feed with signaturePolicy = HSM_REQUIRED MUST have publicKeyRef populated; import is rejected otherwise.
- An indicator with decayFactor < decayProfile.floorWeight is excluded from feature-store joins (effectively expired).
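
As a worked illustration of the decay rule, the sketch below computes decayFactor from the formula in the FeedIndicator table and applies the floorWeight cut-off. The function names are hypothetical, and the defaults are the DecayProfile defaults from section 3.

```python
from datetime import datetime, timedelta, timezone


def decay_factor(imported_at: datetime, now: datetime, half_life_days: float) -> float:
    """0.5 ^ (age_in_days / halfLifeDays)."""
    age_days = (now - imported_at).total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)


def is_joinable(
    imported_at: datetime,
    now: datetime,
    half_life_days: float = 30,
    floor_weight: float = 0.05,
) -> bool:
    """An indicator below the floor weight is excluded from feature-store joins."""
    return decay_factor(imported_at, now, half_life_days) >= floor_weight
```

With the default profile { 30, 0.05 }, an indicator falls below the floor once its age exceeds 30 × log2(20) ≈ 129.7 days, so it is still joinable at 129 days and excluded at 130.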
FraudPattern
A rule-based (non-ML) pattern definition for cases where ML would be over-engineering or where a regulator mandates a deterministic rule (e.g. an explicit MSISDN block). Patterns produce FraudDetection rows the same way ML pipelines do, but with sourcePipeline = RULE_PATTERN and aiProvenance.modelId = 'rule:<patternId>'.
| Field | Type | Notes |
|---|---|---|
| patternId | UUIDv4 | Identity (prefix fp_) |
| name | string | Human-readable |
| category | FraudCategory VO | |
| predicate | JSONB | Discriminated per category (e.g. { kind: 'MSISDN_BLOCK_LIST', value: '+9379000XXXX' }) |
| confidence | float | Static confidence assigned to all matches |
| isActive | boolean | |
| version | int | Bumped on edit |
| createdBy, updatedBy | UUIDv4 | |
MlModel and ModelVersion
The model registry. MlModel is the logical model (one per (category, pipeline)); ModelVersion is the immutable artifact-pointer with provenance.
MlModel:
| Field | Type | Notes |
|---|---|---|
| modelId | UUIDv4 | Identity (prefix ml_) |
| category | FraudCategory VO | AIT · SIMBOX · OTP_HARVEST · GREY_ROUTE |
| pipeline | enum | XGBOOST · GRAPHSAGE · IFOREST · TRANSFORMER_TEXT |
| description | string | |
| activeVersionId | UUIDv4 \| null | FK → ModelVersion; exactly one active per (category, pipeline) |
| shadowVersionId | UUIDv4 \| null | FK → ModelVersion; at most one shadow at a time |
| createdAt, updatedAt | Instant | |
ModelVersion:
| Field | Type | Notes |
|---|---|---|
| versionId | UUIDv4 | Identity (prefix mv_) |
| modelId | UUIDv4 | FK |
| version | semver | e.g. 2.1.4 |
| artifactUri | string | MinIO s3://fraud-models/<modelId>/<version>.tar.gz |
| artifactSha256 | sha256 | Integrity check at load time |
| cosignSignature | string \| null | Optional Sigstore/cosign chain |
| trainingSetHash | sha256 | Provenance: exact training rows, computed pre-fit |
| featureSetHash | sha256 | Feature contract hash (column names + encoders) |
| evaluationMetrics | JSONB | { auc, f1, precision, recall, fprAtThreshold, calibration: {...} } |
| modelCardUri | string | MinIO YAML model card |
| status | enum | REGISTERED · SHADOW · ACTIVE · RETIRED · REJECTED |
| registeredAt | Instant | |
| promotedAt | Instant \| null | |
| retiredAt | Instant \| null | |
| registeredBy, promotedBy | UUIDv4 | |
Invariants
- For each (MlModel.category, MlModel.pipeline) exactly one ModelVersion.status = ACTIVE. Enforced by partial unique index.
- A ModelVersion cannot be deleted; RETIRED is the only end state.
- Promotion requires status = SHADOW for ≥ 24 h with evaluationMetrics.auc > activeVersion.evaluationMetrics.auc AND evaluationMetrics.calibration.brier ≤ activeVersion.evaluationMetrics.calibration.brier × 1.05. Otherwise 412 SHADOW_EVAL_INSUFFICIENT.
- Artifact load verifies sha256(downloaded) == artifactSha256; mismatch → service refuses to make the version active and emits fraud.model.artifact.tamper.v1 to SOC.
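
The promotion gate and the artifact integrity check might look like the sketch below. The function names and the dictionary shape for evaluationMetrics are assumptions for illustration; only the thresholds (24 h, strictly better AUC, Brier within 5%) come from the invariants.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

MIN_SHADOW_DURATION = timedelta(hours=24)
BRIER_TOLERANCE = 1.05  # candidate Brier score may be at most 5% worse


def may_promote(
    status: str,
    shadow_since: datetime,
    now: datetime,
    candidate: dict,
    active: dict,
) -> bool:
    """True only if every promotion condition holds; callers would map False to 412."""
    if status != "SHADOW" or now - shadow_since < MIN_SHADOW_DURATION:
        return False
    better_auc = candidate["auc"] > active["auc"]
    brier_ok = (
        candidate["calibration"]["brier"]
        <= active["calibration"]["brier"] * BRIER_TOLERANCE
    )
    return better_auc and brier_ok


def artifact_matches(blob: bytes, registered_sha256: str) -> bool:
    """Compare the downloaded artifact's SHA-256 with the registered hex digest."""
    actual = hashlib.sha256(blob).hexdigest()
    return hmac.compare_digest(actual, registered_sha256.lower())
```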
FeedbackDecision
The HITL signal — a Trust & Safety analyst's decision on a FraudCase. Forms the labelled training corpus for the next retrain cycle.
| Field | Type | Notes |
|---|---|---|
| decisionId | UUIDv4 | Identity |
| caseId | UUIDv4 | FK |
| decision | enum | CONFIRM_FRAUD · DISMISS · REFINE_FEATURES |
| reason | string | Analyst-supplied (mandatory ≥ 20 chars) |
| decidedAt | Instant | |
| decidedBy | UUIDv4 | |
| featureCorrections | JSONB \| null | If REFINE_FEATURES: corrected feature values for re-engineering |
3. Value Objects
| VO | Shape | Invariants |
|---|---|---|
| FraudCategory | enum: AIT · AIT_RING · SIMBOX · SIMBOX_NETWORK · OTP_HARVEST · OTP_GRINDING · GREY_ROUTE · SENDER_ID_ABUSE · DLR_UNIFORMITY · PHISHING · SPAM | Aligned with GSMA FF.21 fraud taxonomy |
| FraudScore | float in [0.0, 1.0] | Calibrated via Platt scaling per model; comparable across categories |
| FraudTier | SAFE (≥0.85 in safe direction) · WATCH · RISKY · HIGH_RISK · PROBATION (no data) | Used by Score gRPC consumers |
| Confidence | float in [0.0, 1.0] | Model-reported posterior probability |
| AiProvenance | { modelId, modelVersion, trainingSetHash, featureSetHash, shapTop3, runtimeMs } | All fields mandatory for ML-derived detections |
| DecayProfile | { halfLifeDays: int, floorWeight: float } | Default { 30, 0.05 } |
| EnforcementSubject | (scope, id) tuple | scope ∈ {TENANT, SENDER_ID, MSISDN, MSISDN_BLOCK, PEER_ASN, MSISDN_COHORT} |
| WindowSpec | { size: Duration, slide: Duration } | e.g. {5m, 5m} for AIT, {30m, 5m} for SIM-box |
| IndicatorType | enum (see FeedIndicator) | Maps directly to MISP attribute types and STIX 2.1 indicator pattern objects |
| MsisdnHash | sha256(msisdn + tenantSalt) | Used in events to avoid leaking subscriber MSISDNs cross-context |
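
The MsisdnHash value object is a salted digest. A minimal sketch, assuming plain string concatenation, UTF-8 encoding, and lowercase hex output (none of which the table specifies):

```python
import hashlib


def msisdn_hash(msisdn: str, tenant_salt: str) -> str:
    """sha256(msisdn + tenantSalt), returned as a 64-character hex string."""
    return hashlib.sha256((msisdn + tenant_salt).encode("utf-8")).hexdigest()
```

The digest is deterministic for a given (msisdn, tenantSalt) pair, and the same MSISDN hashed under two different tenant salts yields two different digests.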
4. Domain Events (produced)
Detailed schemas in EVENT_SCHEMAS.md.
| Event | Trigger |
|---|---|
| fraud.detected.ait.v1 | XGBoost AIT pipeline emits a FraudDetection (score ≥ 0.85) |
| fraud.detected.ait_ring.v1 | Cohort job detects cross-tenant ring (≥ 5 tenants, ≥ 100 MSISDNs) |
| fraud.detected.simbox.v1 | SIM-box pattern detector emits detection |
| fraud.detected.simbox_network.v1 | 24h cluster of SIM-box detections share an ASN |
| fraud.detected.otp_harvesting.v1 | OTP-harvest cohort + revocation join breaches threshold |
| fraud.detected.otp_grinding.v1 | Per-MSISDN burst (>10 OTPs in ≤60s) |
| fraud.detected.greyroute.v1 | Grey-route arbitrage detector (24h window) |
| fraud.detected.dlr_anomaly.v1 | DLR success-rate >3σ from 7d baseline (feature event) |
| fraud.detected.dlr_uniformity.v1 | OTP DLR distribution stddev<100ms (artificial uniformity) |
| fraud.case.opened.v1 | Medium-confidence finding opens HITL case |
| fraud.case.decided.v1 | Analyst decides a case |
| fraud.case.auto_stale.v1 | Case >30d without decision |
| fraud.tenant_score.updated.v1 | Tenant tier crossed boundary in hourly recompute |
| fraud.model.promoted.v1 | New ModelVersion.status = ACTIVE |
| fraud.model.rolled_back.v1 | Active reverted to previous version |
| fraud.model.artifact.tamper.v1 | Artifact SHA-256 mismatch on load |
| fraud.feed.exported.v1 | Daily MISP export uploaded to MinIO and SFTP-mirrored |
| fraud.feed.heartbeat.v1 | Zero-diff export day; absence-of-evidence signal |
| fraud.feed.imported.v1 | MISP feed imported successfully |
| fraud.alert.feed.signature.invalid.v1 | Import rejected due to signature failure (PagerDuty) |
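
To make one of the triggers concrete, the OTP grinding condition (>10 OTPs in ≤60s per MSISDN) can be sketched as a sliding-window counter. The `OtpBurstDetector` class is hypothetical and assumes timestamps (in seconds) arrive in non-decreasing order per key; a real streaming job would also need to handle late and out-of-order events.

```python
from collections import defaultdict, deque

BURST_THRESHOLD = 10     # more than 10 OTPs ...
WINDOW_SECONDS = 60.0    # ... within a 60-second span


class OtpBurstDetector:
    """Keeps the recent OTP timestamps per hashed MSISDN."""

    def __init__(self) -> None:
        self._events: dict[str, deque] = defaultdict(deque)

    def observe(self, msisdn_hash: str, ts: float) -> bool:
        """Record one OTP event; return True if the burst condition now holds."""
        window = self._events[msisdn_hash]
        window.append(ts)
        # Drop events that are more than 60 s older than the newest one.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) > BURST_THRESHOLD
```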
5. Tenant Fraud Scoring Formula
The synchronous Score(scope=TENANT, id=tenantId) returns a FraudScore derived from the last 30 days of detections and signals for the tenant. The score is recomputed hourly, and an on-demand recompute is available via REST.
ait_component = clip01(0.40 × max_score(detections WHERE category='AIT' AND age <= 30d))
ring_component = clip01(0.20 × max_score(detections WHERE category='AIT_RING' AND age <= 30d))
otp_component = clip01(0.20 × max_score(detections WHERE category IN ('OTP_HARVEST','OTP_GRINDING') AND age <= 30d))
greyroute_component = clip01(0.10 × max_score(detections WHERE category='GREY_ROUTE' AND age <= 30d))
imported_component = clip01(0.10 × max_indicator_match_score(tenantId, age <= 30d))
raw_score = ait_component + ring_component + otp_component + greyroute_component + imported_component
decay_score = raw_score × exp(-days_since_last_detection / 30)
fraud_score = clip01(decay_score)
tier:
fraud_score < 0.20 → SAFE
0.20 <= fraud_score < 0.50 → WATCH
0.50 <= fraud_score < 0.80 → RISKY
fraud_score >= 0.80 → HIGH_RISK
no signals in last 30d → PROBATION
The tier boundaries are deliberately offset from compliance-engine's risk tiers — compliance-engine consumes fraud.tenant_score.updated.v1 and maps the fraud tier into its contentScore/tenureScore inputs, rather than using fraud tier as a direct verdict.
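
The formula in this section translates directly into code. The sketch below assumes the caller has already computed the per-component maxima over the 30-day window; the dictionary keys, the function names, and the `has_recent_signals` flag are illustrative choices, not part of the service contract.

```python
import math

# Component weights from the formula above.
WEIGHTS = {
    "ait": 0.40,
    "ring": 0.20,
    "otp": 0.20,
    "greyroute": 0.10,
    "imported": 0.10,
}


def clip01(x: float) -> float:
    """Clamp a value into [0.0, 1.0]."""
    return max(0.0, min(1.0, x))


def fraud_score(max_scores: dict, days_since_last_detection: float) -> float:
    """max_scores maps component name to its 30-day maximum; missing keys count as 0."""
    raw = sum(clip01(w * max_scores.get(name, 0.0)) for name, w in WEIGHTS.items())
    decayed = raw * math.exp(-days_since_last_detection / 30.0)
    return clip01(decayed)


def tier(score: float, has_recent_signals: bool = True) -> str:
    """Apply the tier boundaries; no signals in the last 30 days means PROBATION."""
    if not has_recent_signals:
        return "PROBATION"
    if score < 0.20:
        return "SAFE"
    if score < 0.50:
        return "WATCH"
    if score < 0.80:
        return "RISKY"
    return "HIGH_RISK"
```

For example, a tenant whose only finding is an AIT detection with score 0.9 today gets 0.40 × 0.9 = 0.36, which is WATCH; if that detection is 30 days old, the decay factor exp(-1) ≈ 0.368 brings the score to about 0.13, which is SAFE.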
6. Global Invariants
- Fail-soft for detection. A missed 5-minute window is acceptable; the missed window is backfilled on the next run. Detection latency budget is 15 minutes for high-confidence AIT, not seconds.
- Fail-closed-with-default for Score gRPC. If fraud-intel-service is unreachable, the caller (compliance-engine, routing-engine) treats the result as tier = PROBATION (neutral, not malicious). This avoids a fraud-intel outage cascading into a compliance freeze.
- Append-only signals and detections. fraud.signals, fraud.detections, fraud.audit_log, and ClickHouse partitions are immutable. Retention is enforced by partition pruning, not by DELETE.
- Provenance on every ML-derived event. aiProvenance.modelId, modelVersion, trainingSetHash, and featureSetHash are mandatory on every detection. Regulator dispute resolution requires the exact reproducible training set.
- Separation of duties on cases. The actor that opened a case (whether system:auto or a human operator) cannot be the one to mark it CONFIRMED. Enforced by API guard.
- MSISDN minimisation in events. Cross-context fraud events carry msisdnHash = sha256(msisdn + tenantSalt) rather than raw MSISDNs. Raw MSISDNs are restricted to the subject's own tenant scope.
- Allowlist always wins. A subject on fraud.allowlists is never subject to enforcement; the detection is suppressed with enforcementStatus = SUPPRESSED and an audit-log entry is written. Allowlists are versioned and reviewed quarterly.
- No cloud LLM for PII. Per AI_INTEGRATION §2, text-classification models run on-cluster via Triton/vLLM. SMS body content is never sent off-cluster. PII anonymisation runs pre-inference even on local models as defence-in-depth.
- Model integrity on every load. A ModelVersion artifact whose downloaded SHA-256 does not match the registered hash is refused; the inference pod refuses to enter Ready state and emits a tamper event.
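
On the caller side, the fail-closed-with-default rule for Score amounts to a guarded call. The sketch below is an assumption-heavy illustration: `call_score` is a hypothetical stand-in for the generated gRPC stub, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever transport errors the real client raises.

```python
from typing import Callable

DEFAULT_TIER = "PROBATION"  # neutral default when fraud-intel-service is unreachable


def score_with_default(
    call_score: Callable[[str, str], str],
    scope: str,
    subject_id: str,
) -> str:
    """Return the tier from the scoring call, or PROBATION if the call cannot complete."""
    try:
        return call_score(scope, subject_id)
    except (ConnectionError, TimeoutError):
        # Outage path: degrade to the neutral tier instead of blocking the caller.
        return DEFAULT_TIER
```

Catching only transport-level failures is a deliberate choice in this sketch: other exceptions, such as a malformed request, still surface to the caller and are not silently mapped to PROBATION.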