Fraud Intelligence Service — Data Model
Version: 1.0 Status: Draft Owner: Trust and Safety Last Updated: 2026-04-21 Companion: DOMAIN_MODEL · SECURITY_MODEL
The service uses two storage tiers:
- PostgreSQL (
fraudschema) — Owned exclusively by fraud-intel-service. Holds the registry, case-management workflow, model catalog, MISP indicators, allowlists, audit log, outbox. Online-transactional, low-volume relative to the feature store. - ClickHouse (
fraud_featuresschema) — Owned by fraud-intel-service, operated by SRE. Holds the columnar feature store, raw signals, model predictions, training-set snapshots. High-volume (10 M events/h target), monthly partitioned.
No other service reads or writes either schema directly — integration is via gRPC (Score), HTTP REST (admin), or NATS events.
1. PostgreSQL fraud schema (DDL)
-- =====================================================================
-- ENUM TYPES
-- =====================================================================
CREATE TYPE fraud.category AS ENUM (
'AIT','AIT_RING','SIMBOX','SIMBOX_NETWORK','OTP_HARVEST','OTP_GRINDING',
'GREY_ROUTE','SENDER_ID_ABUSE','DLR_UNIFORMITY','PHISHING','SPAM'
);
CREATE TYPE fraud.subject_scope AS ENUM (
'TENANT','SENDER_ID','MSISDN','MSISDN_BLOCK','PEER_ASN','MSISDN_COHORT'
);
CREATE TYPE fraud.confidence_tier AS ENUM ('LOW','MEDIUM','HIGH');
CREATE TYPE fraud.tier AS ENUM ('SAFE','WATCH','RISKY','HIGH_RISK','PROBATION');
CREATE TYPE fraud.case_status AS ENUM (
'PENDING_REVIEW','IN_REVIEW','CONFIRMED','DISMISSED','REFINE_FEATURES','STALE'
);
CREATE TYPE fraud.case_decision AS ENUM ('CONFIRM_FRAUD','DISMISS','REFINE_FEATURES');
CREATE TYPE fraud.suggested_action AS ENUM (
'BLOCKLIST_MSISDN','QUARANTINE_MSISDN_BLOCK','SUSPEND_SENDER_ID',
'DEPEER_PEER_ASN','THROTTLE_TENANT','BLOCKLIST_COHORT','NO_ACTION'
);
CREATE TYPE fraud.enforcement_status AS ENUM (
'EMITTED','CONSUMED','ENFORCED','SUPPRESSED','EXPIRED'
);
CREATE TYPE fraud.model_version_status AS ENUM (
'REGISTERED','SHADOW','ACTIVE','RETIRED','REJECTED'
);
CREATE TYPE fraud.feed_direction AS ENUM ('IMPORT','EXPORT');
CREATE TYPE fraud.feed_format AS ENUM ('MISP_2_4','STIX_2_1');
CREATE TYPE fraud.indicator_type AS ENUM (
'MSISDN','SENDER_ID','ASN','MSISDN_BLOCK','TEMPLATE_HASH','URL','IP_CIDR'
);
-- =====================================================================
-- SIGNALS (write-through projection of ClickHouse hot path)
-- =====================================================================
CREATE TABLE fraud.signals (
signal_id UUID NOT NULL,
event_ts TIMESTAMPTZ NOT NULL,
source_stream TEXT NOT NULL,
tenant_id UUID,
src_msisdn TEXT,
dst_msisdn TEXT,
sender_id TEXT,
mno_id TEXT,
peer_asn INT,
verdict TEXT,
dlr_status TEXT,
template_hash TEXT,
payload_hash TEXT NOT NULL,
attempt_count INT NOT NULL DEFAULT 1,
is_otp_likely BOOLEAN NOT NULL DEFAULT FALSE,
otp_destination_class TEXT,
ingested_at TIMESTAMPTZ NOT NULL DEFAULT now(),
trace_id TEXT,
PRIMARY KEY (event_ts, signal_id)
) PARTITION BY RANGE (event_ts);
-- Monthly partitions provisioned 3 months ahead by maintenance worker.
CREATE INDEX ix_sig_tenant_ts ON fraud.signals (tenant_id, event_ts DESC);
CREATE INDEX ix_sig_src_msisdn ON fraud.signals (src_msisdn) WHERE src_msisdn IS NOT NULL;
CREATE INDEX ix_sig_dst_msisdn ON fraud.signals (dst_msisdn) WHERE dst_msisdn IS NOT NULL;
CREATE INDEX ix_sig_payload_hash ON fraud.signals (payload_hash, event_ts DESC);
-- =====================================================================
-- DETECTIONS (high-confidence findings)
-- =====================================================================
CREATE TABLE fraud.detections (
detection_id UUID PRIMARY KEY,
category fraud.category NOT NULL,
subject_scope fraud.subject_scope NOT NULL,
subject_id TEXT NOT NULL,
score NUMERIC(4,3) NOT NULL CHECK (score >= 0 AND score <= 1),
confidence_tier fraud.confidence_tier NOT NULL,
evidence JSONB NOT NULL,
ai_provenance JSONB NOT NULL,
window_start TIMESTAMPTZ NOT NULL,
window_end TIMESTAMPTZ NOT NULL,
source_model_id UUID,
source_pipeline TEXT NOT NULL,
enforcement_status fraud.enforcement_status NOT NULL DEFAULT 'EMITTED',
suppression_reason TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ
);
CREATE INDEX ix_det_subject ON fraud.detections (subject_scope, subject_id, created_at DESC);
CREATE INDEX ix_det_category_ts ON fraud.detections (category, created_at DESC);
CREATE INDEX ix_det_active ON fraud.detections (enforcement_status, expires_at)
WHERE enforcement_status NOT IN ('EXPIRED','SUPPRESSED');
-- =====================================================================
-- CASES (HITL workflow)
-- =====================================================================
CREATE TABLE fraud.cases (
case_id UUID PRIMARY KEY,
category fraud.category NOT NULL,
subject_scope fraud.subject_scope NOT NULL,
subject_id TEXT NOT NULL,
score NUMERIC(4,3) NOT NULL CHECK (score >= 0.6 AND score < 0.85),
evidence JSONB NOT NULL,
ai_provenance JSONB NOT NULL,
suggested_action fraud.suggested_action NOT NULL DEFAULT 'NO_ACTION',
status fraud.case_status NOT NULL DEFAULT 'PENDING_REVIEW',
assigned_to UUID,
opened_at TIMESTAMPTZ NOT NULL DEFAULT now(),
opened_by TEXT NOT NULL, -- 'system:auto' or userId
decided_at TIMESTAMPTZ,
decided_by UUID,
decision fraud.case_decision,
reason TEXT,
action_executed BOOLEAN NOT NULL DEFAULT FALSE,
CHECK (decided_by IS NULL OR opened_by != decided_by::text)
);
CREATE INDEX ix_case_status_priority ON fraud.cases (status, opened_at)
WHERE status IN ('PENDING_REVIEW','IN_REVIEW');
CREATE INDEX ix_case_assigned ON fraud.cases (assigned_to) WHERE assigned_to IS NOT NULL;
CREATE INDEX ix_case_subject ON fraud.cases (subject_scope, subject_id);
CREATE TABLE fraud.case_decisions (
decision_id UUID PRIMARY KEY,
case_id UUID NOT NULL REFERENCES fraud.cases(case_id),
decision fraud.case_decision NOT NULL,
reason TEXT NOT NULL CHECK (length(reason) >= 20),
feature_corrections JSONB,
decided_at TIMESTAMPTZ NOT NULL DEFAULT now(),
decided_by UUID NOT NULL
);
CREATE INDEX ix_decisions_case ON fraud.case_decisions (case_id);
CREATE INDEX ix_decisions_decided ON fraud.case_decisions (decided_at);
-- =====================================================================
-- MODEL REGISTRY
-- =====================================================================
CREATE TABLE fraud.models (
model_id UUID PRIMARY KEY,
category fraud.category NOT NULL,
pipeline TEXT NOT NULL, -- XGBOOST / GRAPHSAGE / IFOREST / TRANSFORMER_TEXT
description TEXT,
active_version_id UUID,
shadow_version_id UUID,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ux_models_active_per_category
ON fraud.models (category, pipeline)
WHERE active_version_id IS NOT NULL;
CREATE TABLE fraud.model_versions (
version_id UUID PRIMARY KEY,
model_id UUID NOT NULL REFERENCES fraud.models(model_id),
version TEXT NOT NULL, -- semver
artifact_uri TEXT NOT NULL, -- s3://...
artifact_sha256 TEXT NOT NULL,
cosign_signature TEXT,
training_set_hash TEXT NOT NULL,
feature_set_hash TEXT NOT NULL,
evaluation_metrics JSONB NOT NULL, -- { auc, f1, precision, recall, brier, fairness:{...} }
model_card_uri TEXT NOT NULL,
status fraud.model_version_status NOT NULL DEFAULT 'REGISTERED',
registered_at TIMESTAMPTZ NOT NULL DEFAULT now(),
registered_by UUID NOT NULL,
promoted_at TIMESTAMPTZ,
promoted_by UUID,
retired_at TIMESTAMPTZ,
UNIQUE (model_id, version)
);
CREATE INDEX ix_mv_model_status ON fraud.model_versions (model_id, status);
CREATE TABLE fraud.model_evaluations (
eval_id UUID PRIMARY KEY,
version_id UUID NOT NULL REFERENCES fraud.model_versions(version_id),
evaluated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
test_set_hash TEXT NOT NULL,
metrics JSONB NOT NULL, -- per-fold and aggregate
evaluation_window TEXT, -- 'shadow_24h' | 'training_holdout'
decision TEXT, -- 'PASS' | 'FAIL' | 'INCONCLUSIVE'
decision_reason TEXT
);
CREATE INDEX ix_meval_version ON fraud.model_evaluations (version_id, evaluated_at DESC);
-- =====================================================================
-- MISP / STIX FEEDS
-- =====================================================================
CREATE TABLE fraud.feeds (
feed_id UUID PRIMARY KEY,
feed_name TEXT NOT NULL UNIQUE,
direction fraud.feed_direction NOT NULL,
format fraud.feed_format NOT NULL,
transport TEXT NOT NULL, -- HTTP_PUSH / SFTP_MIRROR / S3_BUCKET
signature_policy TEXT NOT NULL, -- HSM_REQUIRED / HSM_OPTIONAL / NONE_INTERNAL_ONLY
public_key_ref TEXT, -- vault://path
decay_profile JSONB NOT NULL DEFAULT '{"halfLifeDays":30,"floorWeight":0.05}',
schedule_cron TEXT,
last_sync_at TIMESTAMPTZ,
next_run_at TIMESTAMPTZ,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
created_by UUID
);
CREATE TABLE fraud.feed_indicators (
indicator_id UUID PRIMARY KEY,
feed_id UUID NOT NULL REFERENCES fraud.feeds(feed_id),
source_uuid TEXT NOT NULL,
type fraud.indicator_type NOT NULL,
value TEXT NOT NULL,
confidence NUMERIC(4,3) NOT NULL DEFAULT 1.000,
decay_factor NUMERIC(4,3) NOT NULL DEFAULT 1.000,
tags TEXT[],
imported_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expired_at TIMESTAMPTZ,
UNIQUE (feed_id, source_uuid)
);
CREATE INDEX ix_fi_value ON fraud.feed_indicators (type, value);
CREATE INDEX ix_fi_active ON fraud.feed_indicators (feed_id, imported_at DESC)
WHERE expired_at IS NULL;
CREATE TABLE fraud.feed_imports (
import_id UUID PRIMARY KEY,
feed_id UUID NOT NULL REFERENCES fraud.feeds(feed_id),
signature_valid BOOLEAN NOT NULL,
signature_key_id TEXT,
added_count INT NOT NULL DEFAULT 0,
updated_count INT NOT NULL DEFAULT 0,
expired_count INT NOT NULL DEFAULT 0,
payload_sha256 TEXT NOT NULL,
imported_at TIMESTAMPTZ NOT NULL DEFAULT now(),
imported_by TEXT
);
-- =====================================================================
-- FEEDBACK & ALLOWLIST
-- =====================================================================
CREATE TABLE fraud.allowlists (
allowlist_id UUID PRIMARY KEY,
scope fraud.subject_scope NOT NULL,
value TEXT NOT NULL,
reason TEXT NOT NULL,
added_by UUID NOT NULL,
approved_by UUID NOT NULL, -- two-person rule
added_at TIMESTAMPTZ NOT NULL DEFAULT now(),
expires_at TIMESTAMPTZ,
UNIQUE (scope, value)
);
CREATE TABLE fraud.cohort_exclusions (
exclusion_id UUID PRIMARY KEY,
cohort_hash TEXT NOT NULL UNIQUE,
reason TEXT NOT NULL,
added_by UUID NOT NULL,
added_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE fraud.template_allowlist (
template_hash TEXT PRIMARY KEY,
description TEXT NOT NULL,
added_by UUID NOT NULL,
added_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE fraud.feature_imputation_policy (
feature_name TEXT PRIMARY KEY,
strategy TEXT NOT NULL, -- ZERO / MEAN_24H / MEDIAN_7D / CONSTANT
constant_value NUMERIC,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE fraud.otp_patterns (
pattern_id UUID PRIMARY KEY,
language TEXT NOT NULL, -- ISO 639-1
regex TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
version INT NOT NULL DEFAULT 1,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- =====================================================================
-- ENTITY SCORES (Score gRPC L2)
-- =====================================================================
CREATE TABLE fraud.entity_scores (
scope fraud.subject_scope NOT NULL,
subject_id TEXT NOT NULL,
score NUMERIC(4,3) NOT NULL,
tier fraud.tier NOT NULL,
contributing_factors JSONB NOT NULL DEFAULT '[]',
model_versions JSONB NOT NULL DEFAULT '{}',
computed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (scope, subject_id)
);
CREATE INDEX ix_scores_tier ON fraud.entity_scores (tier);
CREATE INDEX ix_scores_computed ON fraud.entity_scores (computed_at);
CREATE TABLE fraud.entity_score_history (
scope fraud.subject_scope NOT NULL,
subject_id TEXT NOT NULL,
computed_at TIMESTAMPTZ NOT NULL,
score NUMERIC(4,3) NOT NULL,
tier fraud.tier NOT NULL,
contributing_factors JSONB NOT NULL,
PRIMARY KEY (computed_at, scope, subject_id)
) PARTITION BY RANGE (computed_at);
-- =====================================================================
-- AUDIT LOG (append-only)
-- =====================================================================
CREATE TYPE fraud.audit_entity AS ENUM (
'CASE','MODEL','MODEL_VERSION','FEED','ALLOWLIST','PATTERN','TENANT_SCORE','OUTBOX'
);
CREATE TYPE fraud.audit_action AS ENUM (
'CREATE','UPDATE','DELETE','DECIDE','PROMOTE','ROLLBACK','IMPORT','EXPORT','SUPPRESS','OVERRIDE'
);
CREATE TABLE fraud.audit_log (
audit_id UUID NOT NULL,
entity_type fraud.audit_entity NOT NULL,
entity_id TEXT NOT NULL,
action fraud.audit_action NOT NULL,
actor_user_id UUID, -- null = system
before JSONB,
after JSONB,
ip INET,
user_agent TEXT,
trace_id TEXT,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (occurred_at, audit_id)
) PARTITION BY RANGE (occurred_at);
CREATE INDEX ix_audit_entity ON fraud.audit_log (entity_type, entity_id, occurred_at DESC);
CREATE INDEX ix_audit_actor ON fraud.audit_log (actor_user_id, occurred_at DESC);
-- =====================================================================
-- OUTBOX
-- =====================================================================
CREATE TABLE fraud.outbox (
event_id UUID PRIMARY KEY,
subject TEXT NOT NULL,
payload JSONB NOT NULL,
published_at TIMESTAMPTZ,
attempts INT NOT NULL DEFAULT 0,
last_error TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ix_outbox_unpublished ON fraud.outbox (created_at) WHERE published_at IS NULL;
2. ClickHouse fraud_features schema (DDL)
-- Distributed cluster `fraud_cluster` (3 shards × 2 replicas)
CREATE TABLE fraud_features.events_local ON CLUSTER fraud_cluster
(
event_ts DateTime64(3, 'UTC'),
signal_id UUID,
source_stream LowCardinality(String),
tenant_id Nullable(UUID),
src_msisdn Nullable(String),
dst_msisdn Nullable(String),
sender_id Nullable(String),
mno_id LowCardinality(Nullable(String)),
peer_asn Nullable(UInt32),
verdict LowCardinality(Nullable(String)),
dlr_status LowCardinality(Nullable(String)),
template_hash Nullable(String),
payload_hash String,
segments UInt8 DEFAULT 1,
attempt_count UInt8 DEFAULT 1,
is_otp_likely UInt8 DEFAULT 0,
otp_destination_class LowCardinality(Nullable(String)),
imported_indicator_msisdn UInt8 DEFAULT 0,
imported_indicator_template UInt8 DEFAULT 0,
ingested_at DateTime64(3,'UTC') DEFAULT now64(3),
trace_id String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/fraud_events', '{replica}')
PARTITION BY toYYYYMM(event_ts)
ORDER BY (tenant_id, event_ts, signal_id)
TTL event_ts + INTERVAL 90 DAY DELETE;
CREATE TABLE fraud_features.events ON CLUSTER fraud_cluster
AS fraud_features.events_local
ENGINE = Distributed('fraud_cluster','fraud_features','events_local', cityHash64(tenant_id));
CREATE TABLE fraud_features.events_dlq
( event_ts DateTime64(3), source_stream LowCardinality(String),
raw String, reject_reason String )
ENGINE = MergeTree() ORDER BY event_ts TTL event_ts + INTERVAL 14 DAY;
-- AIT pipeline outputs
CREATE TABLE fraud_features.ait_window_features ON CLUSTER fraud_cluster
( window_start DateTime,
tenant_id UUID, dst_mno LowCardinality(String), sender_id String,
submit_count UInt64,
dlr_delivered_count UInt64, dlr_failed_count UInt64,
dlr_success_rate Float32,
unique_dst_msisdns UInt64,
mean_segments_per_msg Float32,
entropy_of_dst_prefix Float32,
unique_sender_ids UInt32,
repeated_body_ratio Float32,
peer_asn_diversity UInt16,
cohort_anomaly_score Nullable(Float32) )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/ait_window','{replica}')
PARTITION BY toYYYYMM(window_start)
ORDER BY (window_start, tenant_id, dst_mno, sender_id)
TTL window_start + INTERVAL 60 DAY;
CREATE TABLE fraud_features.ait_predictions ON CLUSTER fraud_cluster
( window_start DateTime, tenant_id UUID, dst_mno String, sender_id String,
score Float32, model_id UUID, model_version String,
shap_top3 String, predicted_at DateTime DEFAULT now() )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/ait_pred','{replica}')
PARTITION BY toYYYYMM(window_start)
ORDER BY (window_start, tenant_id);
CREATE TABLE fraud_features.ait_cohorts ON CLUSTER fraud_cluster
( cohort_hash UInt64,
dst_msisdn_count UInt32,
contributing_tenants UInt16,
contributing_sender_ids UInt16,
cohort_anomaly_score Float32,
first_seen_ts DateTime, last_seen_ts DateTime )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/ait_cohorts','{replica}')
ORDER BY cohort_hash
TTL last_seen_ts + INTERVAL 30 DAY;
CREATE TABLE fraud_features.shadow_predictions ON CLUSTER fraud_cluster
( window_start DateTime, version_id UUID,
subject_scope String, subject_id String,
score Float32, predicted_at DateTime DEFAULT now() )
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/shadow','{replica}')
PARTITION BY toYYYYMM(window_start)
ORDER BY (version_id, window_start, subject_id)
TTL window_start + INTERVAL 30 DAY;
-- DLR baseline
CREATE MATERIALIZED VIEW fraud_features.dlr_baseline_7d
ENGINE = AggregatingMergeTree()
PARTITION BY tenant_id
ORDER BY (tenant_id, dst_mno)
AS SELECT
tenant_id,
mno_id AS dst_mno,
avgState(toFloat32(dlr_status='DELIVERED')) AS mean_state,
varSampState(toFloat32(dlr_status='DELIVERED')) AS var_state
FROM fraud_features.events
WHERE event_ts >= now() - INTERVAL 7 DAY
AND source_stream = 'SMS_DLR'
GROUP BY tenant_id, dst_mno;
3. Redis key namespaces
All keys namespaced fraud:…. Tenant-scoped keys encode {tenantId} for per-tenant invalidation.
| Key | TTL | Purpose |
|---|---|---|
fraud:score:{scope}:{id} | 900 s (15 min) | L1 cache for Score gRPC |
fraud:graph:cohort:{cohortHash} | 3600 s | Cohort metadata cache |
fraud:simbox:{msisdnBlock} | 86400 s | SIM-box detection cooldown to prevent re-emission |
fraud:otp:dst:{msisdnHash}:60s | 60 s (sliding) | Sorted set of OTP timestamps for grinding detector |
fraud:throttle:dst:{msisdnHash} | 21600 s (6 h) | Throttle handle published with grinding detection |
fraud:imported:msisdn:bloom | no TTL | Bloom filter of imported MISP MSISDN indicators |
fraud:lock:ait:5m | 300 s | Distributed lock for AIT pipeline |
fraud:lock:cohort:1h | 3600 s | Distributed lock for cohort job |
fraud:lock:simbox:30m | 1800 s | Distributed lock for SIM-box pipeline |
fraud:lock:greyroute:1h | 3600 s | Distributed lock for grey-route pipeline |
fraud:lock:otp_harvest:30m | 1800 s | Distributed lock for OTP-harvest pipeline |
fraud:lock:scoring:hourly | 3600 s | Distributed lock for hourly score recompute |
fraud:model:active:{category}:{pipeline} | 60 s | Cached pointer to active ModelVersion (worker hot-reload) |
fraud:feature:imputation:policy | 300 s | Feature imputation policy mirror |
fraud:otp:patterns:active | 300 s | OTP regex set version marker |
fraud:case:queue:{category}:size | no TTL | Counter of pending cases per category |
4. Row-Level Security & append-only protection
4.1 Append-only on signals, audit_log
CREATE RULE signals_no_update AS ON UPDATE TO fraud.signals DO INSTEAD NOTHING;
CREATE RULE signals_no_delete AS ON DELETE TO fraud.signals DO INSTEAD NOTHING;
CREATE RULE audit_log_no_update AS ON UPDATE TO fraud.audit_log DO INSTEAD NOTHING;
CREATE RULE audit_log_no_delete AS ON DELETE TO fraud.audit_log DO INSTEAD NOTHING;
Retention is enforced by partition pruning, not by DELETE.
4.2 RLS on per-tenant scoring read
ALTER TABLE fraud.entity_scores ENABLE ROW LEVEL SECURITY;
CREATE POLICY scores_tenant_read ON fraud.entity_scores
FOR SELECT
USING (
scope = 'TENANT'
AND subject_id = current_setting('app.current_tenant_id', TRUE)
);
CREATE POLICY scores_admin_all ON fraud.entity_scores
USING (current_setting('app.caller_role') IN (
'tns-fraud-analyst','tns-fraud-analyst-lead',
'noc-operator','platform.compliance.admin','platform.auditor'
));
The tenant portal does not currently expose fraud scores directly (cross-tenant data sensitivity); the policy is scaffolding for future opt-in.
5. ID prefixes
| Prefix | Entity |
|---|---|
fs_ | FraudSignal |
fd_ | FraudDetection |
fc_ | FraudCase |
ff_ | FraudFeed |
fi_ | FeedIndicator |
fp_ | FraudPattern |
ml_ | MlModel |
mv_ | ModelVersion |
me_ | ModelEvaluation |
al_ | AuditLog |
aw_ | AllowlistEntry |
ot_ | OtpPattern |
tr_ | TrainingRun |
Internally UUIDv4; prefixes apply to externally exposed IDs in REST responses.
6. PII, encryption & retention
| Field | Class | Protection |
|---|---|---|
signals.src_msisdn, signals.dst_msisdn | CONFIDENTIAL | DB + disk encryption; API masking; RLS-gated |
signals.template_hash | INTERNAL | Plain |
events.dst_msisdn (ClickHouse) | CONFIDENTIAL | Disk-level encryption (LUKS); cluster network is intra-VPC only |
feed_indicators.value (when MSISDN) | CONFIDENTIAL | Same |
cases.evidence | INTERNAL (no body) | Plain (analyst-visible) |
audit_log.before/after | CONFIDENTIAL (may include MSISDN) | Same |
model_versions.training_set_hash | INTERNAL | Plain |
Encryption. No field-level envelope encryption is required for fraud data (no SMS body is ever stored; MSISDNs are pseudonymous). All data is protected by DB + disk encryption and tenant-isolated network controls.
Retention.
fraud.signals: 90 days (Postgres partition pruning) + ClickHouse 90 days hot. Cold archive to S3 with object-lock for 13 months.fraud.detections: 365 days hot, then archived.fraud.cases: 7 years (regulatory dispute window).fraud.audit_log: 13 months minimum, regulator-mandated.fraud.feed_indicators: indefinite (decay-weighted; effective expiry whendecay_factor < floor).fraud_features.events_dlq: 14 days.fraud_features.shadow_predictions: 30 days.
7. Migration strategy
- Partitions are provisioned 3 months ahead by
PartitionMaintenanceWorker(daily 03:00 UTC). Missing future partitions triggerFraudPartitionMissingHIGH alert. - ClickHouse retention enforced by
TTLclauses; verified daily byclickhouse-backupintegrity check. - Schema changes follow expand/contract; all migrations forward-only, reviewed by Security for any column that holds MSISDN.
- Cross-region ClickHouse replication uses
Replicated*MergeTreeover Keeper;kblprimary,mzrwarm read replica.
8. Backups
| Store | Cadence | Retention | RTO / RPO |
|---|---|---|---|
Postgres fraud schema | Continuous WAL archive to S3 + nightly logical dump | 35 days WAL, 90 days logical | RTO 30 min, RPO 5 min |
ClickHouse fraud_features | Daily incremental via clickhouse-backup to S3 | 30 days incremental | RTO 4 h (acceptable — async pipelines tolerate gap), RPO 24 h |
| MinIO model artifacts + feed exports | Object versioning + cross-region replication | Indefinite (5y minimum) | RTO 1 h, RPO 0 |