Fraud Intelligence Service — Security Model
Version: 1.0
Status: Draft
Owner: Trust and Safety + Security
Last Updated: 2026-04-21
Companion: API_CONTRACTS · DATA_MODEL · EVENT_SCHEMAS · docs/13-security-compliance-tenancy.md
1. Authentication
1.1 gRPC plane — Score, BulkScore, GetSignals
- mTLS required. The gRPC server accepts only connections presenting a client certificate signed by the platform CA.
- SPIFFE ID allowlist (validated at handler entry):
  - spiffe://ghasi/compliance-engine
  - spiffe://ghasi/routing-engine
  - spiffe://ghasi/sender-id-registry
  - spiffe://ghasi/noc-dashboard (read-only)
- Other certificates are rejected with PERMISSION_DENIED and an audit-log entry is written.
- Client certs are mounted from Vault via the Vault Agent Sidecar Injector, rotated every 30 days; the server hot-reloads on file change.
- Local dev bypass via GRPC_TLS_ENABLED=false is prohibited in any non-local environment (start-up guard refuses to boot when NODE_ENV != 'development').
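The allowlist check and the start-up guard described above can be sketched as two small functions. This is a minimal illustration, not the service's actual code: the helper names `isAllowedCaller` and `assertTlsConfig` are hypothetical, and the sketch assumes the SPIFFE ID has already been extracted from the verified client certificate.

```typescript
// Allowlisted SPIFFE IDs, copied from the list in this section.
const ALLOWED_SPIFFE_IDS: ReadonlySet<string> = new Set([
  "spiffe://ghasi/compliance-engine",
  "spiffe://ghasi/routing-engine",
  "spiffe://ghasi/sender-id-registry",
  "spiffe://ghasi/noc-dashboard",
]);

// Hypothetical helper: true only for an exact allowlist match.
// A caller that fails this check would receive PERMISSION_DENIED.
function isAllowedCaller(spiffeId: string | undefined): boolean {
  return spiffeId !== undefined && ALLOWED_SPIFFE_IDS.has(spiffeId);
}

// Hypothetical start-up guard: throws when TLS is disabled
// anywhere other than local development.
function assertTlsConfig(env: Record<string, string | undefined>): void {
  const tlsDisabled = env["GRPC_TLS_ENABLED"] === "false";
  if (tlsDisabled && env["NODE_ENV"] !== "development") {
    throw new Error(
      "GRPC_TLS_ENABLED=false is only permitted when NODE_ENV=development",
    );
  }
}
```

Calling `assertTlsConfig(process.env)` before the gRPC server binds its port would make a misconfigured deployment fail at boot instead of serving plaintext.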
1.2 REST plane — admin (port 3014, behind Kong)
- Kong validates the platform JWT (issued by auth-service, RS256, JWKS-backed).
- Kong forwards X-User-Id, X-Roles, X-Org-Id, X-Trace-Id headers; fraud-intel-service trusts Kong's injection, never parses JWTs directly.
- All admin endpoints require an explicit role (see §2).
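As an illustration of consuming the Kong-injected headers, the sketch below builds a caller context from them. Several details are assumptions rather than facts from this document: the `CallerContext` shape, the helper name `callerFromKongHeaders`, lower-cased header keys, and a comma-separated `X-Roles` value.

```typescript
interface CallerContext {
  userId: string;
  roles: string[];
  orgId: string | undefined;
  traceId: string | undefined;
}

type Headers = Record<string, string | undefined>;

// Hypothetical helper. Assumes lower-cased header keys and a
// comma-separated X-Roles value; neither is specified in this document.
function callerFromKongHeaders(headers: Headers): CallerContext {
  const userId = headers["x-user-id"];
  const roles = (headers["x-roles"] ?? "")
    .split(",")
    .map((role) => role.trim())
    .filter((role) => role.length > 0);

  // Admin endpoints require an explicit role, so a request with no
  // user or no roles is rejected before any handler logic runs.
  if (!userId || roles.length === 0) {
    throw new Error("missing identity headers from gateway");
  }
  return {
    userId,
    roles,
    orgId: headers["x-org-id"],
    traceId: headers["x-trace-id"],
  };
}
```

Trusting these headers is only safe because port 3014 is reachable solely through Kong; the function performs no JWT parsing, in line with the rule above.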
1.3 Internal mTLS plane — port 3015 (no Kong)
- Reserved for regulator-portal-service (CN: regulator-portal) and peer-MNO services.
- Used for MISP feed import (POST /v1/internal/fraud/feed/import) and late-arriving signal backfill.
- SPIFFE ID + body signature verification (HSM key from Vault PKI / external regulator key).
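Body signature verification on the import endpoint could look like the sketch below. The signature algorithm is not named in this document, so Ed25519 is an assumption, as are the helper name `verifyFeedBody` and the use of a detached signature over the raw body. The throwaway key pair stands in for the regulator's HSM-held key.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical helper: verifies a detached signature over the raw body.
// Ed25519 is an assumption; for Ed25519 the algorithm argument is null.
function verifyFeedBody(
  body: Buffer,
  signature: Buffer,
  publicKey: KeyObject,
): boolean {
  return verify(null, body, publicKey, signature);
}

// Usage with a locally generated key pair (illustration only).
const feedKeys = generateKeyPairSync("ed25519");
const feedBody = Buffer.from(JSON.stringify({ indicators: [] }));
const feedSignature = sign(null, feedBody, feedKeys.privateKey);

const accepted = verifyFeedBody(feedBody, feedSignature, feedKeys.publicKey);
const tampered = verifyFeedBody(
  Buffer.from(JSON.stringify({ indicators: ["injected"] })),
  feedSignature,
  feedKeys.publicKey,
);
```

Verifying against the raw bytes, before any JSON parsing, avoids accepting a body whose canonical form differs from what was signed.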
1.4 IdP-agnostic
The identity provider that authenticated the caller is irrelevant. fraud-intel-service sees only the platform JWT (via Kong) or the SPIFFE ID (via mTLS). The idp claim is captured in fraud.audit_log.before/after for forensics.
2. Authorization (RBAC)
| Role | Capabilities |
|---|---|
| tns-fraud-analyst | Read cases/detections/signals; decide PENDING cases; view evidence (no raw subscriber MSISDN visible in cross-tenant context) |
| tns-fraud-analyst-lead | All tns-fraud-analyst + assign, bulk-decide, manage allowlists (with secondary approver), trigger ad-hoc scans |
| tns-ds (data scientist) | Register/shadow/promote/rollback model versions; trigger training runs; read evaluation metrics; cannot decide cases |
| noc-operator | Read NOC dashboards, detections, signals; read tenant scores; cannot decide cases or modify rules |
| platform.compliance.admin | Secondary approver for model promotion + allowlist mutations; manages feed registrations; reads audit log |
| platform.auditor | Read-only on detections, cases, decisions, models, audit log; no PII visible |
| Tenant sms:fraud:read (future) | Read own tenant's score and tier (not currently exposed) |
Enforcement points:
- NestJS RoleGuard rejects with 403 INSUFFICIENT_SCOPE before handler entry.
- Per-handler @RequireRoles(...) decorator (declarative, contract-tested).
- Postgres RLS on fraud.entity_scores (scores_tenant_read policy).
- Separation-of-duties guard on POST /v1/fraud/cases/{caseId}/decide (opened_by != decided_by).
- Two-person rule on allowlist mutations (addedBy != approvedBy).
- Secondary-approver guard on POST /v1/admin/fraud/models/{id}/promote (tns-ds initiator + platform.compliance.admin approver).
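The three identity-comparison guards in the list above reduce to simple predicates. The sketch below is an illustration under stated assumptions: `ForbiddenError`, the `Actor` shape, and the function names are hypothetical, the 403 status for these violations is assumed, and requiring the promotion approver to be a different user from the initiator is an inference from "secondary approver" rather than a stated rule.

```typescript
// Hypothetical error type; the 403 status is an assumption for these guards.
class ForbiddenError extends Error {
  readonly status = 403;
}

interface Actor {
  userId: string;
  roles: string[];
}

// Separation of duties: the analyst who opened a case cannot decide it.
function assertSeparationOfDuties(openedBy: string, decidedBy: string): void {
  if (openedBy === decidedBy) {
    throw new ForbiddenError("opened_by must differ from decided_by");
  }
}

// Two-person rule: an allowlist mutation needs a distinct approver.
function assertTwoPersonRule(addedBy: string, approvedBy: string): void {
  if (addedBy === approvedBy) {
    throw new ForbiddenError("addedBy must differ from approvedBy");
  }
}

// Secondary approver: a tns-ds initiator plus a platform.compliance.admin
// approver. The distinct-user check is an assumption.
function assertPromotionApproval(initiator: Actor, approver: Actor): void {
  const valid =
    initiator.roles.includes("tns-ds") &&
    approver.roles.includes("platform.compliance.admin") &&
    initiator.userId !== approver.userId;
  if (!valid) {
    throw new ForbiddenError("model promotion requires a secondary approver");
  }
}
```

Keeping these as pure functions makes them straightforward to cover in the role-matrix and contract tests.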
3. Data Protection
3.1 PII inventory & classification
| Field | Classification | Storage | Transit |
|---|---|---|---|
| signals.dst_msisdn, signals.src_msisdn | CONFIDENTIAL (subscriber MSISDN) | Postgres + ClickHouse with disk encryption (LUKS); intra-VPC only | TLS 1.2+ |
| feed_indicators.value (when MSISDN/MSISDN_BLOCK) | CONFIDENTIAL | Same | TLS 1.2+ |
| cases.evidence (feature vectors, no body) | INTERNAL | Plain | TLS 1.2+ |
| events (ClickHouse) dst_msisdn | CONFIDENTIAL | Disk-encrypted MergeTree | mTLS |
| event_msisdnHash in NATS events | INTERNAL | n/a | mTLS |
| audit_log.before/after | CONFIDENTIAL (may contain MSISDN) | Plain (Postgres only) | TLS 1.2+ |
| model_versions.training_set_hash | INTERNAL | Plain | TLS 1.2+ |
Notable absence: the fraud-intel-service does not store SMS body content. Only template_hash (digit-folded sha256) is stored.
3.2 Encryption keys
| Key | Store | Rotation |
|---|---|---|
| mTLS server + client certs | Vault PKI engine | 30 days |
| Postgres credentials | Vault DB dynamic secret | 24 h |
| ClickHouse credentials | Vault KV | 30 days |
| Redis credentials | Vault KV | 30 days |
| MinIO access keys (model artifacts, feed exports) | Vault KV | 90 days |
| HSM signing key (PKCS#11 partition) | HSM (shared with sms-firewall-service, isolated partition) | Annual; per-key revocation on incident |
| Cosign signing key (optional model chain) | Vault KV (private) + public registry | 6 months |
| nationalSalt for msisdnHash | Vault KV (high-sensitivity) | Annual (intentional cross-day cross-context unlinkability) |
| Regulator HSM public key (for import verification) | Vault KV (peer-key store) | On regulator request |
3.3 Redaction rules
- In events. Cross-context fraud events use msisdnHash, never raw MSISDN. Tenant-scoped events may carry raw senderId (tenant-public) but never raw subscriber MSISDN unless consumer is the tenant's owner service.
- In logs. Pino redactor masks dst_msisdn, src_msisdn to +CCNNN***. ESLint rule forbids logger.info(..., { msisdn }) patterns at PR time.
- In REST responses. EvidenceRedactor interceptor replaces raw MSISDNs with +CCNNN*** outside tns-fraud-analyst-lead and platform.compliance.admin roles.
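The +CCNNN*** mask and the role exemption can be sketched as follows. This is a stand-alone illustration and not the Pino or EvidenceRedactor configuration itself. It assumes a two-digit country code, so that "CCNNN" is the first five digits; the function names are hypothetical.

```typescript
// Keeps "+" and the first five digits, masks the remainder.
// Assumes a two-digit country code followed by three visible digits.
function maskMsisdn(msisdn: string): string {
  const digits = msisdn.replace(/\D/g, "");
  return `+${digits.slice(0, 5)}***`;
}

const SENSITIVE_KEYS: ReadonlySet<string> = new Set([
  "dst_msisdn",
  "src_msisdn",
]);

// Hypothetical helper: returns a copy with sensitive string fields masked.
function redactRecord(record: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(record)) {
    out[key] =
      SENSITIVE_KEYS.has(key) && typeof value === "string"
        ? maskMsisdn(value)
        : value;
  }
  return out;
}

// Roles that see raw MSISDNs in REST responses, per the rule above.
const UNMASKED_ROLES = ["tns-fraud-analyst-lead", "platform.compliance.admin"];

function canSeeRawMsisdn(roles: string[]): boolean {
  return roles.some((role) => UNMASKED_ROLES.includes(role));
}
```

For example, `maskMsisdn("+93701234567")` yields `+93701***`, and a response interceptor would call `redactRecord` whenever `canSeeRawMsisdn` is false for the caller.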
3.4 Model artifact integrity
- Every ModelVersion records artifactSha256 at registration.
- At load time on Triton, the Python wrapper verifies sha256(downloaded) == artifactSha256. Mismatch refuses load and emits fraud.model.artifact.tamper.v1 (CRITICAL).
- Optional Sigstore/cosign signature chain (cosign_signature field) for supply-chain provenance.
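The load-time check is a plain digest comparison. The document places it in a Python wrapper on Triton; the sketch below shows the same comparison in TypeScript purely for illustration, with hypothetical function names and a hex-encoded `artifactSha256` assumed.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the downloaded artifact bytes.
function sha256Hex(data: Buffer): string {
  return createHash("sha256").update(data).digest("hex");
}

// Throws on mismatch so the caller can refuse the load and emit
// fraud.model.artifact.tamper.v1. Assumes artifactSha256 is hex-encoded.
function verifyArtifact(downloaded: Buffer, artifactSha256: string): void {
  const actual = sha256Hex(downloaded);
  if (actual !== artifactSha256.toLowerCase()) {
    throw new Error(
      `artifact hash mismatch: expected ${artifactSha256}, got ${actual}`,
    );
  }
}
```

The digest is computed over the bytes actually fetched from object storage, not over metadata, so a swapped object fails the check even if its registered record is untouched.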
3.5 No cloud LLM for PII
- Fraud-intel-service inference path is strictly on-cluster. INFERENCE_PROVIDER=triton|mock is the only allowed value; cloud LLM endpoints are not configurable.
- An egress NetworkPolicy (see DEPLOYMENT_TOPOLOGY §1) forbids fraud-intel pods from reaching public internet IPs except for regulator-portal-service SFTP push and Vault PKI fetch.
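One way to make the provider restriction hard to bypass is to validate the variable against a closed set at boot. The sketch below is an assumption about how such validation might be written; `parseInferenceProvider` is a hypothetical name.

```typescript
type InferenceProvider = "triton" | "mock";

const ALLOWED_PROVIDERS: readonly InferenceProvider[] = ["triton", "mock"];

// Hypothetical config parser: anything outside the closed set, including
// an unset variable, fails at start-up instead of falling back silently.
function parseInferenceProvider(raw: string | undefined): InferenceProvider {
  const match = ALLOWED_PROVIDERS.find((provider) => provider === raw);
  if (match === undefined) {
    throw new Error(
      `INFERENCE_PROVIDER must be one of ${ALLOWED_PROVIDERS.join("|")}`,
    );
  }
  return match;
}
```

Because the return type is a string-literal union, downstream code that switches on the provider cannot compile a branch for any other endpoint type.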
4. Audit
All state changes are recorded in fraud.audit_log with actor, before/after snapshots, IP, user agent, trace ID. The table is append-only at the database level (Postgres rules reject UPDATE and DELETE). Retention ≥ 13 months, enforced by partition pruning.
Audit-relevant state changes additionally publish to fraud.audit.v1 for SIEM ingestion (Splunk / SigNoz consumer).
Triggers for audit entries:
| Action | Entity |
|---|---|
| Create/update/delete | rule pattern, allowlist, feed registration |
| Decide | case (CONFIRM/DISMISS/REFINE) |
| Promote / rollback | model version |
| Suppress | detection (allowlist match) |
| Override | tenant tier, score |
| Import / export | MISP feed |
| Read of raw MSISDN by analyst | (meta-audit — auditor reads of analyst-level data are also audit-logged) |
5. Fail-Soft / Fail-Closed-with-Default Posture
The fraud-intel-service is informational, not enforcing. Operational implications:
- Detection pipelines fail-soft. A missed window is acceptable; the next pipeline run picks up backlog. Alerts fire on > 15 min lag.
- Score gRPC fail-closed-with-default. Caller treats unavailability as tier = PROBATION (neutral). This avoids cascading a fraud-intel outage into a compliance freeze. Compliance-engine consumers log the fall-through but proceed.
- Action dispatch (HITL executeAction) fail-closed. If the downstream NATS publish for the suggested action fails, the case is reverted to CONFIRMED (not EXECUTED) and FraudActionDispatchFailed HIGH alert fires; the analyst is notified to retry.
- MISP feed export fail-loud. A missed export day emits fraud.alert.feed.export.missed.v1 (PagerDuty MEDIUM); regulator SLA is enforced by the heartbeat mechanism.
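The two fail-closed behaviours above amount to small decision functions. The sketch below illustrates them with hypothetical names and types; the `ScoreOutcome` wrapper is an assumption about how a caller might represent a completed or failed Score call, and `TRUSTED` in the usage note is an invented tier name used only as an example.

```typescript
// Hypothetical representation of a Score call that either returned a
// tier or failed (timeout, connection refused, and so on).
type ScoreOutcome =
  | { ok: true; tier: string }
  | { ok: false; error: Error };

const DEFAULT_TIER = "PROBATION";

// Caller side: unavailability maps to the neutral default tier, and the
// flag lets the caller log the fall-through before proceeding.
function tierOrDefault(outcome: ScoreOutcome): { tier: string; defaulted: boolean } {
  if (outcome.ok) {
    return { tier: outcome.tier, defaulted: false };
  }
  return { tier: DEFAULT_TIER, defaulted: true };
}

type CaseStatus = "CONFIRMED" | "EXECUTED";

// Action dispatch: a failed publish leaves the case at CONFIRMED and
// names the alert to raise.
function statusAfterDispatch(
  publishSucceeded: boolean,
): { status: CaseStatus; alert?: string } {
  return publishSucceeded
    ? { status: "EXECUTED" }
    : { status: "CONFIRMED", alert: "FraudActionDispatchFailed" };
}
```

For instance, `tierOrDefault({ ok: true, tier: "TRUSTED" })` passes the returned tier through, while any failed outcome yields PROBATION with `defaulted` set to true.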
Security-wise, fail-soft on detection means availability attacks cannot disable enforcement — they can only delay detections; existing perimeter controls (sms-firewall-service rules, compliance-engine policy) remain in force.
6. Tenant Isolation
- Postgres RLS on fraud.entity_scores keyed on tenant_id for the (future) tenant portal.
- ClickHouse cluster has a single physical schema; isolation is enforced at the application layer (tenant_id predicate on every read). Tenant-scoped graph features are computed per-tenant; cross-tenant cohort detection is allowed and is the entire point of UC-04.
- Per-tenant nationalSalt is not used (one platform salt): cross-tenant cohort hashing requires a shared salt.
- tns-fraud-analyst roles are platform-wide (Trust & Safety operates as a cross-tenant function).
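Because ClickHouse isolation depends on every read carrying a tenant_id predicate, one option is to route tenant-scoped reads through a single builder that cannot omit it. The sketch below is an assumption about such a helper: `tenantScopedSelect` and `ScopedQuery` are hypothetical names, and the `{tenant_id:String}` placeholder uses ClickHouse's server-side query-parameter syntax so the tenant value is never concatenated into SQL.

```typescript
interface ScopedQuery {
  query: string;
  query_params: Record<string, string>;
}

const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Hypothetical helper: builds a SELECT that always filters on tenant_id.
// Table and column names are validated as plain identifiers; the tenant
// value travels as a bound parameter.
function tenantScopedSelect(
  table: string,
  columns: string[],
  tenantId: string,
): ScopedQuery {
  if (!IDENTIFIER.test(table)) {
    throw new Error("invalid table name");
  }
  if (columns.length === 0 || !columns.every((c) => IDENTIFIER.test(c))) {
    throw new Error("invalid column list");
  }
  if (tenantId.length === 0) {
    throw new Error("tenantId is required");
  }
  return {
    query: `SELECT ${columns.join(", ")} FROM ${table} WHERE tenant_id = {tenant_id:String}`,
    query_params: { tenant_id: tenantId },
  };
}
```

Cross-tenant cohort jobs would deliberately bypass this builder, which keeps the exception explicit and easy to audit.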
7. Threat Model
| Threat | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Model-poisoning via HITL feedback | Medium | High | Two-person rule on allowlist; per-cohort fairness audit on every retrain; outlier detection on feedback decisions; data scientist must review training set diff before promotion |
| Adversarial evasion (paraphrased OTP-keyword templates, MSISDN-block sweeps) | High | Medium | Adversarial-corpus test set must pass recall ≥ 0.80; continuous corpus expansion; ensemble (XGBoost + iForest + rule-based) raises evasion cost |
| Fraud feed injection (malicious MISP indicator from compromised peer) | Low | High | Signature verification mandatory; per-source reputation scoring; max-rate limit on imports; analyst review of high-impact indicators |
| Model artifact tampering | Low | Critical | SHA-256 verification on every load; Triton refuses to start with mismatched hash; optional cosign chain; MinIO bucket-policy denies writes from non-CI principals |
| Score-API DoS (high-frequency Score calls from compromised pod) | Low | Medium | mTLS + SPIFFE allowlist; per-cert per-pod rate limit (20K rps); HPA scales out; Redis L1 absorbs spike |
| Compromised analyst account | Low | High | MFA enforced; all decisions audit-logged with before/after; anomaly detection on decision velocity (> 50 decisions/h flags for review) |
| Cross-tenant MSISDN inference via cohort hash | Low | Medium | Cohort hash is cityHash64 (collision-resistant for size-bounded sets); per-day salt rotation breaks cross-day correlation |
| Training-data leakage via shadow predictions | Low | Medium | Shadow predictions stored without raw MSISDNs; only feature vectors and scores |
| Feedback loop poisoning (analyst marks legitimate traffic as fraud to harm a competitor) | Low | High | Separation of duties; quarterly cross-validation; analyst decisions visible in audit log to compliance team; tenant appeals route to compliance-engine, not back to fraud-intel-service |
| HSM signing-key compromise | Very Low | Critical | HSM partition isolated from sms-firewall-service partition; quarterly key audit; revocation via Vault PKI revocation list |
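The decision-velocity control in the compromised-analyst row can be illustrated with a sliding one-hour window. The windowing strategy is an assumption (the table states only the threshold of more than 50 decisions per hour), and `exceedsDecisionVelocity` is a hypothetical name.

```typescript
const WINDOW_MS = 60 * 60 * 1000;
const MAX_DECISIONS_PER_HOUR = 50;

// Returns true when more than 50 of the analyst's decision timestamps
// fall inside the hour ending at nowMs. Timestamps are epoch milliseconds.
function exceedsDecisionVelocity(
  decisionTimesMs: number[],
  nowMs: number,
): boolean {
  const recent = decisionTimesMs.filter(
    (t) => t > nowMs - WINDOW_MS && t <= nowMs,
  );
  return recent.length > MAX_DECISIONS_PER_HOUR;
}
```

A flagged analyst is routed to review, not blocked outright, which matches the "flags for review" wording in the table.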
8. Secrets Management
| Secret | Store | Injected as |
|---|---|---|
| gRPC server cert + key | Vault PKI → K8s Secret (Vault Agent) | File mount /etc/tls/server.{crt,key} |
| gRPC client cert (egress) | Vault PKI → K8s Secret | File mount |
| Postgres credentials | Vault DB dynamic secret | Env var DATABASE_URL |
| ClickHouse credentials | Vault KV | Env var |
| Redis credentials | Vault KV | Env var |
| NATS credentials | Vault KV | Env var + nkey file mount |
| MinIO access keys | Vault KV | Env var |
| HSM partition PIN | Vault KV (gated by SOC role) | Env var (memory-only; never written to disk) |
| nationalSalt for hashes | Vault KV | Env var (read once at boot, kept in memory) |
| Regulator HSM public key | Vault KV (peer-key store) | File mount |
No secret is ever written to logs, events, or config files. Pre-commit gitleaks scan blocks accidental commits.
9. GDPR / Data-Subject Rights
- Right to erasure (where applicable in regulated jurisdictions): On auth.user.erased.v1, fraud-intel-service:
  - Redacts signals.src_msisdn / signals.dst_msisdn → [ERASED:<sha-tombstone>].
  - Does not delete audit_log or cases rows; retention for fraud investigation overrides erasure for the limited fields required by national fraud-investigation law.
  - Cohort hashes are unchanged (one-way; no reverse lookup possible).
- Data minimisation: No SMS body content stored. Only template hashes and aggregate features.
- Sub-processor list: No external sub-processors for fraud detection (all on-cluster). Regulator SFTP mirror is per the regulator agreement.
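The erasure redaction on signals rows can be sketched as below. The exact construction of `<sha-tombstone>` is not defined in this document, so the sketch assumes a keyed SHA-256 (HMAC) truncated to 16 hex characters; the key parameter, the `SignalRow` shape, and the function names are all hypothetical.

```typescript
import { createHmac } from "node:crypto";

interface SignalRow {
  src_msisdn: string;
  dst_msisdn: string;
}

// Assumption: the tombstone is the first 16 hex characters of an
// HMAC-SHA-256 over the MSISDN. A keyed digest is used so the short
// MSISDN space cannot be brute-forced from the tombstone alone.
function tombstone(msisdn: string, key: string): string {
  const digest = createHmac("sha256", key).update(msisdn).digest("hex");
  return `[ERASED:${digest.slice(0, 16)}]`;
}

// Replaces only the fields that match the erased subscriber; the other
// party's number on the same row is left untouched.
function eraseSubscriber(
  row: SignalRow,
  erasedMsisdn: string,
  key: string,
): SignalRow {
  const redact = (value: string): string =>
    value === erasedMsisdn ? tombstone(value, key) : value;
  return {
    src_msisdn: redact(row.src_msisdn),
    dst_msisdn: redact(row.dst_msisdn),
  };
}
```

Because the same subscriber always maps to the same tombstone under a given key, erased rows remain groupable for fraud investigation without exposing the number.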
10. Security Testing
- Contract tests on gRPC + REST per API_CONTRACTS §8.
- Adversarial corpus test on every model promotion (recall ≥ 0.80).
- Property-based tests on feature transformers (deterministic across re-runs, no NaN propagation).
- Role-matrix integration test — every endpoint × every role — verifies 200/403 behaviour.
- Penetration test quarterly scoped to admin REST + internal MISP import endpoints.
- ZAP baseline + API scan on every main-branch build.
- Secret scanning (gitleaks); dependency scanning (osv-scanner); container scanning (trivy); model artifact scanning (cosign verify).
- HSM signing-key access logged to SOC; quarterly key-use audit.
- Feed-import signature-failure simulation: every quarter, an intentionally bad signature is sent; verify fraud.alert.feed.signature.invalid.v1 fires within 60 s and PagerDuty page is acknowledged.
11. Regulatory Posture
- Aligned with GSMA FF.21 (A2P SMS Fraud Reference) on fraud taxonomy and inter-MNO cooperation.
- MEF MEF-W63 (Inter-Carrier Fraud) terminology used in fraud feed export attributes.
- ATRA national fraud-feed format mirrored daily via SFTP push.
- Data-residency: All fraud signal storage and inference is in-country (kbl primary, mzr DR). Cross-border export is limited to MISP feed exports under regulator agreement.
- Audit retention: 13 months minimum on fraud.audit_log; 7 years on fraud.cases (national fraud-investigation statute).