
Fraud Intelligence Service — Security Model

Version: 1.0
Status: Draft
Owner: Trust and Safety + Security
Last Updated: 2026-04-21
Companion: API_CONTRACTS · DATA_MODEL · EVENT_SCHEMAS · docs/13-security-compliance-tenancy.md


1. Authentication

1.1 gRPC plane — Score, BulkScore, GetSignals

  • mTLS required. The gRPC server accepts only connections presenting a client certificate signed by the platform CA.
  • SPIFFE ID allowlist (validated at handler entry):
    • spiffe://ghasi/compliance-engine
    • spiffe://ghasi/routing-engine
    • spiffe://ghasi/sender-id-registry
    • spiffe://ghasi/noc-dashboard (read-only)
  • Other certificates are rejected with PERMISSION_DENIED and an audit-log entry is written.
  • Client certs are mounted from Vault via the Vault Agent Sidecar Injector, rotated every 30 days; the server hot-reloads on file change.
  • Local dev bypass via GRPC_TLS_ENABLED=false is prohibited in any non-local environment: a start-up guard refuses to boot if TLS is disabled while NODE_ENV != 'development'.
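
The allowlist check and the start-up guard described above can be sketched as two small pure functions. This is a minimal illustration, not the service's actual code: the names authorizePeer and assertTlsConfig are hypothetical, and extracting the SPIFFE ID from the peer certificate is assumed to happen elsewhere.

```typescript
// SPIFFE IDs copied from the allowlist in this section.
const ALLOWED_SPIFFE_IDS: ReadonlySet<string> = new Set([
  "spiffe://ghasi/compliance-engine",
  "spiffe://ghasi/routing-engine",
  "spiffe://ghasi/sender-id-registry",
  "spiffe://ghasi/noc-dashboard",
]);

type AuthResult = { ok: true } | { ok: false; code: "PERMISSION_DENIED" };

// Hypothetical helper: exact-match lookup, no prefix or wildcard matching.
function authorizePeer(spiffeId: string | undefined): AuthResult {
  if (spiffeId !== undefined && ALLOWED_SPIFFE_IDS.has(spiffeId)) {
    return { ok: true };
  }
  // The real handler would also write an audit-log entry here.
  return { ok: false, code: "PERMISSION_DENIED" };
}

// Hypothetical start-up guard: TLS may be disabled only in local development.
function assertTlsConfig(env: Record<string, string | undefined>): void {
  const tlsDisabled = env["GRPC_TLS_ENABLED"] === "false";
  if (tlsDisabled && env["NODE_ENV"] !== "development") {
    throw new Error("GRPC_TLS_ENABLED=false is only allowed when NODE_ENV=development");
  }
}
```

Calling assertTlsConfig(process.env) before the gRPC server binds would make a misconfigured deployment fail at boot rather than serve plaintext.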

1.2 REST plane — admin (port 3014, behind Kong)

  • Kong validates the platform JWT (issued by auth-service, RS256, JWKS-backed).
  • Kong forwards X-User-Id, X-Roles, X-Org-Id, X-Trace-Id headers; fraud-intel-service trusts Kong's injection, never parses JWTs directly.
  • All admin endpoints require an explicit role (see §2).
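
As an illustration of trusting Kong's injected headers rather than parsing the JWT, a caller context could be built as below. The function name, the CallerContext shape, the comma-separated encoding of X-Roles, and treating all four headers as mandatory are assumptions made for this sketch.

```typescript
interface CallerContext {
  userId: string;
  roles: string[];
  orgId: string;
  traceId: string;
}

// Hypothetical helper. Header names are lower-cased, as Node's HTTP layer delivers them.
function callerFromKongHeaders(headers: Record<string, string | undefined>): CallerContext {
  const required = (name: string): string => {
    const value = headers[name];
    if (value === undefined || value.trim() === "") {
      throw new Error(`missing header ${name}`);
    }
    return value.trim();
  };
  return {
    userId: required("x-user-id"),
    // Assumption: X-Roles carries a comma-separated list of role names.
    roles: required("x-roles")
      .split(",")
      .map((r) => r.trim())
      .filter((r) => r.length > 0),
    orgId: required("x-org-id"),
    traceId: required("x-trace-id"),
  };
}
```

This only holds if port 3014 is unreachable except through Kong; otherwise any client could forge these headers.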

1.3 Internal mTLS plane — port 3015 (no Kong)

  • Reserved for regulator-portal-service (CN: regulator-portal) and peer-MNO services.
  • Used for MISP feed import (POST /v1/internal/fraud/feed/import) and late-arriving signal backfill.
  • Callers are authenticated by SPIFFE ID, and every request body carries a signature that is verified before processing (HSM key from Vault PKI / external regulator key).
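
A body-signature check of this kind might look as follows. The document does not name the signature algorithm, so Ed25519 is an assumption here, as are the function name verifyFeedBody and the base64 signature encoding; the throwaway key pair stands in for the HSM-held or regulator key.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical helper: returns true only if the signature matches the exact body bytes.
function verifyFeedBody(body: string, signatureB64: string, publicKey: KeyObject): boolean {
  // For Ed25519 the digest-algorithm argument must be null.
  return verify(null, Buffer.from(body, "utf8"), publicKey, Buffer.from(signatureB64, "base64"));
}

// Demo only: a locally generated key pair replaces the real signing key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const body = JSON.stringify({ indicator: "example" });
const signature = sign(null, Buffer.from(body, "utf8"), privateKey).toString("base64");

console.log(verifyFeedBody(body, signature, publicKey));       // signature over the same bytes
console.log(verifyFeedBody(body + " ", signature, publicKey)); // body altered after signing
```

Verifying over the raw request bytes, before any JSON parsing or re-serialisation, avoids false mismatches caused by key reordering or whitespace changes.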

1.4 IdP-agnostic

The identity provider that authenticated the caller is irrelevant. fraud-intel-service sees only the platform JWT (via Kong) or the SPIFFE ID (via mTLS). The idp claim is captured in fraud.audit_log.before/after for forensics.


2. Authorization (RBAC)

| Role | Capabilities |
| --- | --- |
| tns-fraud-analyst | Read cases/detections/signals; decide PENDING cases; view evidence (no raw subscriber MSISDN visible in cross-tenant context) |
| tns-fraud-analyst-lead | All tns-fraud-analyst + assign, bulk-decide, manage allowlists (with secondary approver), trigger ad-hoc scans |
| tns-ds (data scientist) | Register/shadow/promote/rollback model versions; trigger training runs; read evaluation metrics; cannot decide cases |
| noc-operator | Read NOC dashboards, detections, signals; read tenant scores; cannot decide cases or modify rules |
| platform.compliance.admin | Secondary approver for model promotion + allowlist mutations; manages feed registrations; reads audit log |
| platform.auditor | Read-only on detections, cases, decisions, models, audit log; no PII visible |
| Tenant sms:fraud:read (future) | Read own tenant's score and tier (not currently exposed) |

Enforcement points:

  1. NestJS RoleGuard rejects with 403 INSUFFICIENT_SCOPE before handler entry.
  2. Per-handler @RequireRoles(...) decorator (declarative, contract-tested).
  3. Postgres RLS on fraud.entity_scores (scores_tenant_read policy).
  4. Separation-of-duties guard on POST /v1/fraud/cases/{caseId}/decide (opened_by != decided_by).
  5. Two-person rule on allowlist mutations (addedBy != approvedBy).
  6. Secondary-approver guard on POST /v1/admin/fraud/models/{id}/promote (tns-ds initiator + platform.compliance.admin approver).
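
Enforcement points 4 to 6 reduce to simple predicates over actor identities and roles. The sketch below is illustrative only: the function and interface names are hypothetical, and requiring the promotion initiator and approver to be different people is an assumption (the text above names only their roles).

```typescript
interface Actor {
  id: string;
  roles: readonly string[];
}

// Separation of duties: the user who opened a case may not decide it.
function canDecideCase(openedBy: string, decidedBy: string): boolean {
  return openedBy !== decidedBy;
}

// Two-person rule: an allowlist mutation needs a distinct approver.
function allowlistMutationApproved(addedBy: string, approvedBy: string): boolean {
  return addedBy !== approvedBy;
}

// Secondary approver: a tns-ds initiator plus a platform.compliance.admin approver.
function canPromoteModel(initiator: Actor, approver: Actor): boolean {
  return (
    initiator.id !== approver.id && // assumption: two distinct people
    initiator.roles.includes("tns-ds") &&
    approver.roles.includes("platform.compliance.admin")
  );
}
```

Keeping these as pure functions makes them straightforward to cover in the role-matrix integration test.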

3. Data Protection

3.1 PII inventory & classification

| Field | Classification | Storage | Transit |
| --- | --- | --- | --- |
| signals.dst_msisdn, signals.src_msisdn | CONFIDENTIAL (subscriber MSISDN) | Postgres + ClickHouse with disk encryption (LUKS); intra-VPC only | TLS 1.2+ |
| feed_indicators.value (when MSISDN/MSISDN_BLOCK) | CONFIDENTIAL | Same | TLS 1.2+ |
| cases.evidence (feature vectors, no body) | INTERNAL | Plain | TLS 1.2+ |
| events (ClickHouse) dst_msisdn | CONFIDENTIAL | Disk-encrypted MergeTree | mTLS |
| event_msisdnHash in NATS events | INTERNAL | n/a | mTLS |
| audit_log.before/after | CONFIDENTIAL (may contain MSISDN) | Plain (Postgres only) | TLS 1.2+ |
| model_versions.training_set_hash | INTERNAL | Plain | TLS 1.2+ |

Notable absence: the fraud-intel-service does not store SMS body content. Only template_hash (digit-folded sha256) is stored.
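
One plausible reading of "digit-folded sha256" is sketched below: variable digits (OTP codes, amounts) are collapsed before hashing so that messages from the same template share a hash. The exact folding rule is not specified in this document; replacing each run of ASCII digits with a single "0" is an assumption, as is the function name.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: fold digit runs, then hash. The body itself is never stored.
function templateHash(body: string): string {
  // Assumption: every run of ASCII digits folds to one "0" placeholder.
  const folded = body.replace(/[0-9]+/g, "0");
  return createHash("sha256").update(folded, "utf8").digest("hex");
}

// Two messages that differ only in their digits map to the same template hash.
console.log(templateHash("Your code is 123456") === templateHash("Your code is 98"));
```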

3.2 Encryption keys

| Key | Store | Rotation |
| --- | --- | --- |
| mTLS server + client certs | Vault PKI engine | 30 days |
| Postgres credentials | Vault DB dynamic secret | 24 h |
| ClickHouse credentials | Vault KV | 30 days |
| Redis credentials | Vault KV | 30 days |
| MinIO access keys (model artifacts, feed exports) | Vault KV | 90 days |
| HSM signing key (PKCS#11 partition) | HSM (shared with sms-firewall-service, isolated partition) | Annual; per-key revocation on incident |
| Cosign signing key (optional model chain) | Vault KV (private) + public registry | 6 months |
| nationalSalt for msisdnHash | Vault KV (high-sensitivity) | Annual (intentional cross-day cross-context unlinkability) |
| Regulator HSM public key (for import verification) | Vault KV (peer-key store) | On regulator request |

3.3 Redaction rules

  • In events. Cross-context fraud events use msisdnHash, never raw MSISDN. Tenant-scoped events may carry raw senderId (tenant-public) but never a raw subscriber MSISDN unless the consumer is the tenant's owner service.
  • In logs. Pino redactor masks dst_msisdn, src_msisdn to +CCNNN***. ESLint rule forbids logger.info(..., { msisdn }) patterns at PR time.
  • In REST responses. The EvidenceRedactor interceptor replaces raw MSISDNs with +CCNNN*** for every caller that holds neither the tns-fraud-analyst-lead nor the platform.compliance.admin role.
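
The +CCNNN*** mask and the role exemption can be illustrated as follows. This is a sketch under assumptions: the mask is taken to mean "+" plus the first five digits (a two-digit country code and three further digits), and maskMsisdn and renderMsisdn are hypothetical names, not the service's Pino redactor or EvidenceRedactor.

```typescript
// Hypothetical helper: keep "+" and the first five digits, mask the rest.
function maskMsisdn(msisdn: string): string {
  const digits = msisdn.replace(/\D/g, "");
  // Too short to keep a prefix without revealing most of the number: mask everything.
  if (digits.length <= 5) return "+***";
  return `+${digits.slice(0, 5)}***`;
}

// Roles that see raw MSISDNs in REST responses, per the rule above.
const UNMASKED_ROLES: ReadonlySet<string> = new Set([
  "tns-fraud-analyst-lead",
  "platform.compliance.admin",
]);

function renderMsisdn(msisdn: string, roles: readonly string[]): string {
  return roles.some((r) => UNMASKED_ROLES.has(r)) ? msisdn : maskMsisdn(msisdn);
}

console.log(maskMsisdn("+99912345678")); // → +99912***
```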

3.4 Model artifact integrity

  • Every ModelVersion records artifactSha256 at registration.
  • At load time on Triton, the Python wrapper verifies sha256(downloaded) == artifactSha256. Mismatch refuses load and emits fraud.model.artifact.tamper.v1 (CRITICAL).
  • Optional Sigstore/cosign signature chain (cosign_signature field) for supply-chain provenance.
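
The load-time hash check can be sketched as below. The document places this check in a Python wrapper on Triton; TypeScript is used here only to illustrate the logic, and the names verifyArtifact and loadOrRefuse are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Compare the SHA-256 of the downloaded bytes with the hash recorded at registration.
function verifyArtifact(downloaded: Uint8Array, artifactSha256: string): boolean {
  const actual = createHash("sha256").update(downloaded).digest("hex");
  return actual === artifactSha256.toLowerCase();
}

type LoadResult =
  | { loaded: true }
  | { loaded: false; event: "fraud.model.artifact.tamper.v1"; severity: "CRITICAL" };

// On mismatch the model is not loaded and the tamper event is returned for publishing.
function loadOrRefuse(downloaded: Uint8Array, artifactSha256: string): LoadResult {
  if (verifyArtifact(downloaded, artifactSha256)) return { loaded: true };
  return { loaded: false, event: "fraud.model.artifact.tamper.v1", severity: "CRITICAL" };
}
```

The hash comparison detects corruption or substitution of the artifact in storage; it does not by itself prove who produced the artifact, which is what the optional cosign chain addresses.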

3.5 No cloud LLM for PII

  • The fraud-intel-service inference path is strictly on-cluster. INFERENCE_PROVIDER accepts only triton or mock; cloud LLM endpoints are not configurable.
  • An egress NetworkPolicy (see DEPLOYMENT_TOPOLOGY §1) forbids fraud-intel pods from reaching public internet IPs except for regulator-portal-service SFTP push and Vault PKI fetch.

4. Audit

All state changes are recorded in fraud.audit_log with actor, before/after snapshots, IP, user agent, and trace ID. The table is append-only at the database level (Postgres rules reject UPDATE and DELETE). Retention is at least 13 months and is enforced by dropping expired partitions.

Audit-relevant state changes additionally publish to fraud.audit.v1 for SIEM ingestion (Splunk / SigNoz consumer).

Triggers for audit entries:

| Action | Entity |
| --- | --- |
| Create/update/delete | rule pattern, allowlist, feed registration |
| Decide | case (CONFIRM/DISMISS/REFINE) |
| Promote / rollback | model version |
| Suppress | detection (allowlist match) |
| Override | tenant tier, score |
| Import / export | MISP feed |
| Read of raw MSISDN by analyst | (meta-audit; auditor reads of analyst-level data are also audit-logged) |

5. Fail-Soft / Fail-Closed-with-Default Posture

The fraud-intel-service is informational, not enforcing. Operational implications:

  • Detection pipelines fail-soft. A missed window is acceptable; the next pipeline run picks up backlog. Alerts fire on > 15 min lag.
  • Score gRPC fail-closed-with-default. Caller treats unavailability as tier = PROBATION (neutral). This avoids cascading a fraud-intel outage into a compliance freeze. Compliance-engine consumers log the fall-through but proceed.
  • Action dispatch (HITL executeAction) fail-closed. If the downstream NATS publish for the suggested action fails, the case is reverted to CONFIRMED (not EXECUTED) and FraudActionDispatchFailed HIGH alert fires; the analyst is notified to retry.
  • MISP feed export fail-loud. A missed export day emits fraud.alert.feed.export.missed.v1 (PagerDuty MEDIUM); regulator SLA is enforced by the heartbeat mechanism.

Security-wise, fail-soft on detection means availability attacks cannot disable enforcement — they can only delay detections; existing perimeter controls (sms-firewall-service rules, compliance-engine policy) remain in force.
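
The two caller-visible behaviours above (default tier on Score unavailability, case reverted on dispatch failure) can be modelled as pure decisions. This is a sketch only: the transport call is left out, and the type and function names are hypothetical.

```typescript
// Outcome of a Score call as seen by the caller, e.g. compliance-engine.
type ScoreOutcome =
  | { kind: "ok"; tier: string }
  | { kind: "unavailable"; reason: string };

const DEFAULT_TIER = "PROBATION"; // neutral default named in this section

interface ResolvedTier {
  tier: string;
  fellThrough: boolean;
}

// Fail-closed-with-default: log the fall-through, then proceed with the neutral tier.
function resolveTier(
  outcome: ScoreOutcome,
  log: (msg: string) => void = console.warn,
): ResolvedTier {
  if (outcome.kind === "ok") return { tier: outcome.tier, fellThrough: false };
  log(`fraud-intel unavailable (${outcome.reason}); using default tier ${DEFAULT_TIER}`);
  return { tier: DEFAULT_TIER, fellThrough: true };
}

// Action dispatch: a failed NATS publish leaves the case CONFIRMED, not EXECUTED.
type CaseStatus = "CONFIRMED" | "EXECUTED";

function statusAfterDispatch(publishSucceeded: boolean): CaseStatus {
  return publishSucceeded ? "EXECUTED" : "CONFIRMED";
}
```

In a real caller, timeouts and gRPC UNAVAILABLE errors would both map to the "unavailable" outcome before resolveTier is applied.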


6. Tenant Isolation

  • Postgres RLS on fraud.entity_scores keyed on tenant_id for the (future) tenant portal.
  • ClickHouse cluster has a single physical schema; isolation is enforced at the application layer (tenant_id predicate on every read). Tenant-scoped graph features are computed per-tenant; cross-tenant cohort detection is allowed and is the entire point of UC-04.
  • Per-tenant nationalSalt is not used (one platform salt) — cross-tenant cohort hashing requires a shared salt.
  • tns-fraud-analyst roles are platform-wide (Trust & Safety operates as a cross-tenant function).
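
Because ClickHouse isolation depends on the application adding a tenant_id predicate to every tenant-scoped read, one way to make the predicate hard to forget is to route those reads through a single builder. The sketch below is an assumption-laden illustration: the function name, the table and column names in the usage line, and the result shape are hypothetical. The {tenant_id:String} placeholder follows ClickHouse's query-parameter syntax.

```typescript
interface ScopedQuery {
  query: string;
  query_params: Record<string, string>;
}

// Hypothetical builder: every tenant-scoped read carries a bound tenant_id predicate.
function tenantScopedQuery(
  table: string,
  columns: readonly string[],
  tenantId: string,
): ScopedQuery {
  if (tenantId.trim() === "") {
    throw new Error("tenantId is required for tenant-scoped reads");
  }
  // Identifiers are validated, not escaped; the tenant value is always a bound parameter.
  const identifier = /^[A-Za-z_][A-Za-z0-9_.]*$/;
  for (const name of [table, ...columns]) {
    if (!identifier.test(name)) throw new Error(`invalid identifier: ${name}`);
  }
  return {
    query: `SELECT ${columns.join(", ")} FROM ${table} WHERE tenant_id = {tenant_id:String}`,
    query_params: { tenant_id: tenantId },
  };
}

// Usage with hypothetical table and column names.
console.log(tenantScopedQuery("fraud.events", ["msisdn_hash", "score"], "tenant-42").query);
```

Cross-tenant cohort detection would deliberately bypass such a builder, so that code path deserves its own review and audit coverage.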

7. Threat Model

| Threat | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Model-poisoning via HITL feedback | Medium | High | Two-person rule on allowlist; per-cohort fairness audit on every retrain; outlier detection on feedback decisions; data scientist must review training set diff before promotion |
| Adversarial evasion (paraphrased OTP-keyword templates, MSISDN-block sweeps) | High | Medium | Adversarial-corpus test set must pass recall ≥ 0.80; continuous corpus expansion; ensemble (XGBoost + iForest + rule-based) raises evasion cost |
| Fraud feed injection (malicious MISP indicator from compromised peer) | Low | High | Signature verification mandatory; per-source reputation scoring; max-rate limit on imports; analyst review of high-impact indicators |
| Model artifact tampering | Low | Critical | SHA-256 verification on every load; Triton refuses to start with mismatched hash; optional cosign chain; MinIO bucket-policy denies writes from non-CI principals |
| Score-API DoS (high-frequency Score calls from compromised pod) | Low | Medium | mTLS + SPIFFE allowlist; per-cert per-pod rate limit (20K rps); HPA scales out; Redis L1 absorbs spike |
| Compromised analyst account | Low | High | MFA enforced; all decisions audit-logged with before/after; anomaly detection on decision velocity (> 50 decisions/h flags for review) |
| Cross-tenant MSISDN inference via cohort hash | Low | Medium | Cohort hash is cityHash64 (collision-resistant for size-bounded sets); per-day salt rotation breaks cross-day correlation |
| Training-data leakage via shadow predictions | Low | Medium | Shadow predictions stored without raw MSISDNs; only feature vectors and scores |
| Feedback loop poisoning (analyst marks legitimate traffic as fraud to harm a competitor) | Low | High | Separation of duties; quarterly cross-validation; analyst decisions visible in audit log to compliance team; tenant appeals route to compliance-engine, not back to fraud-intel-service |
| HSM signing-key compromise | Very Low | Critical | HSM partition isolated from sms-firewall-service partition; quarterly key audit; revocation via Vault PKI revocation list |
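
The decision-velocity check listed for compromised analyst accounts (> 50 decisions/h flags for review) can be sketched as a trailing one-hour window count. The function name and the choice of a trailing window measured from "now" are assumptions; the threshold comes from the table.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: true when an analyst made more than limitPerHour decisions
// in the hour ending at nowMs. Timestamps are epoch milliseconds.
function exceedsDecisionVelocity(
  decisionTimesMs: readonly number[],
  nowMs: number,
  limitPerHour = 50,
): boolean {
  const windowStart = nowMs - HOUR_MS;
  const recent = decisionTimesMs.filter((t) => t > windowStart && t <= nowMs).length;
  return recent > limitPerHour;
}
```

Exactly 50 decisions in the window does not flag, matching the strict "> 50" wording.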

8. Secrets Management

| Secret | Store | Injected as |
| --- | --- | --- |
| gRPC server cert + key | Vault PKI → K8s Secret (Vault Agent) | File mount /etc/tls/server.{crt,key} |
| gRPC client cert (egress) | Vault PKI → K8s Secret | File mount |
| Postgres credentials | Vault DB dynamic secret | Env var DATABASE_URL |
| ClickHouse credentials | Vault KV | Env var |
| Redis credentials | Vault KV | Env var |
| NATS credentials | Vault KV | Env var + nkey file mount |
| MinIO access keys | Vault KV | Env var |
| HSM partition PIN | Vault KV (gated by SOC role) | Env var (memory-only; never written to disk) |
| nationalSalt for hashes | Vault KV | Env var (read once at boot, kept in memory) |
| Regulator HSM public key | Vault KV (peer-key store) | File mount |

No secret is ever written to logs, events, or config files. Pre-commit gitleaks scan blocks accidental commits.


9. GDPR / Data-Subject Rights

  • Right to erasure (where applicable in regulated jurisdictions): On auth.user.erased.v1, fraud-intel-service:
    • Redacts signals.src_msisdn / signals.dst_msisdn to [ERASED:<sha-tombstone>].
    • Does not delete audit_log or cases rows; retention for fraud investigation overrides erasure for the limited fields required by national fraud-investigation law.
    • Cohort hashes are unchanged (one-way; no reverse lookup possible).
  • Data minimisation: No SMS body content stored. Only template hashes and aggregate features.
  • Sub-processor list: No external sub-processors for fraud detection (all on-cluster). Regulator SFTP mirror is per the regulator agreement.

10. Security Testing

  • Contract tests on gRPC + REST per API_CONTRACTS §8.
  • Adversarial corpus test on every model promotion (recall ≥ 0.80).
  • Property-based tests on feature transformers (deterministic across re-runs, no NaN propagation).
  • Role-matrix integration test — every endpoint × every role — verifies 200/403 behaviour.
  • Penetration test quarterly scoped to admin REST + internal MISP import endpoints.
  • ZAP baseline + API scan on every main-branch build.
  • Secret scanning (gitleaks); dependency scanning (osv-scanner); container scanning (trivy); model artifact scanning (cosign verify).
  • HSM signing-key access logged to SOC; quarterly key-use audit.
  • Feed-import signature-failure simulation: every quarter, an intentionally bad signature is sent; verify fraud.alert.feed.signature.invalid.v1 fires within 60 s and PagerDuty page is acknowledged.

11. Regulatory Posture

  • Aligned with GSMA FF.21 (A2P SMS Fraud Reference) on fraud taxonomy and inter-MNO cooperation.
  • MEF MEF-W63 (Inter-Carrier Fraud) terminology used in fraud feed export attributes.
  • ATRA national fraud-feed format mirrored daily via SFTP push.
  • Data-residency: All fraud signal storage and inference is in-country (kbl primary, mzr DR). Cross-border export is limited to MISP feed exports under regulator agreement.
  • Audit retention: 13 months minimum on fraud.audit_log; 7 years on fraud.cases (national fraud-investigation statute).