Fraud Intelligence Service — Security Model

Version: 1.0
Status: Draft
Owner: Trust and Safety + Security
Last Updated: 2026-04-21
Companion: API_CONTRACTS · DATA_MODEL · EVENT_SCHEMAS · docs/13-security-compliance-tenancy.md

1. Authentication

1.1 gRPC plane — `Score`, `BulkScore`, `GetSignals`

mTLS required. The gRPC server accepts only connections presenting a client certificate signed by the platform CA.
SPIFFE ID allowlist (validated at handler entry):
- spiffe://ghasi/compliance-engine
- spiffe://ghasi/routing-engine
- spiffe://ghasi/sender-id-registry
- spiffe://ghasi/noc-dashboard (read-only)
Any other certificate is rejected with PERMISSION_DENIED, and an audit-log entry is written.
Client certs are mounted from Vault via the Vault Agent Sidecar Injector and rotated every 30 days; the server hot-reloads on file change.
The local dev bypass (GRPC_TLS_ENABLED=false) is prohibited in any non-local environment: a start-up guard refuses to boot if TLS is disabled while NODE_ENV != 'development'.
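
The allowlist check and the start-up guard described above could look roughly like the following. This is a minimal sketch, not the service's actual code: `authorizePeer`, `assertTlsPolicy`, and the `readOnly` flag are hypothetical names, and the environment is passed in as a plain object so the guard can be exercised without touching the real process environment.

```typescript
// SPIFFE IDs copied from the allowlist above. The readOnly flag mirrors the
// "(read-only)" note on noc-dashboard; how it is enforced is not specified.
const SPIFFE_ALLOWLIST = new Map<string, { readOnly: boolean }>([
  ["spiffe://ghasi/compliance-engine", { readOnly: false }],
  ["spiffe://ghasi/routing-engine", { readOnly: false }],
  ["spiffe://ghasi/sender-id-registry", { readOnly: false }],
  ["spiffe://ghasi/noc-dashboard", { readOnly: true }],
]);

type PeerDecision =
  | { ok: true; readOnly: boolean }
  | { ok: false; code: "PERMISSION_DENIED" };

// Hypothetical helper: exact-match lookup of the peer's SPIFFE ID.
function authorizePeer(spiffeId: string): PeerDecision {
  const entry = SPIFFE_ALLOWLIST.get(spiffeId);
  if (entry === undefined) {
    // The caller is expected to write the audit-log entry on this branch.
    return { ok: false, code: "PERMISSION_DENIED" };
  }
  return { ok: true, readOnly: entry.readOnly };
}

// Hypothetical start-up guard: throws when TLS is disabled outside local dev.
function assertTlsPolicy(env: Record<string, string | undefined>): void {
  const tlsDisabled = env["GRPC_TLS_ENABLED"] === "false";
  if (tlsDisabled && env["NODE_ENV"] !== "development") {
    throw new Error("GRPC_TLS_ENABLED=false is only permitted when NODE_ENV=development");
  }
}
```

Exact string matching is used deliberately in the sketch: prefix or wildcard matching on SPIFFE IDs would widen the allowlist beyond the four listed callers.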

1.2 REST plane — admin (port 3014, behind Kong)

Kong validates the platform JWT (issued by auth-service, RS256, JWKS-backed).
Kong forwards the X-User-Id, X-Roles, X-Org-Id, and X-Trace-Id headers; fraud-intel-service trusts the headers Kong injects and never parses JWTs itself.
All admin endpoints require an explicit role (see §2).

1.3 Internal mTLS plane — port 3015 (no Kong)

Reserved for regulator-portal-service (CN: regulator-portal) and peer-MNO services.
Used for MISP feed import (POST /v1/internal/fraud/feed/import) and late-arriving signal backfill.
Callers are authenticated by SPIFFE ID plus request-body signature verification (HSM key from Vault PKI, or the external regulator key).

1.4 IdP-agnostic

The identity provider that authenticated the caller is irrelevant. fraud-intel-service sees only the platform JWT (via Kong) or the SPIFFE ID (via mTLS). The idp claim is captured in fraud.audit_log.before/after for forensics.

2. Authorization (RBAC)

Role	Capabilities
`tns-fraud-analyst`	Read cases/detections/signals; decide PENDING cases; view evidence (no raw subscriber MSISDN visible in cross-tenant context)
`tns-fraud-analyst-lead`	All `tns-fraud-analyst` + assign, bulk-decide, manage allowlists (with secondary approver), trigger ad-hoc scans
`tns-ds` (data scientist)	Register/shadow/promote/rollback model versions; trigger training runs; read evaluation metrics; cannot decide cases
`noc-operator`	Read NOC dashboards, detections, signals; read tenant scores; cannot decide cases or modify rules
`platform.compliance.admin`	Secondary approver for model promotion + allowlist mutations; manages feed registrations; reads audit log
`platform.auditor`	Read-only on detections, cases, decisions, models, audit log; no PII visible
Tenant `sms:fraud:read` (future)	Read own tenant's score and tier (not currently exposed)

Enforcement points:

- NestJS RoleGuard rejects with 403 INSUFFICIENT_SCOPE before handler entry.
- Per-handler @RequireRoles(...) decorator (declarative, contract-tested).
- Postgres RLS on fraud.entity_scores (scores_tenant_read policy).
- Separation-of-duties guard on POST /v1/fraud/cases/{caseId}/decide (opened_by != decided_by).
- Two-person rule on allowlist mutations (addedBy != approvedBy).
- Secondary-approver guard on POST /v1/admin/fraud/models/{id}/promote (tns-ds initiator + platform.compliance.admin approver).
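
To make the role check and the separation-of-duties guard concrete, here is a framework-free sketch of the decision logic. Several details are assumptions rather than facts from this document: the comma-separated encoding of X-Roles, the "any one listed role suffices" semantics, the lowercase header keys, and the `SEPARATION_OF_DUTIES_VIOLATION` error code. All function names are hypothetical.

```typescript
type AccessDecision =
  | { allowed: true }
  | { allowed: false; status: number; code: string };

interface Caller {
  userId: string;
  roles: readonly string[];
}

// Builds the caller from Kong-injected headers. A comma-separated X-Roles
// value is an assumption; the encoding is not specified in this document.
function callerFromHeaders(headers: Record<string, string | undefined>): Caller {
  const roles = (headers["x-roles"] ?? "")
    .split(",")
    .map((role) => role.trim())
    .filter((role) => role.length > 0);
  return { userId: headers["x-user-id"] ?? "", roles };
}

// Role check in the spirit of @RequireRoles(...): any one listed role suffices.
function checkRoles(caller: Caller, required: readonly string[]): AccessDecision {
  const hasRole = required.some((role) => caller.roles.includes(role));
  return hasRole
    ? { allowed: true }
    : { allowed: false, status: 403, code: "INSUFFICIENT_SCOPE" };
}

// Separation-of-duties check for the decide endpoint (opened_by != decided_by).
// The error code returned on violation is hypothetical.
function checkDecide(caller: Caller, openedBy: string): AccessDecision {
  const roleCheck = checkRoles(caller, ["tns-fraud-analyst", "tns-fraud-analyst-lead"]);
  if (!roleCheck.allowed) {
    return roleCheck;
  }
  if (caller.userId === openedBy) {
    return { allowed: false, status: 403, code: "SEPARATION_OF_DUTIES_VIOLATION" };
  }
  return { allowed: true };
}
```

Running the role check first means a caller without a deciding role receives INSUFFICIENT_SCOPE and learns nothing about who opened the case.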

3. Data Protection

3.1 PII inventory & classification

Field	Classification	Storage	Transit
`signals.dst_msisdn`, `signals.src_msisdn`	CONFIDENTIAL (subscriber MSISDN)	Postgres + ClickHouse with disk encryption (LUKS); intra-VPC only	TLS 1.2+
`feed_indicators.value` (when MSISDN/MSISDN_BLOCK)	CONFIDENTIAL	Same	TLS 1.2+
`cases.evidence` (feature vectors, no body)	INTERNAL	Plain	TLS 1.2+
`events` (ClickHouse) `dst_msisdn`	CONFIDENTIAL	Disk-encrypted MergeTree	mTLS
`event_msisdnHash` in NATS events	INTERNAL	n/a	mTLS
`audit_log.before/after`	CONFIDENTIAL (may contain MSISDN)	Plain (Postgres only)	TLS 1.2+
`model_versions.training_set_hash`	INTERNAL	Plain	TLS 1.2+

Notable absence: the fraud-intel-service does not store SMS body content. Only template_hash (digit-folded sha256) is stored.
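
As an illustration of how a digit-folded hash avoids storing body content, the sketch below assumes that "digit-folded" means each run of digits is collapsed to a single placeholder before hashing. That interpretation, the `#` placeholder, and the function name `templateHash` are assumptions; the actual folding rules are not given in this document.

```typescript
import { createHash } from "node:crypto";

// Assumed meaning of "digit-folded": each run of ASCII digits collapses to one
// placeholder, so messages that differ only in numeric values share a hash.
function templateHash(body: string): string {
  const folded = body.replace(/\d+/g, "#");
  return createHash("sha256").update(folded, "utf8").digest("hex");
}
```

Under this assumption, "Your code is 123456" and "Your code is 98" hash to the same value, so two OTP messages from one template group together while the body itself is discarded.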

3.2 Encryption keys

Key	Store	Rotation
mTLS server + client certs	Vault PKI engine	30 days
Postgres credentials	Vault DB dynamic secret	24 h
ClickHouse credentials	Vault KV	30 days
Redis credentials	Vault KV	30 days
MinIO access keys (model artifacts, feed exports)	Vault KV	90 days
HSM signing key (PKCS#11 partition)	HSM (shared with sms-firewall-service, isolated partition)	Annual; per-key revocation on incident
Cosign signing key (optional model chain)	Vault KV (private) + public registry	6 months
`nationalSalt` for `msisdnHash`	Vault KV (high-sensitivity)	Annual (intentional cross-day cross-context unlinkability)
Regulator HSM public key (for import verification)	Vault KV (peer-key store)	On regulator request

3.3 Redaction rules

In events. Cross-context fraud events use msisdnHash, never the raw MSISDN. Tenant-scoped events may carry the raw senderId (tenant-public) but never a raw subscriber MSISDN unless the consumer is the tenant's owner service.
In logs. The Pino redactor masks dst_msisdn and src_msisdn to +CCNNN***. An ESLint rule forbids logger.info(..., { msisdn }) patterns at PR time.
In REST responses. The EvidenceRedactor interceptor replaces raw MSISDNs with +CCNNN*** for every role except tns-fraud-analyst-lead and platform.compliance.admin.
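
The +CCNNN*** mask can be sketched as a small pure function plus a recursive payload walker. This is an illustration only, independent of Pino and of the EvidenceRedactor interceptor: it assumes the mask keeps the plus sign and the first five digits, and `maskMsisdn` and `redactPayload` are hypothetical names.

```typescript
// Masks an MSISDN to the "+CCNNN***" shape: the plus sign and the first five
// digits are kept, and everything after them is replaced by "***".
function maskMsisdn(msisdn: string): string {
  const digits = msisdn.replace(/\D/g, "");
  return `+${digits.slice(0, 5)}***`;
}

// Field names taken from the redaction rule for logs.
const SENSITIVE_KEYS = new Set<string>(["dst_msisdn", "src_msisdn"]);

// Returns a deep copy of a payload with sensitive string fields masked.
function redactPayload(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactPayload(item));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] =
        SENSITIVE_KEYS.has(key) && typeof inner === "string"
          ? maskMsisdn(inner)
          : redactPayload(inner);
    }
    return out;
  }
  return value;
}
```

Returning a copy rather than mutating the input keeps the original object usable by code paths that are entitled to the raw value.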

3.4 Model artifact integrity

Every ModelVersion records artifactSha256 at registration.
At load time on Triton, the Python wrapper verifies sha256(downloaded) == artifactSha256. On a mismatch the wrapper refuses to load the artifact and emits fraud.model.artifact.tamper.v1 (CRITICAL).
Optional Sigstore/cosign signature chain (cosign_signature field) for supply-chain provenance.
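
The hash check itself is language-independent. The source names a Python wrapper; the sketch below uses TypeScript only to stay consistent with the other examples in this document. `verifyArtifact` and the `LoadDecision` shape are hypothetical, and actually publishing the tamper event is left to the caller.

```typescript
import { createHash } from "node:crypto";

type LoadDecision =
  | { load: true }
  | { load: false; event: "fraud.model.artifact.tamper.v1"; severity: "CRITICAL" };

// Compares the SHA-256 of the downloaded bytes with the digest recorded at
// registration. Hex digests are compared case-insensitively.
function verifyArtifact(downloaded: Uint8Array, artifactSha256: string): LoadDecision {
  const actualHex = createHash("sha256").update(downloaded).digest("hex");
  if (actualHex === artifactSha256.trim().toLowerCase()) {
    return { load: true };
  }
  // The caller is expected to refuse the load and publish this event.
  return { load: false, event: "fraud.model.artifact.tamper.v1", severity: "CRITICAL" };
}
```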

3.5 No cloud LLM for PII

The fraud-intel-service inference path is strictly on-cluster. INFERENCE_PROVIDER accepts only triton or mock; cloud LLM endpoints are not configurable.
An egress NetworkPolicy (see DEPLOYMENT_TOPOLOGY §1) forbids fraud-intel pods from reaching public internet IPs except for regulator-portal-service SFTP push and Vault PKI fetch.
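
A closed set of provider values is straightforward to enforce at boot. The following sketch assumes there is no default value, so an unset variable is rejected too; that choice and the name `parseInferenceProvider` are assumptions.

```typescript
type InferenceProvider = "triton" | "mock";

const ALLOWED_PROVIDERS: readonly InferenceProvider[] = ["triton", "mock"];

// Hypothetical config parser: accepts only the two documented values and
// throws on anything else, including an unset variable.
function parseInferenceProvider(raw: string | undefined): InferenceProvider {
  const match = ALLOWED_PROVIDERS.find((provider) => provider === raw);
  if (match === undefined) {
    throw new Error(`INFERENCE_PROVIDER must be one of: ${ALLOWED_PROVIDERS.join(", ")}`);
  }
  return match;
}
```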

4. Audit

All state changes are recorded in fraud.audit_log with actor, before/after snapshots, IP, user agent, and trace ID. The table is append-only at the database level (Postgres rules reject UPDATE and DELETE). Entries are retained for at least 13 months; expiry is enforced by pruning (dropping) old partitions.

Audit-relevant state changes additionally publish to fraud.audit.v1 for SIEM ingestion (Splunk / SigNoz consumer).

Triggers for audit entries:

Action	Entity
Create/update/delete	rule pattern, allowlist, feed registration
Decide	case (CONFIRM/DISMISS/REFINE)
Promote / rollback	model version
Suppress	detection (allowlist match)
Override	tenant tier, score
Import / export	MISP feed
Read of raw MSISDN by analyst	(meta-audit — auditor reads of analyst-level data are also audit-logged)

5. Fail-Soft / Fail-Closed-with-Default Posture

The fraud-intel-service is informational, not enforcing. Operational implications:

- Detection pipelines: fail-soft. A missed window is acceptable; the next pipeline run picks up the backlog. Alerts fire on > 15 min lag.
- Score gRPC: fail-closed-with-default. The caller treats unavailability as tier = PROBATION (neutral). This avoids cascading a fraud-intel outage into a compliance freeze. Compliance-engine consumers log the fall-through but proceed.
- Action dispatch (HITL executeAction): fail-closed. If the downstream NATS publish for the suggested action fails, the case is reverted to CONFIRMED (not EXECUTED) and a FraudActionDispatchFailed (HIGH) alert fires; the analyst is notified to retry.
- MISP feed export: fail-loud. A missed export day emits fraud.alert.feed.export.missed.v1 (PagerDuty MEDIUM); the regulator SLA is enforced by the heartbeat mechanism.

Security-wise, fail-soft on detection means availability attacks cannot disable enforcement — they can only delay detections; existing perimeter controls (sms-firewall-service rules, compliance-engine policy) remain in force.
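
On the caller side, the fail-closed-with-default posture for Score might be wrapped as shown below. This is a sketch under stated assumptions: `scoreWithDefault`, the `degraded` flag, and the callback shapes are hypothetical, and any rejected call is treated as unavailability.

```typescript
interface ScoreResult {
  tier: string;
  // True when the neutral default was substituted for a real score.
  degraded: boolean;
}

const DEFAULT_TIER = "PROBATION";

// Hypothetical caller-side wrapper: any failure of the Score call is logged
// through onFallThrough and replaced by the neutral default tier.
async function scoreWithDefault(
  callScore: () => Promise<{ tier: string }>,
  onFallThrough: (reason: string) => void,
): Promise<ScoreResult> {
  try {
    const result = await callScore();
    return { tier: result.tier, degraded: false };
  } catch (err) {
    onFallThrough(err instanceof Error ? err.message : String(err));
    return { tier: DEFAULT_TIER, degraded: true };
  }
}
```

Carrying an explicit `degraded` flag lets the consumer log the fall-through and still proceed, which matches the behaviour described for compliance-engine.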

6. Tenant Isolation

Postgres RLS on fraud.entity_scores keyed on tenant_id for the (future) tenant portal.
ClickHouse cluster has a single physical schema; isolation is enforced at the application layer (tenant_id predicate on every read). Tenant-scoped graph features are computed per-tenant; cross-tenant cohort detection is allowed and is the entire point of UC-04.
Per-tenant nationalSalt is not used (one platform salt) — cross-tenant cohort hashing requires a shared salt.
tns-fraud-analyst roles are platform-wide (Trust & Safety operates as a cross-tenant function).
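
The reason a shared salt is needed can be seen in a short sketch. The construction of msisdnHash is not specified in this document; HMAC-SHA256 keyed with the salt is used here purely as an assumed stand-in, and the digit normalisation step is likewise an assumption.

```typescript
import { createHmac } from "node:crypto";

// Assumed construction: HMAC-SHA256 over the digits of the MSISDN, keyed with
// the platform-wide salt. The real msisdnHash algorithm may differ.
function msisdnHash(msisdn: string, nationalSalt: string): string {
  const normalized = msisdn.replace(/\D/g, "");
  return createHmac("sha256", nationalSalt).update(normalized, "utf8").digest("hex");
}
```

With one platform salt, the same subscriber number observed in two tenants' traffic yields the same hash, which is what lets cross-tenant cohort detection join on it. A per-tenant salt would produce unrelated hashes for the same number and break that join.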

7. Threat Model

Threat	Likelihood	Impact	Mitigation
Model-poisoning via HITL feedback	Medium	High	Two-person rule on allowlist; per-cohort fairness audit on every retrain; outlier detection on feedback decisions; data scientist must review training set diff before promotion
Adversarial evasion (paraphrased OTP-keyword templates, MSISDN-block sweeps)	High	Medium	Adversarial-corpus test set must pass recall ≥ 0.80; continuous corpus expansion; ensemble (XGBoost + iForest + rule-based) raises evasion cost
Fraud feed injection (malicious MISP indicator from compromised peer)	Low	High	Signature verification mandatory; per-source reputation scoring; max-rate limit on imports; analyst review of high-impact indicators
Model artifact tampering	Low	Critical	SHA-256 verification on every load; Triton refuses to start with mismatched hash; optional cosign chain; MinIO bucket-policy denies writes from non-CI principals
Score-API DoS (high-frequency Score calls from compromised pod)	Low	Medium	mTLS + SPIFFE allowlist; per-cert per-pod rate limit (20K rps); HPA scales out; Redis L1 absorbs spike
Compromised analyst account	Low	High	MFA enforced; all decisions audit-logged with before/after; anomaly detection on decision velocity (`> 50 decisions/h` flags for review)
Cross-tenant MSISDN inference via cohort hash	Low	Medium	Cohort hash is `cityHash64` (collision-resistant for size-bounded sets); per-day salt rotation breaks cross-day correlation
Training-data leakage via shadow predictions	Low	Medium	Shadow predictions stored without raw MSISDNs; only feature vectors and scores
Feedback loop poisoning (analyst marks legitimate traffic as fraud to harm a competitor)	Low	High	Separation of duties; quarterly cross-validation; analyst decisions visible in audit log to compliance team; tenant appeals route to compliance-engine, not back to fraud-intel-service
HSM signing-key compromise	Very Low	Critical	HSM partition isolated from sms-firewall-service partition; quarterly key audit; revocation via Vault PKI revocation list
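
The decision-velocity mitigation for compromised analyst accounts (`> 50 decisions/h`) can be expressed as a sliding-window count. The sketch below is one possible reading: it treats the threshold as "more than 50 decisions inside any window shorter than one hour". The function name and the millisecond timestamp input are assumptions.

```typescript
const WINDOW_MS = 60 * 60 * 1000;
const MAX_DECISIONS_PER_HOUR = 50;

// Returns true when more than 50 decision timestamps fall inside any window
// shorter than one hour. Input order does not matter.
function exceedsDecisionVelocity(timestampsMs: readonly number[]): boolean {
  const sorted = timestampsMs.slice().sort((a, b) => a - b);
  let start = 0;
  for (let end = 0; end < sorted.length; end++) {
    // Advance the window start until the span is under one hour.
    while (sorted[end]! - sorted[start]! >= WINDOW_MS) {
      start++;
    }
    if (end - start + 1 > MAX_DECISIONS_PER_HOUR) {
      return true;
    }
  }
  return false;
}
```

A sliding window catches bursts that straddle a clock-hour boundary, which a fixed per-hour bucket count would miss.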

8. Secrets Management

Secret	Store	Injected as
gRPC server cert + key	Vault PKI → K8s Secret (Vault Agent)	File mount `/etc/tls/server.{crt,key}`
gRPC client cert (egress)	Vault PKI → K8s Secret	File mount
Postgres credentials	Vault DB dynamic secret	Env var `DATABASE_URL`
ClickHouse credentials	Vault KV	Env var
Redis credentials	Vault KV	Env var
NATS credentials	Vault KV	Env var + nkey file mount
MinIO access keys	Vault KV	Env var
HSM partition PIN	Vault KV (gated by SOC role)	Env var (memory-only; never written to disk)
`nationalSalt` for hashes	Vault KV	Env var (read once at boot, kept in memory)
Regulator HSM public key	Vault KV (peer-key store)	File mount

No secret is ever written to logs, events, or config files. Pre-commit gitleaks scan blocks accidental commits.

9. GDPR / Data-Subject Rights

Right to erasure (where applicable in regulated jurisdictions): On auth.user.erased.v1, fraud-intel-service:
- Redacts signals.src_msisdn / signals.dst_msisdn → [ERASED:<sha-tombstone>].
- Does not delete audit_log or cases rows; retention for fraud investigation overrides erasure for the limited fields required by national fraud-investigation law.
- Cohort hashes are unchanged (one-way; no reverse lookup possible).
Data minimisation: No SMS body content stored. Only template hashes and aggregate features.
Sub-processor list: No external sub-processors for fraud detection (all on-cluster). Regulator SFTP mirror is per the regulator agreement.

10. Security Testing

Contract tests on gRPC + REST per API_CONTRACTS §8.
Adversarial corpus test on every model promotion (recall ≥ 0.80).
Property-based tests on feature transformers (deterministic across re-runs, no NaN propagation).
Role-matrix integration test — every endpoint × every role — verifies 200/403 behaviour.
Quarterly penetration test scoped to the admin REST and internal MISP import endpoints.
ZAP baseline + API scan on every main-branch build.
Secret scanning (gitleaks); dependency scanning (osv-scanner); container scanning (trivy); model artifact scanning (cosign verify).
HSM signing-key access logged to SOC; quarterly key-use audit.
Feed-import signature-failure simulation: every quarter, an intentionally bad signature is sent; verify fraud.alert.feed.signature.invalid.v1 fires within 60 s and PagerDuty page is acknowledged.

11. Regulatory Posture

Aligned with GSMA FF.21 (A2P SMS Fraud Reference) on fraud taxonomy and inter-MNO cooperation.
MEF MEF-W63 (Inter-Carrier Fraud) terminology used in fraud feed export attributes.
ATRA national fraud-feed format mirrored daily via SFTP push.
Data-residency: All fraud signal storage and inference is in-country (kbl primary, mzr DR). Cross-border export is limited to MISP feed exports under regulator agreement.
Audit retention: 13 months minimum on fraud.audit_log; 7 years on fraud.cases (national fraud-investigation statute).
