
Fraud Intelligence Service — Deployment Topology

Version: 1.0
Status: Draft
Owner: Trust and Safety + Platform SRE
Last Updated: 2026-04-21
Companion: LOCAL_DEV_SETUP · FAILURE_MODES · SECURITY_MODEL · docs/architecture/ADR-0004


1. Kubernetes Resources

The service splits into three workloads:

  1. fraud-intel-service: NestJS API + gRPC + REST + NATS consumer + outbox relay. Stateless. 3-10 replicas.
  2. fraud-intel-worker: Python ML pipelines (AIT, SIM-box, OTP-harvest, grey-route, cohort, scoring, tenant-score recompute). KEDA-scaled by NATS lag and cron. 2-20 replicas in kbl; paused at 0 in mzr.
  3. triton-fraud-cpu + triton-fraud-gpu: Triton Inference Server pools (CPU and GPU) for model serving.

Plus the offline training stack:

  • Airflow scheduler + workers (training DAGs)
  • MLflow tracking server
  • GPU training nodes (spot, autoscaled)

1.1 fraud-intel-service Deployment (NestJS)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fraud-intel-service
  namespace: sms-platform
spec:
  replicas: 3
  selector:
    matchLabels: { app: fraud-intel-service }
  template:
    metadata:
      labels: { app: fraud-intel-service }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3014"
        prometheus.io/path: "/metrics"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector: { matchLabels: { app: fraud-intel-service } }
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: fraud-intel-service
          image: ghcr.io/ghasi/fraud-intel-service:latest
          ports:
            - { containerPort: 50054, name: grpc }
            - { containerPort: 3014, name: http }
            - { containerPort: 3015, name: http-internal }
          env:
            - { name: NODE_ENV, value: production }
            - { name: LOG_LEVEL, value: info }
            - { name: GRPC_PORT, value: "50054" }
            - { name: HTTP_PORT, value: "3014" }
            - { name: HTTP_INTERNAL_PORT, value: "3015" }
            - name: DATABASE_URL
              valueFrom: { secretKeyRef: { name: fraud-intel-db-secret, key: url } }
            - name: CLICKHOUSE_URL
              valueFrom: { secretKeyRef: { name: fraud-intel-ch-secret, key: url } }
            - name: REDIS_URL
              valueFrom: { secretKeyRef: { name: fraud-intel-redis-secret, key: url } }
            - name: NATS_URL
              valueFrom: { secretKeyRef: { name: nats-credentials, key: url } }
            - { name: NATS_CREDS_PATH, value: /etc/nats/creds.nk }
            - { name: TRITON_GRPC_URL, value: "triton-fraud-cpu.sms-platform.svc.cluster.local:8001" }
            - { name: TRITON_GPU_GRPC_URL, value: "triton-fraud-gpu.sms-platform.svc.cluster.local:8001" }
            - { name: INFERENCE_PROVIDER, value: triton }
            - { name: ANONYMIZE_BEFORE_INFERENCE, value: "true" }
            - { name: NATIONAL_SALT_PATH, value: /etc/secrets/national-salt }
            - { name: SCORE_CACHE_TTL_S, value: "900" }
            - { name: REGION, value: kbl }
          envFrom:
            - { secretRef: { name: fraud-intel-vault-secrets } }
          resources:
            requests: { cpu: 1000m, memory: 1Gi }
            limits: { cpu: 4000m, memory: 4Gi }
          livenessProbe:
            httpGet: { path: /health/live, port: http }
            initialDelaySeconds: 20
            periodSeconds: 10
          readinessProbe:
            httpGet: { path: /health/ready, port: http }
            initialDelaySeconds: 15
            periodSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - { name: tls-certs, mountPath: /etc/tls, readOnly: true }
            - { name: nats-creds, mountPath: /etc/nats, readOnly: true }
            - { name: secrets, mountPath: /etc/secrets, readOnly: true }
      volumes:
        - { name: tls-certs, secret: { secretName: fraud-intel-tls } }
        - { name: nats-creds, secret: { secretName: fraud-intel-nats-creds } }
        - { name: secrets, secret: { secretName: fraud-intel-app-secrets } }

1.2 fraud-intel-worker Deployment (Python ML pipelines)

apiVersion: apps/v1
kind: Deployment
metadata: { name: fraud-intel-worker, namespace: sms-platform }
spec:
  replicas: 2
  selector: { matchLabels: { app: fraud-intel-worker } }
  template:
    metadata:
      labels: { app: fraud-intel-worker }
      annotations: { prometheus.io/scrape: "true", prometheus.io/port: "9091" }
    spec:
      nodeSelector: { workload: ml-cpu }
      containers:
        - name: worker
          image: ghcr.io/ghasi/fraud-intel-worker:latest
          env:
            - { name: WORKER_MODE, value: pipelines } # pipelines | streaming | scoring
            - { name: TRITON_GRPC_URL, value: "triton-fraud-cpu.sms-platform.svc.cluster.local:8001" }
            - { name: CLICKHOUSE_URL, valueFrom: { secretKeyRef: { name: fraud-intel-ch-secret, key: url } } }
            - { name: PG_URL, valueFrom: { secretKeyRef: { name: fraud-intel-db-secret, key: url } } }
            - { name: REDIS_URL, valueFrom: { secretKeyRef: { name: fraud-intel-redis-secret, key: url } } }
            - { name: NATS_URL, valueFrom: { secretKeyRef: { name: nats-credentials, key: url } } }
            - { name: METRICS_PORT, value: "9091" }
          resources:
            requests: { cpu: 4, memory: 16Gi }
            limits: { cpu: 8, memory: 32Gi }

1.3 KEDA scaler (worker autoscaling on NATS lag)

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata: { name: fraud-intel-worker-scaler, namespace: sms-platform }
spec:
  scaleTargetRef: { name: fraud-intel-worker }
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: nats-jetstream
      metadata:
        natsServerMonitoringEndpoint: nats.sms-platform.svc:8222
        stream: SMS_STATUS
        consumer: fraud-ingestor
        lagThreshold: "5000"
    - type: cron
      metadata:
        timezone: Asia/Kabul
        # start and end are identical as written, which leaves no window in which
        # desiredReplicas applies; set end to the point where the window should close.
        start: "*/5 * * * *"
        end: "*/5 * * * *"
        desiredReplicas: "5"
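
The lag trigger above can be read as "one worker per 5,000 pending messages, within the 2-20 bounds". The sketch below is only an approximation of that arithmetic under the usual average-value interpretation of `lagThreshold`; the function names are hypothetical, and stabilisation windows, cooldowns and polling intervals are not modelled.

```python
import math


def replicas_for_lag(lag: int, lag_threshold: int = 5000,
                     min_replicas: int = 2, max_replicas: int = 20) -> int:
    """Replica count proposed for a given consumer lag, clamped to the bounds."""
    proposed = math.ceil(lag / lag_threshold)
    return max(min_replicas, min(max_replicas, proposed))


def replicas_with_cron(lag: int, cron_active: bool, cron_replicas: int = 5) -> int:
    """When several triggers are active, the largest proposal wins."""
    lag_based = replicas_for_lag(lag)
    return max(lag_based, cron_replicas) if cron_active else lag_based


if __name__ == "__main__":
    for lag in (0, 12_000, 42_000, 500_000):
        print(lag, replicas_for_lag(lag), replicas_with_cron(lag, cron_active=True))
```

Under these assumptions a lag of 42,000 messages asks for 9 workers, and any lag above 100,000 is capped at the 20-replica ceiling.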

1.4 Triton Inference Server (CPU pool)

apiVersion: apps/v1
kind: Deployment
metadata: { name: triton-fraud-cpu, namespace: sms-platform }
spec:
  replicas: 3
  selector: { matchLabels: { app: triton-fraud-cpu } }
  template:
    metadata:
      labels: { app: triton-fraud-cpu }
      annotations: { prometheus.io/scrape: "true", prometheus.io/port: "8002" }
    spec:
      containers:
        - name: triton
          image: nvcr.io/nvidia/tritonserver:24.06-py3
          args:
            - tritonserver
            - --model-repository=/models
            - --model-control-mode=poll
            - --repository-poll-secs=30
            - --strict-model-config=false
            - --backend-config=fil,backend_config.cmdline=use_cuda=false
          ports:
            - { containerPort: 8000, name: http }
            - { containerPort: 8001, name: grpc }
            - { containerPort: 8002, name: metrics }
          resources:
            requests: { cpu: 4, memory: 8Gi }
            limits: { cpu: 16, memory: 16Gi }
          volumeMounts:
            - { name: model-repo, mountPath: /models, readOnly: true }
      volumes:
        - { name: model-repo, persistentVolumeClaim: { claimName: triton-model-repo-pvc } }

1.5 Triton Inference Server (GPU pool)

apiVersion: apps/v1
kind: Deployment
metadata: { name: triton-fraud-gpu, namespace: sms-platform }
spec:
  replicas: 2
  # apps/v1 requires a selector and matching pod labels; the PodDisruptionBudget
  # in section 1.7 selects on the same app label.
  selector: { matchLabels: { app: triton-fraud-gpu } }
  template:
    metadata:
      labels: { app: triton-fraud-gpu }
    spec:
      nodeSelector: { gpu: t4 }
      tolerations:
        - { key: nvidia.com/gpu, operator: Exists, effect: NoSchedule }
      containers:
        - name: triton
          image: nvcr.io/nvidia/tritonserver:24.06-py3
          args: [ tritonserver, --model-repository=/models, --strict-model-config=false ]
          resources:
            requests: { cpu: 4, memory: 16Gi, nvidia.com/gpu: 1 }
            limits: { cpu: 8, memory: 24Gi, nvidia.com/gpu: 1 }

1.6 HPA for fraud-intel-service

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: fraud-intel-service-hpa, namespace: sms-platform }
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: fraud-intel-service }
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource: { name: cpu, target: { type: Utilization, averageUtilization: 65 } }
    - type: Resource
      resource: { name: memory, target: { type: Utilization, averageUtilization: 75 } }
    - type: Pods
      pods:
        metric: { name: fraud_score_grpc_duration_seconds_p95 }
        target: { type: AverageValue, averageValue: "0.04" } # scale up if P95 > 40 ms
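
To make the three-metric HPA above concrete, here is a small sketch of the documented Kubernetes scaling rule: desired = ceil(current × metric / target) per metric, take the largest proposal, then clamp to the 3-10 bounds. The function names and sample readings are illustrative assumptions; the 10% tolerance is the Kubernetes default, and stabilisation behaviour is not modelled.

```python
import math


def desired_replicas(current: int, metric_value: float, target: float,
                     tolerance: float = 0.1) -> int:
    """Per-metric proposal: ceil(current * metric / target), unless within tolerance."""
    ratio = metric_value / target
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to target: no change proposed
    return math.ceil(current * ratio)


def hpa_decision(current: int, metrics: dict,
                 min_replicas: int = 3, max_replicas: int = 10) -> int:
    """Largest per-metric proposal wins, then clamp to the HPA bounds."""
    proposals = [desired_replicas(current, value, target)
                 for value, target in metrics.values()]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # (observed value, target) pairs; latency is expressed in milliseconds here.
    readings = {"cpu_pct": (50, 65), "memory_pct": (60, 75), "p95_ms": (60, 40)}
    print(hpa_decision(current=3, metrics=readings))
```

With these sample readings CPU and memory are under target, but a 60 ms P95 against the 40 ms target proposes ceil(3 × 1.5) = 5 replicas, so latency drives the scale-up.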

1.7 PodDisruptionBudgets

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: fraud-intel-service-pdb, namespace: sms-platform }
spec:
  minAvailable: 2
  selector: { matchLabels: { app: fraud-intel-service } }
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: triton-fraud-cpu-pdb, namespace: sms-platform }
spec:
  minAvailable: 2
  selector: { matchLabels: { app: triton-fraud-cpu } }
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: triton-fraud-gpu-pdb, namespace: sms-platform }
spec:
  minAvailable: 1
  selector: { matchLabels: { app: triton-fraud-gpu } }

1.8 Services

apiVersion: v1
kind: Service
metadata: { name: fraud-intel-grpc, namespace: sms-platform }
spec:
  selector: { app: fraud-intel-service }
  ports: [{ name: grpc, port: 50054, targetPort: grpc }]
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata: { name: fraud-intel-http, namespace: sms-platform }
spec:
  selector: { app: fraud-intel-service }
  ports: [{ name: http, port: 3014, targetPort: http }]
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata: { name: fraud-intel-internal, namespace: sms-platform }
spec:
  selector: { app: fraud-intel-service }
  ports: [{ name: http-internal, port: 3015, targetPort: http-internal }]
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata: { name: triton-fraud-cpu, namespace: sms-platform }
spec:
  selector: { app: triton-fraud-cpu }
  ports:
    - { name: http, port: 8000, targetPort: http }
    - { name: grpc, port: 8001, targetPort: grpc }
    - { name: metrics, port: 8002, targetPort: metrics }
---
# Assumed addition: TRITON_GPU_GRPC_URL in section 1.1 points at
# triton-fraud-gpu.sms-platform.svc.cluster.local:8001, which needs a matching
# Service. Adjust the port list to the real manifest.
apiVersion: v1
kind: Service
metadata: { name: triton-fraud-gpu, namespace: sms-platform }
spec:
  selector: { app: triton-fraud-gpu }
  ports:
    - { name: grpc, port: 8001, targetPort: 8001 }

1.9 NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: fraud-intel-netpol, namespace: sms-platform }
spec:
  podSelector: { matchLabels: { app: fraud-intel-service } }
  policyTypes: [Ingress, Egress]
  ingress:
    # gRPC consumers
    - from:
        - { podSelector: { matchLabels: { app: compliance-engine } } }
        - { podSelector: { matchLabels: { app: routing-engine } } }
        - { podSelector: { matchLabels: { app: sender-id-registry-service } } }
        - { podSelector: { matchLabels: { app: noc-dashboard } } }
      ports: [{ port: 50054 }]
    # REST admin (via Kong)
    - from: [{ podSelector: { matchLabels: { app: kong } } }]
      ports: [{ port: 3014 }]
    # Internal mTLS (regulator-portal-service, peer-mno-bridge)
    - from:
        - { podSelector: { matchLabels: { app: regulator-portal-service } } }
        - { podSelector: { matchLabels: { app: peer-mno-bridge } } }
      ports: [{ port: 3015 }]
    # Prometheus
    - from: [{ namespaceSelector: { matchLabels: { name: monitoring } } }]
      ports: [{ port: 3014 }]
  egress:
    - to: [{ podSelector: { matchLabels: { app: postgresql } } }]
      ports: [{ port: 5432 }]
    - to: [{ podSelector: { matchLabels: { app: clickhouse } } }]
      ports: [{ port: 9000 }, { port: 8123 }]
    - to: [{ podSelector: { matchLabels: { app: redis } } }]
      ports: [{ port: 6379 }]
    - to: [{ podSelector: { matchLabels: { app: nats } } }]
      ports: [{ port: 4222 }]
    - to: [{ podSelector: { matchLabels: { app: triton-fraud-cpu } } }]
      ports: [{ port: 8001 }]
    - to: [{ podSelector: { matchLabels: { app: triton-fraud-gpu } } }]
      ports: [{ port: 8001 }]
    - to: [{ podSelector: { matchLabels: { app: minio } } }]
      ports: [{ port: 9000 }]
    - to: [{ podSelector: { matchLabels: { app: vault } } }]
      ports: [{ port: 8200 }]
    # Egress to regulator SFTP (cloud, IP-allowlisted)
    - to:
        - ipBlock: { cidr: 41.74.0.0/16 } # ATRA SFTP CIDR (placeholder)
      ports: [{ port: 22 }]
    # DNS (assumed addition): once Egress is restricted, the *.svc.cluster.local
    # names used by the service cannot be resolved without a port 53 rule.
    # Narrow the destination to the cluster DNS pods in the real manifest.
    - ports: [{ port: 53, protocol: UDP }, { port: 53, protocol: TCP }]
    # NO egress to public LLM providers: explicit deny by omission

2. Background Workers

| Worker | Schedule | Replicas | Description |
|---|---|---|---|
| IngestionConsumer | always-on | KEDA (NATS lag) | Stream firewall.audit.v1, sms.events.status.v1, sms.dlr.inbound.v1, cdr.generated.v1, consent.revoked.v1 → ClickHouse events |
| OtpGrindingStreaming | always-on | 2 | Real-time OTP-grinding aggregator (Redis sorted sets) |
| AitPipeline | */5 * * * * | 1-3 (KEDA cron) | 5-min AIT XGBoost pipeline |
| AitCohortJob | 0 * * * * | 1 | Hourly cohort GraphSAGE |
| SimboxPipeline | */30 * * * * | 1 | 30-min SIM-box detector |
| GreyRoutePipeline | 15 * * * * | 1 | Hourly grey-route |
| OtpHarvestPipeline | */30 * * * * | 1 | 30-min OTP-harvest cohort+revocation |
| TenantScoreRecompute | 0 * * * * | 1 | Hourly score recompute |
| MispFeedExport | 0 4 * * * Asia/Kabul | 1 | Daily MISP/STIX export to MinIO + SFTP |
| MispFeedDecayJob | 0 5 * * * | 1 | Apply daily decay to imported indicators |
| PartitionMaintenance | 0 3 * * * | 1 | Provision next 3 months of Postgres partitions |
| OutboxRelay | always-on (in-process) | per pod | Publishes outbox to NATS |
| CaseStaleScanner | 0 6 * * * | 1 | Auto-close cases > 30 d |
| ModelDriftScanner | 0 1 * * * | 1 | PSI / Wasserstein drift checks; emit alerts |

Workers use Redis distributed locks for multi-replica safety (SET NX EX on fraud:lock:<worker>).
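
A minimal sketch of the `SET NX EX` lock pattern named above, assuming a redis-py-style client. `InMemoryRedis` is a hypothetical stand-in so the example runs without a server, and the 300-second TTL and the function names are illustrative, not values from this service.

```python
import time
import uuid


class InMemoryRedis:
    """Stand-in for the small subset of redis-py used below (set/get/delete)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry as a monotonic timestamp or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= time.monotonic():
            del self._data[key]  # expired: behave as if the key never existed
            return None
        return entry

    def set(self, name, value, ex=None, nx=False):
        if nx and self._live(name) is not None:
            return None  # redis-py returns None when NX prevents the write
        expiry = time.monotonic() + ex if ex is not None else None
        self._data[name] = (value, expiry)
        return True

    def get(self, name):
        entry = self._live(name)
        return entry[0] if entry else None

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def acquire_lock(client, worker: str, ttl_s: int = 300):
    """Try to take fraud:lock:<worker>; return an ownership token or None."""
    token = uuid.uuid4().hex
    ok = client.set(f"fraud:lock:{worker}", token, nx=True, ex=ttl_s)
    return token if ok else None


def release_lock(client, worker: str, token: str) -> bool:
    """Release only if we still own the lock.

    Against a real Redis this check-and-delete should run atomically
    (for example in a Lua script); it is split here for readability.
    A real client also returns bytes unless decode_responses=True is set.
    """
    key = f"fraud:lock:{worker}"
    if client.get(key) == token:
        client.delete(key)
        return True
    return False


if __name__ == "__main__":
    r = InMemoryRedis()
    mine = acquire_lock(r, "AitPipeline")
    print("first replica holds lock:", mine is not None)
    print("second replica blocked:", acquire_lock(r, "AitPipeline") is None)
    print("released:", release_lock(r, "AitPipeline", mine))
```

The random token matters: if a run outlives the TTL and another replica takes the lock, the original holder's release becomes a no-op instead of deleting someone else's lock.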


3. Region Affinity

Per ADR-0004:

| Region | Role | Replicas (service / worker / Triton-CPU / Triton-GPU) |
|---|---|---|
| kbl (Kabul) | Primary write region; all pipelines run here; canonical model registry | 3 / 2-20 / 3 / 2 |
| mzr (Mazar-i-Sharif) | Warm standby; reads entity_scores from Postgres replica; Score gRPC serves regional traffic | 2 / 0 (paused) / 2 / 0 |
| Failover RTO | 5 min (manual operator confirmation per ADR-0004 §3.4) | |
| Cross-region NATS bridging | NATS Leaf Node FRAUD_* streams mirror kbl → mzr | Lag P95 ≤ 5 s |

4. Infrastructure Dependencies

| Dependency | Version | Topology | Owner |
|---|---|---|---|
| PostgreSQL | 15+ | Primary + read replica per region; PgBouncer in transaction pool mode | Platform DBA |
| ClickHouse | 23.8+ | 3 shards × 2 replicas; ZooKeeper/Keeper coordination | Platform SRE (data) |
| Redis | 7.0+ | Cluster mode; fraud uses DB 4 | Platform SRE |
| NATS JetStream | 2.10+ | 3-node cluster; dedicated FRAUD_* streams | Platform SRE |
| Triton Inference Server | 24.06+ | CPU pool (3 replicas) + GPU pool (2 × T4) | Platform Engineering |
| MinIO | RELEASE.2024-08-29T+ | 4-node erasure-coded cluster; bucket-policy enforced | Platform SRE |
| Vault | 1.16+ | HA mode; PKI engine for mTLS, KV v2 for secrets, Transit for hashing | Security |
| HSM (PKCS#11) | nCipher nShield (shared) | Isolated partition fraud-intel from sms-firewall | Security |
| Airflow | 2.9+ | KubernetesExecutor on dedicated airflow-fraud namespace | Data Engineering |
| MLflow | 2.13+ | Tracking server + S3 artifact store | Data Engineering |

5. Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| NODE_ENV | Yes | | production / staging / development |
| GRPC_PORT | No | 50054 | gRPC listener |
| HTTP_PORT | No | 3014 | REST listener |
| HTTP_INTERNAL_PORT | No | 3015 | Internal mTLS listener |
| DATABASE_URL | Yes | | Postgres connection string |
| CLICKHOUSE_URL | Yes | | ClickHouse native protocol URL |
| REDIS_URL | Yes | | Redis (DB 4) |
| NATS_URL | Yes | | NATS server URL |
| NATS_CREDS_PATH | Yes | | Path to NATS credentials nkey file |
| TRITON_GRPC_URL | Yes | | Triton CPU pool gRPC |
| TRITON_GPU_GRPC_URL | Yes | | Triton GPU pool gRPC |
| INFERENCE_PROVIDER | No | triton | triton / mock only |
| ANONYMIZE_BEFORE_INFERENCE | No | true | Forced true in non-dev |
| NATIONAL_SALT_PATH | Yes | | File mount of nationalSalt |
| SCORE_CACHE_TTL_S | No | 900 | Redis L1 TTL |
| SCORE_REFRESH_QUEUE | No | fraud:score:refresh:queue | Redis list name |
| EVAL_BUDGET_MS | No | 45 | Score gRPC internal budget |
| REGION | Yes | | kbl / mzr |
| GRPC_TLS_ENABLED | No | true | Forced true non-dev (start-up guard) |
| TLS_CERT_PATH, TLS_KEY_PATH, TLS_CA_PATH | If TLS | | mTLS certs |
| LOG_LEVEL | No | info | debug / info / warn / error |
| HSM_PIN_PATH | If feed export | | File mount of HSM partition PIN |

INFERENCE_PROVIDER=cloud (Anthropic/OpenAI) is disallowed: the start-up guard refuses to boot the service when it is set.
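
The start-up guard itself is not shown in this document, and the real service is NestJS. As an illustration only, the rules stated in the table can be sketched in Python as below. `check_startup_config` and `ConfigError` are hypothetical names, and treating "forced true in non-dev" as a refusal to boot (rather than a silent override) is an assumption.

```python
ALLOWED_ENVS = {"production", "staging", "development"}
ALLOWED_PROVIDERS = {"triton", "mock"}  # "cloud" is deliberately absent


class ConfigError(RuntimeError):
    """Raised when the environment violates a start-up rule."""


def check_startup_config(env: dict) -> None:
    """Raise ConfigError if the given environment mapping must not boot."""
    node_env = env.get("NODE_ENV")
    if node_env not in ALLOWED_ENVS:
        raise ConfigError(f"NODE_ENV must be one of {sorted(ALLOWED_ENVS)}")

    provider = env.get("INFERENCE_PROVIDER", "triton")
    if provider not in ALLOWED_PROVIDERS:
        raise ConfigError(f"INFERENCE_PROVIDER={provider} is not allowed")

    if node_env != "development":
        # Both flags default to "true" and may only be relaxed in development.
        for flag in ("ANONYMIZE_BEFORE_INFERENCE", "GRPC_TLS_ENABLED"):
            if env.get(flag, "true") != "true":
                raise ConfigError(f"{flag} must be true outside development")


if __name__ == "__main__":
    try:
        check_startup_config({"NODE_ENV": "production", "INFERENCE_PROVIDER": "cloud"})
    except ConfigError as exc:
        print("refusing to boot:", exc)
```

Checking against an allowlist rather than rejecting the literal value `cloud` means any future provider name is also refused until it is added explicitly.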


6. Deployment Environments

| Environment | service replicas | worker replicas | Triton CPU | Triton GPU | Notes |
|---|---|---|---|---|---|
| Production (kbl) | 3-10 (HPA) | 2-20 (KEDA) | 3 | 2 × T4 | Full feature set |
| Production (mzr) | 2 | 0 (paused) | 2 | 0 | Score gRPC only; pipelines paused |
| Staging | 2 | 2 | 1 | 1 (shared) | Daily synthetic load |
| Development | 1 | 1 | 1 (CPU only) | mock | Dockerised; no GPU |
| CI | 1 | 1 | mock | mock | Deterministic responses |

7. Image Tagging & CI/CD

  • Image: ghcr.io/ghasi/fraud-intel-service:<git-sha>
  • Helm chart: charts/fraud-intel-service versioned alongside.
  • Argo CD application: apps/fraud-intel-service.yaml with sync wave 4 (after compliance-engine, before NOC dashboard).
  • Canary: Argo Rollouts with 10% → 25% → 50% → 100% over 30 min, checking SLO burn rate at each step.
  • Rollback: automatic on FraudScoreP95High or FraudScoreUnavailable firing during canary.
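
The canary and rollback bullets describe a simple gate: move to the next traffic weight, check the two alerts, and abort if either fires. The sketch below models only that control flow. `run_canary` and the alert-lookup callback are hypothetical, the per-step pause is left out because the document gives only the 30-minute total, and in practice Argo Rollouts performs this logic itself.

```python
from typing import Callable, Iterable, Tuple

CANARY_STEPS = (10, 25, 50, 100)  # traffic weights in percent
ROLLBACK_ALERTS = frozenset({"FraudScoreP95High", "FraudScoreUnavailable"})


def run_canary(firing_alerts_at: Callable[[int], Iterable[str]]) -> Tuple[str, int]:
    """Walk the canary steps; roll back at the first step where a gate alert fires.

    firing_alerts_at(weight) returns the alert names firing while the canary
    receives `weight` percent of traffic.
    """
    for weight in CANARY_STEPS:
        if ROLLBACK_ALERTS.intersection(firing_alerts_at(weight)):
            return ("rolled_back", weight)
    return ("promoted", CANARY_STEPS[-1])


if __name__ == "__main__":
    # Healthy release: no alerts at any step.
    print(run_canary(lambda weight: []))
    # Latency alert appears once the canary takes a quarter of the traffic.
    print(run_canary(lambda weight: ["FraudScoreP95High"] if weight >= 25 else []))
```

Alerts outside the two named ones do not stop the rollout in this sketch, which mirrors the bullet's wording that rollback is tied specifically to FraudScoreP95High and FraudScoreUnavailable.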

8. Resource Budget Summary

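| Component | CPU req | Memory req | GPU | Pods | Total CPU | Total Mem |
|---|---|---|---|---|---|---|
| fraud-intel-service | 1 | 1 Gi | | 3-10 | 3-10 | 3-10 Gi |
| fraud-intel-worker | 4 | 16 Gi | | 2-20 | 8-80 | 32-320 Gi |
| triton-fraud-cpu | 4 | 8 Gi | | 3 | 12 | 24 Gi |
| triton-fraud-gpu | 4 | 16 Gi | 1 × T4 | 2 | 8 | 32 Gi |
| Steady-state (median) | | | | | ~50 vCPU | ~150 Gi |
| Burst | | | | | ~110 vCPU | ~390 Gi |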
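
The burst row can be cross-checked by multiplying each component's per-pod request by its pod range. The short script below does that with the figures from the table above; the names are illustrative, and the steady-state median is a stated estimate that is not derived here.

```python
# (cpu request in vCPU, memory request in Gi, min pods, max pods), from the table above
COMPONENTS = {
    "fraud-intel-service": (1, 1, 3, 10),
    "fraud-intel-worker": (4, 16, 2, 20),
    "triton-fraud-cpu": (4, 8, 3, 3),
    "triton-fraud-gpu": (4, 16, 2, 2),
}


def budget_range(components=COMPONENTS) -> dict:
    """Sum requested CPU and memory at minimum and maximum replica counts."""
    cpu_min = sum(cpu * lo for cpu, _, lo, _ in components.values())
    cpu_max = sum(cpu * hi for cpu, _, _, hi in components.values())
    mem_min = sum(mem * lo for _, mem, lo, _ in components.values())
    mem_max = sum(mem * hi for _, mem, _, hi in components.values())
    return {"cpu": (cpu_min, cpu_max), "mem_gi": (mem_min, mem_max)}


if __name__ == "__main__":
    totals = budget_range()
    print("vCPU requested: %d to %d" % totals["cpu"])
    print("Memory requested (Gi): %d to %d" % totals["mem_gi"])
```

The maximum comes to 110 vCPU and 386 Gi, which agrees with the rounded burst figures; the floor at minimum replicas is 31 vCPU and 91 Gi, so the ~50 vCPU / ~150 Gi median sits in the lower part of the range.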