Compliance Layer — Deployment Topology

Status: populated | Last updated: 2026-04-18

1. Kubernetes Resources

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: compliance-engine
  namespace: sms-platform
spec:
  replicas: 3
  selector:
    matchLabels:
      app: compliance-engine
  template:
    metadata:
      labels:
        app: compliance-engine
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3002"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: compliance-engine
          image: ghcr.io/ghasi/compliance-engine:latest
          ports:
            - containerPort: 50052 # gRPC
              name: grpc
            - containerPort: 3002 # HTTP (metrics, health, REST)
              name: http
          env:
            - name: NODE_ENV
              value: production
            - name: LOG_LEVEL
              value: info
            - name: GRPC_PORT
              value: "50052"
            - name: HTTP_PORT
              value: "3002"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: compliance-engine-db-secret
                  key: url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: compliance-engine-redis-secret
                  key: url
            - name: NATS_URL
              valueFrom:
                secretKeyRef:
                  name: nats-credentials
                  key: url
            # AI provider: local LLM primary
            - name: AI_PROVIDER
              value: local
            - name: LOCAL_LLM_URL
              value: http://local-llm-service.sms-platform.svc.cluster.local:8000
            - name: LOCAL_LLM_MODEL
              value: llama-3.1-8b-instruct-awq
            - name: ANONYMIZE_BODY_BEFORE_AI
              value: "true"
            # External LLM failover (optional)
            - name: AI_FAILOVER_PROVIDER
              value: "" # set to 'claude' or 'openai' to enable failover
            # Budget / timing
            - name: EVAL_BUDGET_MS
              value: "450"
            - name: AI_TIMEOUT_MS
              value: "2000"
            - name: HOLD_QUEUE_TTL_HOURS
              value: "24"
            - name: SCORING_INTERVAL_MINUTES
              value: "15"
          envFrom:
            - secretRef:
                name: compliance-engine-vault-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 2000m
              memory: 1Gi
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: tls-certs
              mountPath: /etc/tls
              readOnly: true
      volumes:
        - name: tls-certs
          secret:
            secretName: compliance-engine-tls

Local LLM Deployment (separate, GPU-backed)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: local-llm
  namespace: sms-platform
spec:
  replicas: 2
  selector:
    matchLabels:
      app: local-llm
  template:
    metadata:
      labels:
        app: local-llm
    spec:
      nodeSelector:
        gpu: "true"
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args:
            - "--model=casperhansen/llama-3.1-8b-instruct-awq"
            # Serve under the name compliance-engine sends as LOCAL_LLM_MODEL;
            # without this flag vLLM only answers to the --model value.
            - "--served-model-name=llama-3.1-8b-instruct-awq"
            - "--quantization=awq"
            - "--max-model-len=4096"
            - "--gpu-memory-utilization=0.85"
          ports:
            - containerPort: 8000
              name: http
          resources:
            requests:
              cpu: 4
              memory: 16Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 8
              memory: 24Gi
              nvidia.com/gpu: 1
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 120 # model load time
            periodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: local-llm-service
  namespace: sms-platform
spec:
  selector:
    app: local-llm
  ports:
    - port: 8000
      targetPort: http
  type: ClusterIP

Horizontal Pod Autoscaler (compliance-engine)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: compliance-engine-hpa
  namespace: sms-platform
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: compliance-engine
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
    - type: Pods
      pods:
        metric:
          name: compliance_evaluation_duration_seconds_p95
        target:
          type: AverageValue
          averageValue: "0.4" # scale up if P95 approaches 400 ms
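To make the three targets above concrete, the sketch below applies the scaling rule from the Kubernetes HPA documentation: for each metric, desiredReplicas = ceil(currentReplicas × currentValue / targetValue), the largest proposal wins, and the result is clamped to minReplicas/maxReplicas. Utilization percentages are measured against the container's resource requests (500m CPU and 512Mi memory in the Deployment). The function name and the sample readings are illustrative assumptions, and stabilization windows and pod-readiness handling are left out.

```python
import math


def desired_replicas(current_replicas, readings, min_replicas=3,
                     max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation.

    readings: list of (current_value, target_value) pairs, one per metric.
    tolerance: ratios within this distance of 1.0 propose no change
               (0.1 is the Kubernetes default).
    """
    proposals = []
    for current_value, target_value in readings:
        ratio = current_value / target_value
        if abs(ratio - 1.0) <= tolerance:
            proposals.append(current_replicas)  # close enough: no change
        else:
            proposals.append(math.ceil(current_replicas * ratio))
    # With several metrics, the largest proposal is used.
    desired = max(proposals)
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical readings with 4 replicas running:
# CPU at 90% (target 65), memory at 60% (target 75), P95 at 0.38 s (target 0.4).
replicas = desired_replicas(4, [(90, 65), (60, 75), (0.38, 0.4)])
```

With those sample readings CPU drives the decision and `replicas` is 6: memory alone would keep 4, and the latency metric sits inside the tolerance band.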

PodDisruptionBudget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: compliance-engine-pdb
  namespace: sms-platform
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: compliance-engine
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: local-llm-pdb
  namespace: sms-platform
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: local-llm
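As a quick check on what these budgets permit, the sketch below computes how many voluntary evictions (for example during a node drain) each budget allows: the number of healthy pods minus minAvailable, never below zero. The helper name is an illustrative assumption; the replica counts come from the manifests in this section.

```python
def disruptions_allowed(healthy_pods, min_available):
    """Voluntary evictions a minAvailable budget permits right now."""
    return max(0, healthy_pods - min_available)


# compliance-engine at the HPA floor of 3 replicas, minAvailable: 2
engine_at_floor = disruptions_allowed(3, 2)
# local-llm with both replicas healthy, minAvailable: 1
llm_healthy = disruptions_allowed(2, 1)
# local-llm with one replica already unhealthy: evictions are blocked
llm_degraded = disruptions_allowed(1, 1)
```

Under these numbers each workload tolerates one eviction at a time, and a drain stalls on local-llm whenever one of its two pods is already down.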

Services

apiVersion: v1
kind: Service
metadata:
  name: compliance-engine-grpc
  namespace: sms-platform
spec:
  selector:
    app: compliance-engine
  ports:
    - name: grpc
      port: 50052
      targetPort: grpc
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  name: compliance-engine-http
  namespace: sms-platform
spec:
  selector:
    app: compliance-engine
  ports:
    - name: http
      port: 3002
      targetPort: http
  type: ClusterIP

NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: compliance-engine-netpol
  namespace: sms-platform
spec:
  podSelector:
    matchLabels:
      app: compliance-engine
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: sms-orchestrator
      ports:
        - port: 50052
    - from:
        - podSelector:
            matchLabels:
              app: admin-dashboard
      ports:
        - port: 3002
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - port: 3002
  egress:
    - to:
        - podSelector:
            matchLabels: { app: postgresql }
      ports: [{ port: 5432 }]
    - to:
        - podSelector:
            matchLabels: { app: redis }
      ports: [{ port: 6379 }]
    - to:
        - podSelector:
            matchLabels: { app: nats }
      ports: [{ port: 4222 }]
    - to:
        - podSelector:
            matchLabels: { app: local-llm }
      ports: [{ port: 8000 }]
    # External LLM egress (only when AI_FAILOVER_PROVIDER is set)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except: [10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16]
      ports: [{ port: 443 }]
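The final egress rule is the least obvious one: it opens port 443 to any IPv4 address except the three private ranges. The sketch below models only that ipBlock rule with Python's standard `ipaddress` module, as a way to reason about which destinations it covers. The function name is an illustrative assumption, and the pod-selector rules (PostgreSQL, Redis, NATS, local LLM) are evaluated separately by the cluster and are not modelled here.

```python
import ipaddress

# Values copied from the ipBlock rule in the NetworkPolicy.
ALLOWED_CIDR = ipaddress.ip_network("0.0.0.0/0")
EXCEPT_CIDRS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def external_egress_allowed(ip, port):
    """True if the ipBlock rule alone would permit this destination."""
    address = ipaddress.ip_address(ip)
    if port != 443 or address.version != 4:
        return False
    in_except = any(address in network for network in EXCEPT_CIDRS)
    return address in ALLOWED_CIDR and not in_except
```

For example, `external_egress_allowed("172.20.0.5", 443)` is false because 172.20.0.5 falls inside 172.16.0.0/12, while a public address on port 443 passes.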

2. Background Workers

| Worker | Schedule | Description |
|---|---|---|
| TenantScoringWorker | Every 15 min | Recalculates compliance scores for all active tenants |
| HoldQueueExpiryWorker | Every 5 min | Auto-expires PENDING holds past their auto_expires_at |
| PartitionMaintenanceWorker | Daily at 03:00 UTC | Creates next month's evaluation_log + score_history partitions |
| KeywordListReloadWorker | Every 5 min | Reloads keyword sets into process memory if DB version changed |
| DlrStatsRollupWorker | Every hour | Rolls up per-window DLR stats and purges expired windows |
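For the PartitionMaintenanceWorker, a minimal sketch of the date arithmetic and DDL involved might look like the following. It assumes monthly range partitions named `<table>_YYYY_MM`; neither the naming scheme nor the partition key is stated here, so treat both as hypothetical. The `PARTITION OF ... FOR VALUES FROM ... TO` form is standard PostgreSQL declarative partitioning.

```python
from datetime import date


def next_month_bounds(today):
    """Return (first day of next month, first day of the month after)."""
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def partition_ddl(table, today):
    """Build the CREATE TABLE statement for next month's partition.

    The <table>_YYYY_MM suffix is an assumed naming convention.
    """
    start, end = next_month_bounds(today)
    name = f"{table}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {table} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


statements = [
    partition_ddl(table, date(2026, 4, 18))
    for table in ("evaluation_log", "score_history")
]
```

`IF NOT EXISTS` keeps the daily run idempotent, so repeated executions within the same month are harmless.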

Workers use Redis distributed locks for multi-replica safety (SET NX EX on lock:worker:{name}).
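The locking pattern named above can be sketched as follows. The `set(name, value, nx=True, ex=ttl)` call mirrors the redis-py client signature; `InMemoryRedis` is a hypothetical stand-in so the example runs without a server, and the helper names and the random token scheme are assumptions rather than the service's actual code.

```python
import time
import uuid


class InMemoryRedis:
    """Hypothetical stand-in exposing the few client calls the sketch needs."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def _live(self, name):
        entry = self._data.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[name]  # key has expired
            return None
        return value

    def set(self, name, value, nx=False, ex=None):
        if nx and self._live(name) is not None:
            return None  # NX: refuse to overwrite a live key
        expires_at = self._clock() + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True

    def get(self, name):
        return self._live(name)

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def try_acquire(client, worker_name, ttl_seconds):
    """SET NX EX on lock:worker:{name}; returns a token if this replica won."""
    token = uuid.uuid4().hex
    if client.set(f"lock:worker:{worker_name}", token, nx=True, ex=ttl_seconds):
        return token
    return None


def release(client, worker_name, token):
    """Delete the lock only if this replica still owns it."""
    key = f"lock:worker:{worker_name}"
    if client.get(key) == token:
        client.delete(key)
        return True
    return False
```

The TTL should exceed the worker's expected run time so the lock does not expire mid-run. Against a real Redis server the check-then-delete in `release` is not atomic and is usually done in a Lua script; note also that redis-py returns bytes from `get` unless `decode_responses=True` is set.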


3. Infrastructure Dependencies

| Dependency | Version | Topology |
|---|---|---|
| PostgreSQL | 15+ | Primary + read replica |
| Redis | 7.0+ | Cluster mode; compliance-engine uses DB 3 |
| NATS JetStream | 2.10+ | 3-node cluster; COMPLIANCE_EVENTS and SMS_DLR streams |
| Local LLM | vLLM 0.5+ | Separate deployment with GPU nodes |
| External LLM API (optional) | Claude v1 / OpenAI v1 | External HTTPS |

4. Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| NODE_ENV | Yes | | production / staging / development |
| GRPC_PORT | No | 50052 | gRPC listener port |
| HTTP_PORT | No | 3002 | HTTP listener port |
| DATABASE_URL | Yes | | PostgreSQL connection string |
| REDIS_URL | Yes | | Redis connection string |
| NATS_URL | Yes | | NATS server URL |
| NATS_CREDS_PATH | Yes | | Path to NATS credentials file |
| AI_PROVIDER | No | local | local / claude / openai / mock |
| LOCAL_LLM_URL | If AI_PROVIDER=local | | URL to local LLM OpenAI-compatible endpoint |
| LOCAL_LLM_MODEL | If AI_PROVIDER=local | | Model name to pass in requests |
| AI_FAILOVER_PROVIDER | No | "" (disabled) | claude / openai / "" |
| AI_API_KEY | If external LLM | | External LLM provider API key (from Vault) |
| AI_MODEL | No | | External LLM model (e.g., claude-haiku-4-5-20251001) |
| ANONYMIZE_BODY_BEFORE_AI | No | true | Redact PII before inference |
| AI_TIMEOUT_MS | No | 2000 | LLM call timeout (ms) |
| EVAL_BUDGET_MS | No | 450 | Per-evaluation internal budget (ms) |
| GRPC_TLS_ENABLED | No | true | Set false for local dev |
| TLS_CERT_PATH | If TLS | | Path to server TLS certificate |
| TLS_KEY_PATH | If TLS | | Path to server TLS private key |
| TLS_CA_PATH | If TLS | | Path to CA bundle for mTLS |
| LOG_LEVEL | No | info | debug / info / warn / error |
| HOLD_QUEUE_TTL_HOURS | No | 24 | Auto-expiry duration for held messages (hours) |
| SCORING_INTERVAL_MINUTES | No | 15 | Tenant scoring cycle interval (minutes) |
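The conditional entries in the Required column can be expressed as a start-up check. The sketch below is one possible reading of the table, not the service's actual validation: it assumes "If external LLM" means either AI_PROVIDER or AI_FAILOVER_PROVIDER is claude or openai, and "If TLS" means GRPC_TLS_ENABLED is anything other than false. The function name is hypothetical.

```python
ALWAYS_REQUIRED = (
    "NODE_ENV", "DATABASE_URL", "REDIS_URL", "NATS_URL", "NATS_CREDS_PATH",
)
EXTERNAL_PROVIDERS = {"claude", "openai"}


def missing_variables(env):
    """Return the required variables that are absent or empty in env."""
    required = list(ALWAYS_REQUIRED)

    # Defaults taken from the table above.
    provider = env.get("AI_PROVIDER", "local")
    failover = env.get("AI_FAILOVER_PROVIDER", "")
    tls_enabled = env.get("GRPC_TLS_ENABLED", "true").lower() != "false"

    if provider == "local":
        required += ["LOCAL_LLM_URL", "LOCAL_LLM_MODEL"]
    if provider in EXTERNAL_PROVIDERS or failover in EXTERNAL_PROVIDERS:
        required.append("AI_API_KEY")
    if tls_enabled:
        required += ["TLS_CERT_PATH", "TLS_KEY_PATH", "TLS_CA_PATH"]

    return [name for name in required if not env.get(name)]
```

In a real process this would be called with `dict(os.environ)` at boot, refusing to start when the returned list is non-empty.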

Note: COMPLIANCE_FAILURE_MODE is intentionally removed — the Compliance Layer is always fail-closed. This is architectural, not configurable.
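To illustrate what fail-closed means in practice, the sketch below wraps an evaluation call so that any error or budget overrun yields a blocking verdict rather than letting the message through. The verdict strings, the function name, and the thread-based timeout are illustrative assumptions; only the 450 ms default comes from EVAL_BUDGET_MS.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

ALLOW = "ALLOW"  # hypothetical verdict names
BLOCK = "BLOCK"


def evaluate_fail_closed(evaluate, message, budget_ms=450):
    """Run evaluate(message); anything but a timely result means BLOCK."""
    # No `with` block here: leaving it would wait for a slow call to
    # finish, which would defeat the budget.
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(evaluate, message)
    try:
        return future.result(timeout=budget_ms / 1000)
    except FutureTimeout:
        return BLOCK  # budget exceeded
    except Exception:
        return BLOCK  # evaluator crashed or a dependency failed
    finally:
        pool.shutdown(wait=False)
```

The key property is that there is no code path on which a failure returns `ALLOW`, which is why a configurable failure mode has nothing left to configure.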


5. Deployment Environments

| Environment | compliance-engine replicas | Local LLM | External LLM failover | Notes |
|---|---|---|---|---|
| Production | 3–20 (HPA) | 2 × A10 GPU | Optional, disabled by default | Fail-closed always |
| Staging | 2 | 1 × A10 GPU (shared) | Claude Haiku (disabled by default) | |
| Development | 1 | Ollama (local workstation) or Mock | Mock | No GPU required |
| CI | 1 | Mock | Mock | Deterministic test responses |