Compliance Layer — Deployment Topology

Status: populated | Last updated: 2026-04-18

1. Kubernetes Resources

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: compliance-engine
  namespace: sms-platform
spec:
  replicas: 3
  selector:
    matchLabels:
      app: compliance-engine
  template:
    metadata:
      labels:
        app: compliance-engine
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3002"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: compliance-engine
          image: ghcr.io/ghasi/compliance-engine:latest
          ports:
            - containerPort: 50052   # gRPC
              name: grpc
            - containerPort: 3002    # HTTP (metrics, health, REST)
              name: http
          env:
            - name: NODE_ENV
              value: production
            - name: LOG_LEVEL
              value: info
            - name: GRPC_PORT
              value: "50052"
            - name: HTTP_PORT
              value: "3002"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: compliance-engine-db-secret
                  key: url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: compliance-engine-redis-secret
                  key: url
            - name: NATS_URL
              valueFrom:
                secretKeyRef:
                  name: nats-credentials
                  key: url
            # AI provider — local LLM primary
            - name: AI_PROVIDER
              value: local
            - name: LOCAL_LLM_URL
              value: http://local-llm-service.sms-platform.svc.cluster.local:8000
            - name: LOCAL_LLM_MODEL
              value: llama-3.1-8b-instruct-awq
            - name: ANONYMIZE_BODY_BEFORE_AI
              value: "true"
            # External LLM failover (optional)
            - name: AI_FAILOVER_PROVIDER
              value: ""   # set to 'claude' or 'openai' to enable failover
            # Budget / timing
            - name: EVAL_BUDGET_MS
              value: "450"
            - name: AI_TIMEOUT_MS
              value: "2000"
            - name: HOLD_QUEUE_TTL_HOURS
              value: "24"
            - name: SCORING_INTERVAL_MINUTES
              value: "15"
          envFrom:
            - secretRef:
                name: compliance-engine-vault-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 2000m
              memory: 1Gi
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: tls-certs
              mountPath: /etc/tls
              readOnly: true
      volumes:
        - name: tls-certs
          secret:
            secretName: compliance-engine-tls

Local LLM Deployment (separate, GPU-backed)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: local-llm
  namespace: sms-platform
spec:
  replicas: 2
  selector:
    matchLabels:
      app: local-llm
  template:
    metadata:
      labels:
        app: local-llm
    spec:
      nodeSelector:
        gpu: "true"
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args:
            - "--model=casperhansen/llama-3.1-8b-instruct-awq"
            - "--quantization=awq"
            - "--max-model-len=4096"
            - "--gpu-memory-utilization=0.85"
          ports:
            - containerPort: 8000
              name: http
          resources:
            requests:
              cpu: 4
              memory: 16Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 8
              memory: 24Gi
              nvidia.com/gpu: 1
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 120  # model load time
            periodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  name: local-llm-service
  namespace: sms-platform
spec:
  selector:
    app: local-llm
  ports:
    - port: 8000
      targetPort: http
  type: ClusterIP

Horizontal Pod Autoscaler (compliance-engine)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: compliance-engine-hpa
  namespace: sms-platform
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: compliance-engine
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
    - type: Pods
      pods:
        metric:
          name: compliance_evaluation_duration_seconds_p95
        target:
          type: AverageValue
          averageValue: "0.4"    # scale up if P95 approaches 400 ms

PodDisruptionBudget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: compliance-engine-pdb
  namespace: sms-platform
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: compliance-engine
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: local-llm-pdb
  namespace: sms-platform
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: local-llm

Services

apiVersion: v1
kind: Service
metadata:
  name: compliance-engine-grpc
  namespace: sms-platform
spec:
  selector:
    app: compliance-engine
  ports:
    - name: grpc
      port: 50052
      targetPort: grpc
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  name: compliance-engine-http
  namespace: sms-platform
spec:
  selector:
    app: compliance-engine
  ports:
    - name: http
      port: 3002
      targetPort: http
  type: ClusterIP

NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: compliance-engine-netpol
  namespace: sms-platform
spec:
  podSelector:
    matchLabels:
      app: compliance-engine
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: sms-orchestrator
      ports:
        - port: 50052
    - from:
        - podSelector:
            matchLabels:
              app: admin-dashboard
      ports:
        - port: 3002
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - port: 3002
  egress:
    - to:
        - podSelector:
            matchLabels: { app: postgresql }
      ports: [{ port: 5432 }]
    - to:
        - podSelector:
            matchLabels: { app: redis }
      ports: [{ port: 6379 }]
    - to:
        - podSelector:
            matchLabels: { app: nats }
      ports: [{ port: 4222 }]
    - to:
        - podSelector:
            matchLabels: { app: local-llm }
      ports: [{ port: 8000 }]
    # External LLM egress (only when AI_FAILOVER_PROVIDER is set)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except: [10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16]
      ports: [{ port: 443 }]

2. Background Workers

Worker	Schedule	Description
`TenantScoringWorker`	Every 15 min	Recalculates compliance scores for all active tenants
`HoldQueueExpiryWorker`	Every 5 min	Auto-expires PENDING holds past their `auto_expires_at`
`PartitionMaintenanceWorker`	Daily at 03:00 UTC	Creates next month's evaluation_log + score_history partitions
`KeywordListReloadWorker`	Every 5 min	Reloads keyword sets into process memory if DB version changed
`DlrStatsRollupWorker`	Every hour	Rolls up per-window DLR stats and purges expired windows

Workers use Redis distributed locks for multi-replica safety (SET NX EX on lock:worker:{name}).

3. Infrastructure Dependencies

Dependency	Version	Topology
PostgreSQL	15+	Primary + read replica
Redis	7.0+	Cluster mode; compliance-engine uses DB 3
NATS JetStream	2.10+	3-node cluster; `COMPLIANCE_EVENTS` and `SMS_DLR` streams
Local LLM	vLLM 0.5+	Separate deployment with GPU nodes
External LLM API (optional)	Claude v1 / OpenAI v1	External HTTPS

4. Environment Variables

Variable	Required	Default	Description
`NODE_ENV`	Yes	—	`production` / `staging` / `development`
`GRPC_PORT`	No	`50052`	gRPC listener port
`HTTP_PORT`	No	`3002`	HTTP listener port
`DATABASE_URL`	Yes	—	PostgreSQL connection string
`REDIS_URL`	Yes	—	Redis connection string
`NATS_URL`	Yes	—	NATS server URL
`NATS_CREDS_PATH`	Yes	—	Path to NATS credentials file
`AI_PROVIDER`	No	`local`	`local` / `claude` / `openai` / `mock`
`LOCAL_LLM_URL`	If AI_PROVIDER=local	—	URL to local LLM OpenAI-compatible endpoint
`LOCAL_LLM_MODEL`	If AI_PROVIDER=local	—	Model name to pass in requests
`AI_FAILOVER_PROVIDER`	No	`""` (disabled)	`claude` / `openai` / `""`
`AI_API_KEY`	If external LLM	—	External LLM provider API key (from Vault)
`AI_MODEL`	No	—	External LLM model (e.g., `claude-haiku-4-5-20251001`)
`ANONYMIZE_BODY_BEFORE_AI`	No	`true`	Redact PII before inference
`AI_TIMEOUT_MS`	No	`2000`	LLM call timeout
`EVAL_BUDGET_MS`	No	`450`	Per-evaluation internal budget
`GRPC_TLS_ENABLED`	No	`true`	Set `false` for local dev
`TLS_CERT_PATH`	If TLS	—	Path to server TLS certificate
`TLS_KEY_PATH`	If TLS	—	Path to server TLS private key
`TLS_CA_PATH`	If TLS	—	Path to CA bundle for mTLS
`LOG_LEVEL`	No	`info`	`debug` / `info` / `warn` / `error`
`HOLD_QUEUE_TTL_HOURS`	No	`24`	Auto-expiry duration for held messages
`SCORING_INTERVAL_MINUTES`	No	`15`	Tenant scoring cycle interval

Note: COMPLIANCE_FAILURE_MODE is intentionally removed — the Compliance Layer is always fail-closed. This is architectural, not configurable.

5. Deployment Environments

Environment	compliance-engine replicas	Local LLM	External LLM failover	Notes
Production	3–20 (HPA)	2 × A10 GPU	Optional, disabled by default	Fail-closed always
Staging	2	1 × A10 GPU (shared)	Claude Haiku (disabled by default)
Development	1	Ollama (local workstation) or Mock	Mock	No GPU required
CI	1	Mock	Mock	Deterministic test responses

1. Kubernetes Resources​

Deployment​

Local LLM Deployment (separate, GPU-backed)​

Horizontal Pod Autoscaler (compliance-engine)​

PodDisruptionBudget​

Services​

NetworkPolicy​

2. Background Workers​

3. Infrastructure Dependencies​

4. Environment Variables​

5. Deployment Environments​

1. Kubernetes Resources

Deployment

Local LLM Deployment (separate, GPU-backed)

Horizontal Pod Autoscaler (compliance-engine)

PodDisruptionBudget

Services

NetworkPolicy

2. Background Workers

3. Infrastructure Dependencies

4. Environment Variables

5. Deployment Environments