Compliance Layer — Application Logic

Status: populated | Last updated: 2026-04-18

1. Use Cases

UC-01: EvaluateCompliance (gRPC handler — async pipeline)

Trigger: sms-orchestrator's NATS consumer calls ComplianceService/EvaluateCompliance for every message after it is dequeued from sms.outbound.request.

Input: MessageContext (messageId, tenantId, accountId, to, senderId, body, messageType, segments, encoding, idempotencyKey, metadata)

Output: EvaluateComplianceResponse (evaluationId, verdict, findings[], ruleSetId, evaluationLatencyMs, holdId?)

SLA: P95 ≤ 500 ms (async pipeline — no tenant waiting on HTTP response)

Note: Since the tenant has already received a 202 response, this call is on the platform's internal async path. The 500 ms SLA is an operational budget for throughput planning, not user-perceived latency.

Steps:

1. Input validation. Reject with INVALID_ARGUMENT if any required field is missing or malformed.
2. Deduplication check. Compute MessageFingerprint = SHA-256(accountId:senderId:to:body). Attempt GET eval:cache:{fingerprint} from Redis (TTL 5 min). If HIT, return cached verdict immediately.
3. Load tenant risk state. Attempt GET tenant:risk:{tenantId} from Redis (TTL 60 s). On MISS, query compliance.tenant_compliance_scores for (tenantId). If no score exists yet (new tenant), treat as CLEAR tier.
4. Load applicable rule sets. Attempt GET ruleset:{tenantId}:{accountId} from Redis (TTL 300 s). On MISS:
   - Load tenant-specific rule set assignments from compliance.tenant_rule_set_assignments
   - Load default rule set (if any is marked is_default = true)
   - Merge into a single ordered rule list (tenant-specific rules take priority over default)
   - Write to Redis cache
5. Evaluate ALLOW rules first (action = ALLOW, ordered by priority ASC):
   - If any ALLOW rule matches the MessageContext, return verdict = ALLOW immediately.
   - ALLOW rules whitelist trusted sender IDs, approved templates, or specific account segments.
6. Auto-HOLD for SUSPENDED tenants. If tenant risk tier = SUSPENDED, skip rule evaluation and proceed directly to step 10 with verdict = HOLD and reason = "tenant_suspended".
7. Evaluate BLOCK and HOLD rules (ordered by priority ASC, BLOCK evaluated before HOLD at same priority):
   - For each active rule, evaluate all conditions against MessageContext (AND logic within a rule)
   - On first BLOCK match: record finding; continue evaluating only FLAG/ALERT rules
   - On first HOLD match (no BLOCK found): record finding; continue evaluating only FLAG/ALERT rules
   - AI_CLASSIFICATION rules: see step 8
   - COMPOSITE rules: resolve child rules recursively (max depth 5; fail-closed on cycle detection)
8. AI classification (only when AI_CLASSIFICATION rules are present):
   - Compute body_hash = SHA-256(body) (post-anonymisation)
   - Attempt GET ai:cache:{body_hash} from Redis (TTL 24 h)
   - On MISS: call local LLM; receive category → confidence map; cache result
   - Compare each category's confidence against rule's minConfidence threshold
   - On LLM unavailable: apply rule's fallbackAction (HOLD is the default and recommended; fail-closed)
9. Evaluate FLAG and ALERT rules. Always evaluated; annotate the findings but do not change the primary verdict.
10. Determine final verdict:
   - BLOCK if any BLOCK rule matched
   - HOLD if no BLOCK but at least one HOLD rule matched (or tenant is SUSPENDED)
   - FLAG if no BLOCK/HOLD but at least one FLAG rule matched
   - ALLOW otherwise
11. Side-effects (synchronous, part of gRPC response):
   - Write evaluation_log row to PostgreSQL
   - If verdict = HOLD: insert hold_queue row; populate holdId in response
   - Update evaluation result cache: SET eval:cache:{fingerprint} EX 300
12. Side-effects (asynchronous, fire-and-forget):
   - Publish compliance.audit.v1 event
   - If verdict = HOLD: publish compliance.message.held event
   - If verdict = BLOCK: publish compliance.message.blocked event
   - Increment Prometheus counters
13. Return EvaluateComplianceResponse to sms-orchestrator.
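
The verdict precedence in the "Determine final verdict" step can be sketched as a pure function. This is a minimal sketch, not the service's implementation: the `Finding` shape, the `resolveVerdict` name, and the `tenantSuspended` flag are assumptions introduced for illustration. The ALLOW short-circuit happens earlier in the pipeline and is not modelled here.

```typescript
type RuleAction = "ALLOW" | "BLOCK" | "HOLD" | "FLAG" | "ALERT";
type Verdict = "ALLOW" | "FLAG" | "HOLD" | "BLOCK";

// Hypothetical finding shape: one entry per rule that matched.
interface Finding {
  ruleId: string;
  action: RuleAction;
}

// Applies the documented precedence: BLOCK > HOLD (or suspended tenant) > FLAG > ALLOW.
// ALERT findings annotate the result but never change the verdict.
function resolveVerdict(findings: Finding[], tenantSuspended: boolean): Verdict {
  const matched = new Set<RuleAction>(findings.map((f) => f.action));
  if (matched.has("BLOCK")) return "BLOCK";
  if (matched.has("HOLD") || tenantSuspended) return "HOLD";
  if (matched.has("FLAG")) return "FLAG";
  return "ALLOW";
}
```

For example, a message that matched both a HOLD rule and a BLOCK rule resolves to BLOCK, while a message with only ALERT findings resolves to ALLOW.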

Error codes:

| gRPC status | Condition |
|---|---|
| `INVALID_ARGUMENT` | Missing required field; malformed UUID or phone number |
| `INTERNAL` | Unhandled internal error (logged, not exposed). The NATS consumer treats this as a transient failure and retries. |

Fail-closed behaviour: On INTERNAL or gRPC deadline exceeded, the sms-orchestrator NATS consumer does not ACK the message. NATS JetStream redelivers after the ack wait (30 s). After 3 delivery attempts, the message moves to sms.outbound.deadletter with reason compliance_unavailable. The message is never dispatched to a carrier.
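
The consumer-side decision described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the type and function names are hypothetical, the delivery attempt counter is assumed to be 1-based, and the third failed attempt is assumed to be the one that triggers dead-lettering. It models only the outcomes the text names and makes no calls to a NATS client.

```typescript
const MAX_DELIVERY_ATTEMPTS = 3; // from the text: dead-letter after 3 delivery attempts
const ACK_WAIT_SECONDS = 30;     // from the text: JetStream ack wait

// Only the outcomes named in the text are modelled here.
type CallOutcome = "VERDICT_RECEIVED" | "INTERNAL" | "DEADLINE_EXCEEDED";

type ConsumerAction =
  | { kind: "ACK_AND_CONTINUE" }
  | { kind: "AWAIT_REDELIVERY"; afterSeconds: number }
  | { kind: "DEADLETTER"; subject: string; reason: string };

// deliveryAttempt is assumed to be 1 for the first delivery.
function decideConsumerAction(outcome: CallOutcome, deliveryAttempt: number): ConsumerAction {
  if (outcome === "VERDICT_RECEIVED") {
    return { kind: "ACK_AND_CONTINUE" };
  }
  if (deliveryAttempt >= MAX_DELIVERY_ATTEMPTS) {
    // Fail-closed: the message is parked, never dispatched to a carrier.
    return { kind: "DEADLETTER", subject: "sms.outbound.deadletter", reason: "compliance_unavailable" };
  }
  // Leave the message un-ACKed so JetStream redelivers after the ack wait.
  return { kind: "AWAIT_REDELIVERY", afterSeconds: ACK_WAIT_SECONDS };
}
```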

UC-02: Orchestrator Consumer — Message State Handler

Trigger: sms-orchestrator NATS consumer, after receiving a compliance verdict.

Steps:

| Verdict | Orchestrator action | sms_messages status | Tenant notification |
|---|---|---|---|
| `ALLOW` | Continue to routing-engine | `ROUTING` | none (success path) |
| `FLAG` | Continue to routing-engine with annotation | `ROUTING` (with `flagged: true`) | none |
| `BLOCK` | Do not route; mark terminal | `BLOCKED` | "Message blocked" alert via web portal |
| `HOLD` | Do not route; await review | `ON_HOLD` (with `holdId`) | "Message held for review" alert via web portal |

All state transitions write sms.events.status events for the audit trail.
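
The verdict handling above lends itself to a lookup table, so that adding a verdict forces every column to be filled in. A minimal sketch; `VerdictHandling`, `VERDICT_HANDLING`, and `handleVerdict` are hypothetical names, and the field layout is an assumption rather than the orchestrator's actual model.

```typescript
type Verdict = "ALLOW" | "FLAG" | "BLOCK" | "HOLD";
type MessageStatus = "ROUTING" | "BLOCKED" | "ON_HOLD";

// Hypothetical shape mirroring the columns of the verdict table.
interface VerdictHandling {
  route: boolean;             // continue to routing-engine?
  status: MessageStatus;      // resulting sms_messages status
  flagged: boolean;           // annotation carried on FLAG
  tenantAlert: string | null; // web portal alert text, if any
}

// Record<Verdict, ...> makes the compiler reject a missing verdict.
const VERDICT_HANDLING: Record<Verdict, VerdictHandling> = {
  ALLOW: { route: true, status: "ROUTING", flagged: false, tenantAlert: null },
  FLAG: { route: true, status: "ROUTING", flagged: true, tenantAlert: null },
  BLOCK: { route: false, status: "BLOCKED", flagged: false, tenantAlert: "Message blocked" },
  HOLD: { route: false, status: "ON_HOLD", flagged: false, tenantAlert: "Message held for review" },
};

function handleVerdict(verdict: Verdict): VerdictHandling {
  return VERDICT_HANDLING[verdict];
}
```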

UC-03: ReviewHeldMessage (Admin REST API)

Trigger: Platform admin or auditor invokes POST /compliance/hold-queue/{holdId}/review

Authorization: Role platform.compliance.reviewer or platform.compliance.admin

Input: { action: 'RELEASE' | 'REJECT', notes: string }

Steps:

Load HeldMessage from compliance.hold_queue by holdId. Return 404 if not found.
Verify status = PENDING or REVIEWING. Return 409 Conflict if already reviewed/expired.
Update status to REVIEWING (optimistic lock).
If action = RELEASE:
- Update sms_messages status: ON_HOLD → ROUTING (with complianceOverride: true, releasedBy, releasedAt)
- Publish sms.outbound.retry NATS event. sms-orchestrator re-consumes the message and proceeds directly to routing; compliance re-evaluation is skipped on release unless FORCE_RECHECK_ON_RELEASE=true.
- Update hold_queue status → REVIEWED_RELEASED
- Publish compliance.message.released NATS event
If action = REJECT:
- Update sms_messages status: ON_HOLD → BLOCKED (terminal)
- Update hold_queue status → REVIEWED_REJECTED
- Publish compliance.message.rejected NATS event
Write compliance.audit_log entry with before/after state, actor, IP, timestamp.
Decrement hold:queue:{tenantId}:size Redis counter.
notification-service consumes the published event and pushes an alert to the tenant's web portal.

Release pathway rationale: Once a human reviewer has approved release, re-evaluating compliance would create a loop (same rules would hold the message again). The reviewer's decision is the authoritative override, recorded in the audit log.
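
The status checks and transitions in the review steps can be sketched as a pure planning function that decides what to write and publish before any side effect runs. This is a sketch under assumptions: `planReview` and `ReviewPlan` are hypothetical names, and `EXPIRED` is an assumed name for the expired hold status, which the text mentions but does not spell out.

```typescript
// "EXPIRED" is an assumed status name; the text only says "expired".
type HoldStatus = "PENDING" | "REVIEWING" | "REVIEWED_RELEASED" | "REVIEWED_REJECTED" | "EXPIRED";
type ReviewAction = "RELEASE" | "REJECT";

type ReviewPlan =
  | { ok: false; httpStatus: 409 }
  | { ok: true; holdStatus: HoldStatus; messageStatus: "ROUTING" | "BLOCKED"; events: string[] };

function planReview(current: HoldStatus, action: ReviewAction): ReviewPlan {
  // Only PENDING or REVIEWING holds may be reviewed; anything else is a conflict.
  if (current !== "PENDING" && current !== "REVIEWING") {
    return { ok: false, httpStatus: 409 };
  }
  if (action === "RELEASE") {
    return {
      ok: true,
      holdStatus: "REVIEWED_RELEASED",
      messageStatus: "ROUTING", // ON_HOLD -> ROUTING
      events: ["sms.outbound.retry", "compliance.message.released"],
    };
  }
  return {
    ok: true,
    holdStatus: "REVIEWED_REJECTED",
    messageStatus: "BLOCKED", // ON_HOLD -> BLOCKED (terminal)
    events: ["compliance.message.rejected"],
  };
}
```

Keeping the decision separate from the writes makes the 409 path and both transitions easy to unit-test without a database or NATS connection.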

UC-04: ManageComplianceRule (Admin REST CRUD)

Authorization: Role platform.compliance.admin

Create rule:

Validate rule schema (type-specific config validation, regex compilation check, composite cycle check).
Insert into compliance.rules; insert initial row into compliance.rule_versions.
Invalidate Redis rule set caches for affected tenants.
Publish compliance.rule.changed to NATS.
Write audit log.

Update rule: Increment version; insert new rule_versions snapshot; invalidate caches; publish event; write audit log.

Enable / Disable rule: Set is_active; invalidate caches; publish event; write audit log.

Delete rule: Soft-delete (is_active = false, deleted_at set). Rules referenced by COMPOSITE rules cannot be deleted.
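
The composite cycle check performed during rule validation can be sketched as a depth-first walk that tracks the current path. This is a minimal sketch: the `children` map (composite rule ID to child rule IDs), the function name, and the convention that the root rule sits at depth 1 are assumptions; only the depth limit of 5 comes from the evaluation steps.

```typescript
const MAX_COMPOSITE_DEPTH = 5; // from the text: max depth 5

type CompositeCheck = { ok: true } | { ok: false; reason: "cycle" | "max_depth_exceeded" };

// children maps a composite rule ID to its child rule IDs; leaf rules have no entry.
function checkComposite(rootId: string, children: Map<string, string[]>): CompositeCheck {
  const path = new Set<string>(); // rule IDs on the current root-to-node path

  const visit = (id: string, depth: number): CompositeCheck => {
    if (path.has(id)) return { ok: false, reason: "cycle" };
    if (depth > MAX_COMPOSITE_DEPTH) return { ok: false, reason: "max_depth_exceeded" };
    path.add(id);
    for (const child of children.get(id) ?? []) {
      const result = visit(child, depth + 1);
      if (!result.ok) return result;
    }
    path.delete(id); // a rule reused in a sibling branch is not a cycle
    return { ok: true };
  };

  return visit(rootId, 1);
}
```

Tracking the path rather than a global visited set means a rule shared by two branches is accepted, while a rule that reaches itself is rejected.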

UC-05: RecalculateTenantScores (Background Worker — every 15 min)

Trigger: Internal cron (@Cron('*/15 * * * *')) with distributed Redis lock for multi-replica safety.

For each tenant with activity in the past 7 days:

Aggregate metrics from evaluation_log, dlr_stats, and operator feedback tables.

Compute score dimensions:

contentScore   = max(0, 25 × (1 - violations_7d / max(messages_sent_7d, 1)))
volumeScore    = max(0, 20 × (1 - rate_limit_violations_7d / max(messages_sent_7d, 1)))
dlrScore       = 20 × dlr_success_rate
optoutScore    = 15 × (1 - min(optout_rate / 0.05, 1))
complaintScore = 10 × (1 - min(complaint_rate / 0.01, 1))
tenureScore    = min(10, account_age_days / 90)
overallScore   = sum of all dimensions

Determine risk tier from overallScore: 80–100 CLEAR · 60–79 MONITOR · 30–59 RESTRICTED · 0–29 SUSPENDED.
Detect tier transitions. On tier change: publish compliance.tenant.tier.changed; notify tenant admin via notification-service; if transition to SUSPENDED, publish compliance.tenant.suspended.
Persist: UPSERT tenant_compliance_scores; INSERT score_history; SET tenant:risk:{tenantId} in Redis EX 900.
Emit metric: compliance_tenant_score{tenant_id} gauge.
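
The score dimensions and tier bands above translate directly into code. The sketch below implements the formulas as written. The `TenantMetrics` field names are hypothetical, rates are assumed to be fractions in the range 0 to 1, and because the tier bands are stated for whole numbers, fractional scores are assumed to be classified by each band's lower bound.

```typescript
// Hypothetical input shape; field names are illustrative.
interface TenantMetrics {
  violations7d: number;
  rateLimitViolations7d: number;
  messagesSent7d: number;
  dlrSuccessRate: number; // fraction, 0..1
  optoutRate: number;     // fraction, 0..1
  complaintRate: number;  // fraction, 0..1
  accountAgeDays: number;
}

type RiskTier = "CLEAR" | "MONITOR" | "RESTRICTED" | "SUSPENDED";

function computeOverallScore(m: TenantMetrics): number {
  const sent = Math.max(m.messagesSent7d, 1); // avoid division by zero
  const contentScore = Math.max(0, 25 * (1 - m.violations7d / sent));
  const volumeScore = Math.max(0, 20 * (1 - m.rateLimitViolations7d / sent));
  const dlrScore = 20 * m.dlrSuccessRate;
  const optoutScore = 15 * (1 - Math.min(m.optoutRate / 0.05, 1));
  const complaintScore = 10 * (1 - Math.min(m.complaintRate / 0.01, 1));
  // Implemented exactly as written: reaches the cap of 10 at 900 days.
  const tenureScore = Math.min(10, m.accountAgeDays / 90);
  return contentScore + volumeScore + dlrScore + optoutScore + complaintScore + tenureScore;
}

// Assumption: a fractional score such as 79.5 falls into the lower band (MONITOR).
function tierFor(overallScore: number): RiskTier {
  if (overallScore >= 80) return "CLEAR";
  if (overallScore >= 60) return "MONITOR";
  if (overallScore >= 30) return "RESTRICTED";
  return "SUSPENDED";
}
```

As a worked example, a tenant with 500 violations in 1,000 messages, no rate-limit violations, a 50% DLR success rate, a 5% opt-out rate, a 1% complaint rate, and a 90-day-old account scores 12.5 + 20 + 10 + 0 + 0 + 1 = 43.5, which falls in the RESTRICTED tier.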

UC-06: ConsumeDeliveryReceiptEvent (NATS Consumer)

Subject: sms.dlr.inbound · Consumer group: compliance-engine-dlr

Steps:

Deserialize DlrEvent.
Map SMPP status codes to canonical DlrStatus.
UPSERT compliance.dlr_stats for (tenantId, accountId) across three windows (1h, 24h, 7d).
ACK NATS message on successful DB write.
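
The status-mapping step could look like the sketch below. The keys are the abbreviated `stat` values that SMPP v3.4 delivery receipts carry. The canonical `DlrStatus` values are an assumption, since this document does not define the enum; they simply mirror the SMPP message state names. Unrecognised values fall back to UNKNOWN.

```typescript
// Assumed canonical enum; the actual DlrStatus values are not defined in this document.
type DlrStatus =
  | "DELIVERED"
  | "EXPIRED"
  | "DELETED"
  | "UNDELIVERABLE"
  | "ACCEPTED"
  | "REJECTED"
  | "ENROUTE"
  | "UNKNOWN";

// SMPP delivery receipt "stat" abbreviations mapped to the assumed canonical names.
const SMPP_STAT_TO_DLR = new Map<string, DlrStatus>([
  ["DELIVRD", "DELIVERED"],
  ["EXPIRED", "EXPIRED"],
  ["DELETED", "DELETED"],
  ["UNDELIV", "UNDELIVERABLE"],
  ["ACCEPTD", "ACCEPTED"],
  ["REJECTD", "REJECTED"],
  ["ENROUTE", "ENROUTE"],
  ["UNKNOWN", "UNKNOWN"],
]);

// Normalises case and whitespace before lookup; unmapped values become UNKNOWN.
function toDlrStatus(smppStat: string): DlrStatus {
  return SMPP_STAT_TO_DLR.get(smppStat.trim().toUpperCase()) ?? "UNKNOWN";
}
```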

UC-07: GenerateComplianceReport (On-demand + scheduled)

Trigger: POST /compliance/reports or daily cron at 02:00 UTC

Report types:

TENANT_RANKING — tenants sorted by compliance score with drill-down links
VIOLATION_SUMMARY — violations by rule type, category, and trend (7/30/90 day)
HOLD_QUEUE_SUMMARY — pending / released / rejected / expired counts
TIER_TRANSITIONS — tenants that changed tier in the period
TOP_TRIGGERED_RULES — rules ranked by match count
TENANT_AUDIT — full compliance evidence for a single tenant (regulatory export)

Output formats: JSON (API), CSV (export), PDF (future).

2. Rule Evaluation Performance Optimisation

Fast-path checks (no external calls)

Ordered first to enable early termination:

GEO_RESTRICTION — country code from E.164 prefix
TEMPORAL — current time in rule timezone
SENDER_ID simple string checks
KEYWORD exact-match (pre-loaded keyword sets in process memory)

Medium-path checks (Redis only)

RATE_VOLUME — Redis sliding window counters
DLR_ABUSE — Redis projection of compliance.dlr_stats

Slow-path checks (DB or external API)

REGEX (re2 engine, linear time)
AI_CLASSIFICATION — local LLM call (cache-checked first)
COMPOSITE — recursive child rule evaluation
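
One way to apply the three cost tiers above is a stable sort on a per-type tier number. This is a sketch only: the `Rule` shape and `orderByCost` are hypothetical, and the document does not say how tier ordering interacts with rule priority, so sorting by tier first and priority second is an assumption.

```typescript
type RuleType =
  | "GEO_RESTRICTION" | "TEMPORAL" | "SENDER_ID" | "KEYWORD" // fast path
  | "RATE_VOLUME" | "DLR_ABUSE"                              // medium path (Redis only)
  | "REGEX" | "AI_CLASSIFICATION" | "COMPOSITE";             // slow path

// 0 = fast, 1 = medium, 2 = slow, following the three lists above.
const COST_TIER: Record<RuleType, 0 | 1 | 2> = {
  GEO_RESTRICTION: 0,
  TEMPORAL: 0,
  SENDER_ID: 0,
  KEYWORD: 0,
  RATE_VOLUME: 1,
  DLR_ABUSE: 1,
  REGEX: 2,
  AI_CLASSIFICATION: 2,
  COMPOSITE: 2,
};

// Hypothetical minimal rule shape.
interface Rule {
  id: string;
  type: RuleType;
  priority: number; // lower value evaluates first
}

// Assumption: cheaper tiers run first; priority breaks ties within a tier.
function orderByCost(rules: Rule[]): Rule[] {
  return [...rules].sort(
    (a, b) => COST_TIER[a.type] - COST_TIER[b.type] || a.priority - b.priority,
  );
}
```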

Budget enforcement

Each EvaluateCompliance call has a 450 ms internal budget (leaving 50 ms margin for gRPC + response serialisation):

If budget is exhausted mid-evaluation, remaining slow-path rules are skipped with a FLAG finding (evidence: "skipped_budget_exceeded")
Fail-closed applies to budget exhaustion for HOLD-eligible rules: if a HOLD-configured AI rule cannot run due to budget and its fallbackAction is HOLD, the verdict becomes HOLD
Budget violations emit compliance_evaluation_budget_exceeded_total metric
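
The skip-and-fail-closed behaviour on budget exhaustion can be sketched as a function over the rules that have not yet run. The names `PendingRule`, `BudgetOutcome`, and `skipRemainingOnBudget` are hypothetical, and `elapsedMs` is passed in rather than read from a clock so the logic stays deterministic. Metric emission is omitted.

```typescript
const EVALUATION_BUDGET_MS = 450; // from the text: 450 ms internal budget

// Hypothetical shape for a slow-path rule that has not run yet.
interface PendingRule {
  id: string;
  type: "REGEX" | "AI_CLASSIFICATION" | "COMPOSITE";
  action: "BLOCK" | "HOLD" | "FLAG" | "ALERT";
  fallbackAction?: string; // only "HOLD" is interpreted here
}

interface SkipFinding {
  ruleId: string;
  action: "FLAG";
  evidence: "skipped_budget_exceeded";
}

interface BudgetOutcome {
  findings: SkipFinding[];
  forceHold: boolean; // true when fail-closed applies
}

function skipRemainingOnBudget(elapsedMs: number, remaining: PendingRule[]): BudgetOutcome {
  if (elapsedMs < EVALUATION_BUDGET_MS) {
    return { findings: [], forceHold: false }; // budget not exhausted, nothing skipped
  }
  // Every skipped rule leaves a FLAG finding as evidence.
  const findings = remaining.map((r): SkipFinding => ({
    ruleId: r.id,
    action: "FLAG",
    evidence: "skipped_budget_exceeded",
  }));
  // Fail-closed: a HOLD-configured AI rule with fallbackAction HOLD forces HOLD.
  const forceHold = remaining.some(
    (r) => r.type === "AI_CLASSIFICATION" && r.action === "HOLD" && r.fallbackAction === "HOLD",
  );
  return { findings, forceHold };
}
```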

3. Hold Queue Priority Scoring

When a message is placed in the hold queue, its review_priority is computed so auditors see the most important items first:

priority = (tenant_risk_score_inverse × 40)    // lower tenant score → higher urgency
         + (rule_severity_weight × 35)          // BLOCK-triggering rules → higher urgency
         + (volume_spike_indicator × 15)        // spike → higher urgency
         + (recency_decay × 10)                 // newer → higher urgency

Rule severity weights:

TERRORISM, PHISHING → weight 10
SPAM, FINANCIAL_FRAUD → weight 8
ADULT_CONTENT, GAMBLING → weight 6
other rule types → weight 4
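
A worked sketch of the priority formula follows. The document does not define how each factor is normalised, so this sketch assumes every factor is scaled to the range 0 to 1 before weighting, which yields a priority between 0 and 100: `tenant_risk_score_inverse` is taken as 1 - tenantScore / 100, the severity weight is divided by 10, and `volumeSpike` and `recencyDecay` are supplied already normalised. All names are hypothetical.

```typescript
// Severity weights from the list above; anything else uses the default of 4.
const SEVERITY_WEIGHTS = new Map<string, number>([
  ["TERRORISM", 10],
  ["PHISHING", 10],
  ["SPAM", 8],
  ["FINANCIAL_FRAUD", 8],
  ["ADULT_CONTENT", 6],
  ["GAMBLING", 6],
]);
const DEFAULT_SEVERITY_WEIGHT = 4;

// Hypothetical inputs; the normalisation ranges are assumptions.
interface PriorityInputs {
  tenantScore: number;  // overall compliance score, 0..100
  category: string;     // category of the rule that triggered the hold
  volumeSpike: number;  // assumed 0..1
  recencyDecay: number; // assumed 0..1, 1 = just held
}

function reviewPriority(input: PriorityInputs): number {
  const tenantRiskInverse = 1 - input.tenantScore / 100; // lower score, higher urgency
  const severity = (SEVERITY_WEIGHTS.get(input.category) ?? DEFAULT_SEVERITY_WEIGHT) / 10;
  return tenantRiskInverse * 40 + severity * 35 + input.volumeSpike * 15 + input.recencyDecay * 10;
}
```

Under these assumptions, a fresh PHISHING hold for a tenant scoring 20 during a full volume spike yields 0.8 × 40 + 1.0 × 35 + 15 + 10 = 92.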
