Compliance Layer — Application Logic
Status: populated | Last updated: 2026-04-18
1. Use Cases
UC-01: EvaluateCompliance (gRPC handler — async pipeline)
Trigger: sms-orchestrator's NATS consumer calls ComplianceService/EvaluateCompliance for every message after it is dequeued from sms.outbound.request.
Input: MessageContext (messageId, tenantId, accountId, to, senderId, body, messageType, segments, encoding, idempotencyKey, metadata)
Output: EvaluateComplianceResponse (evaluationId, verdict, findings[], ruleSetId, evaluationLatencyMs, holdId?)
SLA: P95 ≤ 500 ms (async pipeline — no tenant waiting on HTTP response)
Note: Since the tenant has already received a 202 response, this call is on the platform's internal async path. The 500 ms SLA is an operational budget for throughput planning, not user-perceived latency.
Steps:
1. Input validation. Reject with `INVALID_ARGUMENT` if any required field is missing or malformed.
2. Deduplication check. Compute `MessageFingerprint` = SHA-256(accountId:senderId:to:body). Attempt `GET eval:cache:{fingerprint}` from Redis (TTL 5 min). If HIT, return cached verdict immediately.
3. Load tenant risk state. Attempt `GET tenant:risk:{tenantId}` from Redis (TTL 60 s). On MISS, query `compliance.tenant_compliance_scores` for `(tenantId)`. If no score exists yet (new tenant), treat as CLEAR tier.
4. Load applicable rule sets. Attempt `GET ruleset:{tenantId}:{accountId}` from Redis (TTL 300 s). On MISS:
   - Load tenant-specific rule set assignments from `compliance.tenant_rule_set_assignments`
   - Load default rule set (if any is marked `is_default = true`)
   - Merge into a single ordered rule list (tenant-specific rules take priority over default)
   - Write to Redis cache
5. Evaluate ALLOW rules first (action = ALLOW, ordered by priority ASC):
   - If any ALLOW rule matches the `MessageContext`, return verdict = ALLOW immediately.
   - ALLOW rules whitelist trusted sender IDs, approved templates, or specific account segments.
6. Auto-HOLD for SUSPENDED tenants. If tenant risk tier = SUSPENDED, skip rule evaluation and proceed directly to step 10 with verdict = HOLD and reason = "tenant_suspended".
7. Evaluate BLOCK and HOLD rules (ordered by priority ASC, BLOCK evaluated before HOLD at same priority):
   - For each active rule, evaluate all conditions against `MessageContext` (AND logic within a rule)
   - On first BLOCK match: record finding; continue evaluating only FLAG/ALERT rules
   - On first HOLD match (no BLOCK found): record finding; continue evaluating only FLAG/ALERT rules
   - AI_CLASSIFICATION rules: see step 8
   - COMPOSITE rules: resolve child rules recursively (max depth 5; fail-closed on cycle detection)
8. AI classification (only when AI_CLASSIFICATION rules are present):
   - Compute `body_hash` = SHA-256(body) (post-anonymisation)
   - Attempt `GET ai:cache:{body_hash}` from Redis (TTL 24 h)
   - On MISS: call local LLM; receive category → confidence map; cache result
   - Compare each category's confidence against rule's `minConfidence` threshold
   - On LLM unavailable: apply rule's `fallbackAction` (HOLD is the default and recommended — fail-closed)
9. Evaluate FLAG and ALERT rules. Always evaluated; annotate the findings but do not change the primary verdict.
10. Determine final verdict:
    - BLOCK if any BLOCK rule matched
    - HOLD if no BLOCK but at least one HOLD rule matched (or tenant is SUSPENDED)
    - FLAG if no BLOCK/HOLD but at least one FLAG rule matched
    - ALLOW otherwise
11. Side-effects (synchronous, part of gRPC response):
    - Write `evaluation_log` row to PostgreSQL
    - If verdict = HOLD: insert `hold_queue` row; populate `holdId` in response
    - Update evaluation result cache: `SET eval:cache:{fingerprint}` EX 300
12. Side-effects (asynchronous, fire-and-forget):
    - Publish `compliance.audit.v1` event
    - If verdict = HOLD: publish `compliance.message.held` event
    - If verdict = BLOCK: publish `compliance.message.blocked` event
    - Increment Prometheus counters
13. Return `EvaluateComplianceResponse` to sms-orchestrator.
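The fingerprint from the deduplication step and the verdict precedence from the final-verdict step can be sketched as two pure functions. This is a minimal illustration, not the service's actual code: the `Finding` shape, the function names, and the hex encoding of the digest are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

type Verdict = "ALLOW" | "FLAG" | "HOLD" | "BLOCK";
type RuleAction = "BLOCK" | "HOLD" | "FLAG" | "ALERT";

// Hypothetical finding shape; the real findings[] entries carry more fields.
interface Finding {
  ruleId: string;
  action: RuleAction;
}

// SHA-256 over "accountId:senderId:to:body". Hex output is an assumption.
function messageFingerprint(
  accountId: string,
  senderId: string,
  to: string,
  body: string,
): string {
  return createHash("sha256")
    .update(`${accountId}:${senderId}:${to}:${body}`, "utf8")
    .digest("hex");
}

// Precedence: BLOCK > HOLD (or suspended tenant) > FLAG > ALLOW.
// ALERT findings annotate the result but never change the verdict.
function finalVerdict(findings: Finding[], tenantSuspended: boolean): Verdict {
  if (findings.some((f) => f.action === "BLOCK")) return "BLOCK";
  if (tenantSuspended || findings.some((f) => f.action === "HOLD")) return "HOLD";
  if (findings.some((f) => f.action === "FLAG")) return "FLAG";
  return "ALLOW";
}
```

Because the fields are joined with a plain `:`, two different field combinations that contain colons could in principle produce the same input string; whether that matters depends on how the upstream fields are validated.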
Error codes:
| gRPC status | Condition |
|---|---|
| `INVALID_ARGUMENT` | Missing required field; malformed UUID or phone number |
| `INTERNAL` | Unhandled internal error (logged, not exposed). The NATS consumer treats this as a transient failure and retries. |
Fail-closed behaviour: On `INTERNAL` or gRPC deadline exceeded, the sms-orchestrator NATS consumer does not ACK the message. NATS JetStream redelivers after the ack wait (30 s). After 3 delivery attempts, the message moves to `sms.outbound.deadletter` with reason `compliance_unavailable`. The message is never dispatched to a carrier.
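The retry-then-dead-letter decision can be expressed as a small pure function on the delivery count. The sketch below is illustrative only: `onComplianceFailure` and the `FailureAction` type are hypothetical names, and it assumes the delivery count starts at 1 for the first attempt.

```typescript
type FailureAction =
  | { kind: "REDELIVER"; ackWaitSeconds: number }
  | { kind: "DEAD_LETTER"; subject: string; reason: string };

const MAX_DELIVERIES = 3;    // from the text: 3 delivery attempts
const ACK_WAIT_SECONDS = 30; // from the text: ack wait of 30 s

// Called when EvaluateCompliance fails with INTERNAL or a deadline error.
// deliveryCount is assumed to be 1 on the first delivery attempt.
function onComplianceFailure(deliveryCount: number): FailureAction {
  if (deliveryCount < MAX_DELIVERIES) {
    // Do not ACK; JetStream redelivers once the ack wait elapses.
    return { kind: "REDELIVER", ackWaitSeconds: ACK_WAIT_SECONDS };
  }
  // Attempts exhausted: park the message, never dispatch to a carrier.
  return {
    kind: "DEAD_LETTER",
    subject: "sms.outbound.deadletter",
    reason: "compliance_unavailable",
  };
}
```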
UC-02: Orchestrator Consumer — Message State Handler
Trigger: sms-orchestrator NATS consumer, after receiving a compliance verdict.
Steps:
| Verdict | Orchestrator action | sms_messages status | Tenant notification |
|---|---|---|---|
| ALLOW | Continue to routing-engine | ROUTING | none (success path) |
| FLAG | Continue to routing-engine with annotation | ROUTING (with `flagged: true`) | none |
| BLOCK | Do not route; mark terminal | BLOCKED | "Message blocked" alert via web portal |
| HOLD | Do not route; await review | ON_HOLD (with `holdId`) | "Message held for review" alert via web portal |
All state transitions write sms.events.status events for the audit trail.
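The verdict table maps naturally onto a lookup keyed by verdict. The following is a sketch under assumed names (`OrchestratorAction`, `actionFor`, and the field names are not from the source); it only encodes the four rows of the table.

```typescript
type Verdict = "ALLOW" | "FLAG" | "BLOCK" | "HOLD";

// Hypothetical shape describing what the orchestrator does per verdict.
interface OrchestratorAction {
  route: boolean;                            // hand over to routing-engine?
  status: "ROUTING" | "BLOCKED" | "ON_HOLD"; // sms_messages status
  flagged: boolean;                          // annotation on FLAG
  tenantAlert: string | null;                // web portal alert text, if any
}

const ACTIONS: Record<Verdict, OrchestratorAction> = {
  ALLOW: { route: true, status: "ROUTING", flagged: false, tenantAlert: null },
  FLAG: { route: true, status: "ROUTING", flagged: true, tenantAlert: null },
  BLOCK: { route: false, status: "BLOCKED", flagged: false, tenantAlert: "Message blocked" },
  HOLD: { route: false, status: "ON_HOLD", flagged: false, tenantAlert: "Message held for review" },
};

function actionFor(verdict: Verdict): OrchestratorAction {
  return ACTIONS[verdict];
}
```

Typing the table as `Record<Verdict, ...>` means the compiler rejects the map if a new verdict is added without a matching row.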
UC-03: ReviewHeldMessage (Admin REST API)
Trigger: Platform admin or auditor invokes POST /compliance/hold-queue/{holdId}/review
Authorization: Role platform.compliance.reviewer or platform.compliance.admin
Input: { action: 'RELEASE' | 'REJECT', notes: string }
Steps:
- Load `HeldMessage` from `compliance.hold_queue` by `holdId`. Return 404 if not found.
- Verify status = PENDING or REVIEWING. Return 409 Conflict if already reviewed/expired.
- Update status to REVIEWING (optimistic lock).
- If `action = RELEASE`:
  - Update `sms_messages` status: `ON_HOLD` → `ROUTING` (with `complianceOverride: true`, `releasedBy`, `releasedAt`)
  - Publish `sms.outbound.retry` NATS event (sms-orchestrator re-consumes and proceeds directly to routing — compliance re-evaluation is skipped on release unless `FORCE_RECHECK_ON_RELEASE=true`)
  - Update hold_queue status → REVIEWED_RELEASED
  - Publish `compliance.message.released` NATS event
- If `action = REJECT`:
  - Update `sms_messages` status: `ON_HOLD` → `BLOCKED` (terminal)
  - Update hold_queue status → REVIEWED_REJECTED
  - Publish `compliance.message.rejected` NATS event
- Write `compliance.audit_log` entry with before/after state, actor, IP, timestamp.
- Decrement `hold:queue:{tenantId}:size` Redis counter.
- `notification-service` consumes event and pushes alert to tenant's web portal.
Release pathway rationale: Once a human reviewer has approved release, re-evaluating compliance would create a loop (same rules would hold the message again). The reviewer's decision is the authoritative override, recorded in the audit log.
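The status guard and the two review outcomes form a small state machine, sketched here without any I/O. The `EXPIRED` status name, the `ReviewOutcome` shape, and the 200 success code are assumptions for illustration; the source names only the 404 and 409 cases and the two `REVIEWED_*` statuses.

```typescript
type HoldStatus =
  | "PENDING"
  | "REVIEWING"
  | "REVIEWED_RELEASED"
  | "REVIEWED_REJECTED"
  | "EXPIRED"; // assumed name for the expired state
type ReviewAction = "RELEASE" | "REJECT";

interface ReviewOutcome {
  httpStatus: number;                         // 200 on success is an assumption
  holdStatus: HoldStatus;                     // resulting hold_queue status
  messageStatus: "ROUTING" | "BLOCKED" | null; // resulting sms_messages status
  event: string | null;                       // NATS event to publish
}

function reviewHeldMessage(current: HoldStatus, action: ReviewAction): ReviewOutcome {
  // Only PENDING or REVIEWING items may be reviewed; anything else conflicts.
  if (current !== "PENDING" && current !== "REVIEWING") {
    return { httpStatus: 409, holdStatus: current, messageStatus: null, event: null };
  }
  if (action === "RELEASE") {
    return {
      httpStatus: 200,
      holdStatus: "REVIEWED_RELEASED",
      messageStatus: "ROUTING",
      event: "compliance.message.released",
    };
  }
  return {
    httpStatus: 200,
    holdStatus: "REVIEWED_REJECTED",
    messageStatus: "BLOCKED",
    event: "compliance.message.rejected",
  };
}
```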
UC-04: ManageComplianceRule (Admin REST CRUD)
Authorization: Role platform.compliance.admin
Create rule:
- Validate rule schema (type-specific config validation, regex compilation check, composite cycle check).
- Insert into `compliance.rules`; insert initial row into `compliance.rule_versions`.
- Invalidate Redis rule set caches for affected tenants.
- Publish `compliance.rule.changed` to NATS.
- Write audit log.
Update rule: Increment version; insert new rule_versions snapshot; invalidate caches; publish event; write audit log.
Enable / Disable rule: Set is_active; invalidate caches; publish event; write audit log.
Delete rule: Soft-delete (is_active = false, deleted_at set). Rules referenced by COMPOSITE rules cannot be deleted.
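The composite cycle check named in create-rule validation, together with the max-depth-5 limit from UC-01, can be sketched as a depth-first walk. This is an assumed implementation: the `RuleGraph` representation, the function name, and counting the root rule as depth 1 are all choices made for the example.

```typescript
// ruleId -> child rule ids (empty or absent for non-COMPOSITE rules).
type RuleGraph = Map<string, string[]>;
type CompositeProblem = "CYCLE" | "MAX_DEPTH_EXCEEDED";

const MAX_COMPOSITE_DEPTH = 5;

function validateComposite(
  rootId: string,
  graph: RuleGraph,
): { ok: boolean; reason: CompositeProblem | null } {
  const path = new Set<string>(); // rules on the current root-to-node path

  const visit = (id: string, depth: number): CompositeProblem | null => {
    if (path.has(id)) return "CYCLE"; // rule reaches itself again
    if (depth > MAX_COMPOSITE_DEPTH) return "MAX_DEPTH_EXCEEDED";
    path.add(id);
    for (const child of graph.get(id) ?? []) {
      const problem = visit(child, depth + 1);
      if (problem !== null) return problem; // abort on first problem
    }
    path.delete(id); // leaving this branch; shared children stay legal
    return null;
  };

  const reason = visit(rootId, 1);
  return { ok: reason === null, reason };
}
```

Tracking only the current path (rather than every visited node) lets two branches share a child rule without being reported as a cycle.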
UC-05: RecalculateTenantScores (Background Worker — every 15 min)
Trigger: Internal cron (@Cron('*/15 * * * *')) with distributed Redis lock for multi-replica safety.
For each tenant with activity in the past 7 days:
- Aggregate metrics from `evaluation_log`, `dlr_stats`, and operator feedback tables.
- Compute score dimensions:
  - `contentScore = max(0, 25 × (1 - violations_7d / max(messages_sent_7d, 1)))`
  - `volumeScore = max(0, 20 × (1 - rate_limit_violations_7d / max(messages_sent_7d, 1)))`
  - `dlrScore = 20 × dlr_success_rate`
  - `optoutScore = 15 × (1 - min(optout_rate / 0.05, 1))`
  - `complaintScore = 10 × (1 - min(complaint_rate / 0.01, 1))`
  - `tenureScore = min(10, account_age_days / 90)`
  - `overallScore = sum of all dimensions`
- Determine risk tier from `overallScore`: 80–100 CLEAR · 60–79 MONITOR · 30–59 RESTRICTED · 0–29 SUSPENDED.
- Detect tier transitions. On tier change: publish `compliance.tenant.tier.changed`; notify tenant admin via `notification-service`; if transition to SUSPENDED, publish `compliance.tenant.suspended`.
- Persist: UPSERT `tenant_compliance_scores`; INSERT `score_history`; `SET tenant:risk:{tenantId}` in Redis EX 900.
- Emit metric: `compliance_tenant_score{tenant_id}` gauge.
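The score dimensions and tier bands translate directly into code. The sketch below follows the formulas as written; the `TenantMetrics` field names and the treatment of fractional scores (a tier applies from its lower bound upward, so 79.5 is MONITOR) are assumptions.

```typescript
interface TenantMetrics {
  messagesSent7d: number;
  violations7d: number;
  rateLimitViolations7d: number;
  dlrSuccessRate: number; // 0..1
  optoutRate: number;     // 0..1
  complaintRate: number;  // 0..1
  accountAgeDays: number;
}

type RiskTier = "CLEAR" | "MONITOR" | "RESTRICTED" | "SUSPENDED";

function overallScore(m: TenantMetrics): number {
  const sent = Math.max(m.messagesSent7d, 1); // avoid division by zero
  const content = Math.max(0, 25 * (1 - m.violations7d / sent));
  const volume = Math.max(0, 20 * (1 - m.rateLimitViolations7d / sent));
  const dlr = 20 * m.dlrSuccessRate;
  const optout = 15 * (1 - Math.min(m.optoutRate / 0.05, 1));
  const complaint = 10 * (1 - Math.min(m.complaintRate / 0.01, 1));
  const tenure = Math.min(10, m.accountAgeDays / 90);
  return content + volume + dlr + optout + complaint + tenure;
}

// Bands: 80-100 CLEAR, 60-79 MONITOR, 30-59 RESTRICTED, 0-29 SUSPENDED.
function tierFor(score: number): RiskTier {
  if (score >= 80) return "CLEAR";
  if (score >= 60) return "MONITOR";
  if (score >= 30) return "RESTRICTED";
  return "SUSPENDED";
}
```

As a worked example, a tenant with 1000 messages, 20 violations, no rate-limit violations, a 95% DLR success rate, a 1% opt-out rate, a 0.2% complaint rate, and a 450-day-old account scores 24.5 + 20 + 19 + 12 + 8 + 5 = 88.5, which falls in the CLEAR tier.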
UC-06: ConsumeDeliveryReceiptEvent (NATS Consumer)
Subject: sms.dlr.inbound · Consumer group: compliance-engine-dlr
Steps:
- Deserialize `DlrEvent`.
- Map SMPP status codes to canonical `DlrStatus`.
- UPSERT `compliance.dlr_stats` for `(tenantId, accountId)` across three windows (1h, 24h, 7d).
- ACK NATS message on successful DB write.
UC-07: GenerateComplianceReport (On-demand + scheduled)
Trigger: POST /compliance/reports or daily cron at 02:00 UTC
Report types:
- TENANT_RANKING — tenants sorted by compliance score with drill-down links
- VIOLATION_SUMMARY — violations by rule type, category, and trend (7/30/90 day)
- HOLD_QUEUE_SUMMARY — pending / released / rejected / expired counts
- TIER_TRANSITIONS — tenants that changed tier in the period
- TOP_TRIGGERED_RULES — rules ranked by match count
- TENANT_AUDIT — full compliance evidence for a single tenant (regulatory export)
Output formats: JSON (API), CSV (export), PDF (future).
2. Rule Evaluation Performance Optimisation
Fast-path checks (no external calls)
Ordered first to enable early termination:
- GEO_RESTRICTION — country code from E.164 prefix
- TEMPORAL — current time in rule timezone
- SENDER_ID simple string checks
- KEYWORD exact-match (pre-loaded keyword sets in process memory)
Medium-path checks (Redis only)
- RATE_VOLUME — Redis sliding window counters
- DLR_ABUSE — Redis projection of `compliance.dlr_stats`
Slow-path checks (DB or external API)
- REGEX (re2 engine, linear time)
- AI_CLASSIFICATION — local LLM call (cache-checked first)
- COMPOSITE — recursive child rule evaluation
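One way to realise the three cost tiers is to sort rules by tier before evaluation. The source does not say how cost tier interacts with rule priority, so the sketch below makes an explicit assumption: cost tier first, then priority ASC within a tier. The `Rule` shape and `orderForEvaluation` are hypothetical names.

```typescript
type RuleType =
  | "GEO_RESTRICTION" | "TEMPORAL" | "SENDER_ID" | "KEYWORD" // fast path
  | "RATE_VOLUME" | "DLR_ABUSE"                              // medium path
  | "REGEX" | "AI_CLASSIFICATION" | "COMPOSITE";             // slow path

// 0 = no external calls, 1 = Redis only, 2 = slow path.
// SENDER_ID and KEYWORD are fast only for simple/exact-match variants.
const COST_TIER: Record<RuleType, number> = {
  GEO_RESTRICTION: 0,
  TEMPORAL: 0,
  SENDER_ID: 0,
  KEYWORD: 0,
  RATE_VOLUME: 1,
  DLR_ABUSE: 1,
  REGEX: 2,
  AI_CLASSIFICATION: 2,
  COMPOSITE: 2,
};

interface Rule {
  id: string;
  type: RuleType;
  priority: number; // lower value = evaluated earlier
}

// Assumption: cheaper tiers run first; priority breaks ties within a tier.
function orderForEvaluation(rules: Rule[]): Rule[] {
  return rules
    .slice() // do not mutate the cached rule list
    .sort((a, b) => COST_TIER[a.type] - COST_TIER[b.type] || a.priority - b.priority);
}
```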
Budget enforcement
Each EvaluateCompliance call has a 450 ms internal budget (leaving 50 ms margin for gRPC + response serialisation):
- If budget is exhausted mid-evaluation, remaining slow-path rules are skipped with a FLAG finding (`evidence: "skipped_budget_exceeded"`)
- Fail-closed applies to budget exhaustion for HOLD-eligible rules: if a HOLD-configured AI rule cannot run due to budget and its `fallbackAction` is HOLD, the verdict becomes HOLD
- Budget violations emit `compliance_evaluation_budget_exceeded_total` metric
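The budget rules above can be sketched as a function that, given the elapsed time and the slow-path rules still pending, produces the skip findings and decides whether the verdict must be forced to HOLD. The names `PendingRule`, `SkipResult`, and `skipForBudget` are hypothetical, and treating a missing `fallbackAction` as HOLD follows the stated default.

```typescript
const BUDGET_MS = 450; // internal budget per EvaluateCompliance call

type FallbackAction = "HOLD" | "FLAG";

interface PendingRule {
  id: string;
  type: "REGEX" | "AI_CLASSIFICATION" | "COMPOSITE";
  fallbackAction?: FallbackAction; // only meaningful for AI rules
}

interface SkipFinding {
  ruleId: string;
  action: "FLAG";
  evidence: "skipped_budget_exceeded";
}

interface SkipResult {
  findings: SkipFinding[];
  forceHold: boolean;     // fail-closed: verdict becomes HOLD
  budgetExceeded: boolean; // caller increments the budget metric when true
}

function skipForBudget(elapsedMs: number, remaining: PendingRule[]): SkipResult {
  if (elapsedMs < BUDGET_MS || remaining.length === 0) {
    return { findings: [], forceHold: false, budgetExceeded: false };
  }
  const findings = remaining.map((r): SkipFinding => ({
    ruleId: r.id,
    action: "FLAG",
    evidence: "skipped_budget_exceeded",
  }));
  // A skipped AI rule whose fallback is HOLD (the default) forces HOLD.
  const forceHold = remaining.some(
    (r) => r.type === "AI_CLASSIFICATION" && (r.fallbackAction ?? "HOLD") === "HOLD",
  );
  return { findings, forceHold, budgetExceeded: true };
}
```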
3. Hold Queue Priority Scoring
When a message is placed in the hold queue, its review_priority is computed so auditors see the most important items first:
priority = (tenant_risk_score_inverse × 40) // lower tenant score → higher urgency
+ (rule_severity_weight × 35) // BLOCK-triggering rules → higher urgency
+ (volume_spike_indicator × 15) // spike → higher urgency
+ (recency_decay × 10) // newer → higher urgency
Rule severity weights:
- TERRORISM, PHISHING → weight 10
- SPAM, FINANCIAL_FRAUD → weight 8
- ADULT_CONTENT, GAMBLING → weight 6
- other rule types → weight 4
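The priority formula leaves the scale of each factor unstated. The sketch below assumes every factor is normalised to the range 0 to 1 before weighting, so the result lies between 0 and 100: the tenant score inverse is taken as `(100 - score) / 100`, the severity weight is divided by 10, the spike indicator is 0 or 1, and the recency decay is supplied by the caller. These normalisations and all names are assumptions, not part of the specification.

```typescript
// Severity weights from the list above; anything else falls back to 4.
const SEVERITY_WEIGHT = new Map<string, number>([
  ["TERRORISM", 10],
  ["PHISHING", 10],
  ["SPAM", 8],
  ["FINANCIAL_FRAUD", 8],
  ["ADULT_CONTENT", 6],
  ["GAMBLING", 6],
]);
const DEFAULT_SEVERITY_WEIGHT = 4;

interface PriorityInput {
  tenantScore: number;   // overallScore, 0..100
  category: string;      // category of the triggering rule
  volumeSpike: boolean;  // spike detected for this tenant
  recencyDecay: number;  // 0..1, 1 = just held (assumed to be precomputed)
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

function reviewPriority(input: PriorityInput): number {
  const riskInverse = clamp01((100 - input.tenantScore) / 100);
  const weight = SEVERITY_WEIGHT.get(input.category) ?? DEFAULT_SEVERITY_WEIGHT;
  const severity = weight / 10; // assumed normalisation to 0..1
  const spike = input.volumeSpike ? 1 : 0;
  return riskInverse * 40 + severity * 35 + spike * 15 + clamp01(input.recencyDecay) * 10;
}
```

Under these assumptions, a freshly held PHISHING message from a tenant scoring 25 during a volume spike gets 30 + 35 + 15 + 10 = 90, while an uncategorised message from a tenant scoring 90 with no spike and a decay of 0.5 gets 4 + 14 + 0 + 5 = 23.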