compliance-engine — Testing Strategy

Status: populated | Last updated: 2026-04-18

1. Test Pyramid

                    ┌──────────────────────┐
                    │   E2E (3–5 flows)    │  Playwright or k6 smoke
                    ├──────────────────────┤
                    │  Integration (50+)   │  Real PG + Redis + mock AI
                    ├──────────────────────┤
                    │   Unit (200+)        │  Every rule type, scorer, and algorithm
                    └──────────────────────┘

Coverage target: 85% line / 80% branch
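
One way to enforce this target is Jest's `coverageThreshold` option, which fails the run when global coverage drops below the configured percentages. The sketch below is illustrative only: the local `CoverageConfig` type is a stand-in so the snippet is self-contained, and a real `jest.config.ts` would more likely use Jest's own `Config` type and export the object as its default.

```typescript
// Stand-in for the subset of Jest options used here (assumption: a real
// jest.config.ts would import the `Config` type from 'jest' instead).
interface CoverageConfig {
  preset: string;
  collectCoverage: boolean;
  coverageThreshold: { global: { lines: number; branches: number } };
}

// Jest exits non-zero when global coverage falls below these percentages.
const jestCoverageConfig: CoverageConfig = {
  preset: 'ts-jest',
  collectCoverage: true,
  coverageThreshold: {
    global: { lines: 85, branches: 80 },
  },
};
```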

2. Unit Tests

Framework: Jest + ts-jest

2.1 Rule Evaluator Tests (per rule type)

Each rule type has a dedicated test file exercising all positive, negative, and edge-case conditions.

KEYWORD rule

describe('KeywordRuleEvaluator', () => {
  it('matches when keyword found in body (case-insensitive)', async () => {
    const rule = buildRule({ type: 'KEYWORD', action: 'BLOCK', config: { keywordListIds: ['list-1'] } });
    const msg = buildMessage({ body: 'Win a CASINO bonus today!' });
    const result = await evaluator.evaluate(rule, msg, keywordSets);
    expect(result.matched).toBe(true);
    expect(result.evidence).toContain('casino');
  });

  it('does not match when body contains no listed keywords', async () => { ... });
  it('matchAll=true requires all keywords to be present', async () => { ... });
  it('returns no-match for empty body', async () => { ... });
});

REGEX rule

describe('RegexRuleEvaluator', () => {
  it('matches valid regex pattern against body', async () => { ... });
  it('rejects regex that does not compile at save time', async () => { ... });
  it('does not hang on ReDoS-vulnerable regex (times out within 10ms)', async () => { ... });
  it('negate=true matches when pattern is NOT found', async () => { ... });
});
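
The save-time compile check exercised above can be as small as a guarded `new RegExp` call. This is a sketch under assumptions: `validateRegexPattern` and its result shape are hypothetical names, and it does not address the ReDoS timeout, which needs a separate execution guard.

```typescript
// Hypothetical save-time validator: reports whether a pattern compiles.
type RegexValidation = { ok: true } | { ok: false; error: string };

function validateRegexPattern(pattern: string, flags = ''): RegexValidation {
  try {
    // The RegExp constructor throws a SyntaxError for invalid patterns or flags.
    new RegExp(pattern, flags);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A rule save handler could call this before persisting and reject the request when `ok` is false.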

AI_CLASSIFICATION rule

describe('AiClassificationRuleEvaluator', () => {
  it('triggers HOLD when AI confidence exceeds threshold', async () => {
    mockAiClient.classify.mockResolvedValue({ PHISHING: 0.92, SPAM: 0.1 });
    const rule = buildRule({ type: 'AI_CLASSIFICATION', config: { categories: ['PHISHING'], minConfidence: 0.75 } });
    const result = await evaluator.evaluate(rule, buildMessage());
    expect(result.matched).toBe(true);
    expect(result.confidence).toBeCloseTo(0.92);
  });

  it('applies fallbackAction=HOLD when AI service throws', async () => { ... });
  it('applies fallbackAction=SKIP when configured', async () => { ... });
  it('uses cached result on second call with same body hash', async () => { ... });
});
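
The threshold check and the body-hash cache key that these tests rely on might look roughly like the following. The shapes are inferred from the test above; `bodyHash`, `matchClassification`, the SHA-256 choice, and the inclusive comparison against `minConfidence` are all assumptions rather than the engine's actual implementation.

```typescript
import { createHash } from 'node:crypto';

// Shapes inferred from the test above; anything beyond those fields is assumed.
type CategoryScores = Record<string, number>;
interface AiRuleConfig { categories: string[]; minConfidence: number }
interface AiMatch { matched: boolean; confidence: number }

// Cache key derived from the message body, so identical bodies can reuse
// one classification result (assumption: SHA-256 over the UTF-8 body).
function bodyHash(body: string): string {
  return createHash('sha256').update(body, 'utf8').digest('hex');
}

// Matches when any configured category reaches minConfidence (assumed inclusive)
// and reports the highest score among the configured categories.
function matchClassification(scores: CategoryScores, config: AiRuleConfig): AiMatch {
  const confidence = Math.max(0, ...config.categories.map((c) => scores[c] ?? 0));
  return { matched: confidence >= config.minConfidence, confidence };
}
```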

COMPOSITE rule

describe('CompositeRuleEvaluator', () => {
  it('AND: matches only when all children match', async () => { ... });
  it('OR: matches when any child matches', async () => { ... });
  it('NOT: inverts child result', async () => { ... });
  it('respects max depth of 5 levels', async () => { ... });
  it('detects and aborts cyclic composite rules at save time', async () => { ... });
  it('detects runtime cycle and returns FLAG finding', async () => { ... });
});
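
To make the AND/OR/NOT semantics, the depth limit, and save-time cycle detection concrete, here is a small sketch over an in-memory rule map. The node shapes, `validateComposite`, and `evaluateComposite` are hypothetical, and counting the root composite as level 1 is an assumption about how the 5-level limit is measured.

```typescript
type CompositeOp = 'AND' | 'OR' | 'NOT';
interface LeafNode { kind: 'leaf'; id: string; matched: boolean }
interface CompositeNode { kind: 'composite'; id: string; op: CompositeOp; childIds: string[] }
type RuleNode = LeafNode | CompositeNode;

const MAX_DEPTH = 5; // from the test above; root composite counted as level 1 (assumption)

// Save-time validation: returns an error message, or null when the tree is acceptable.
function validateComposite(rootId: string, rules: Map<string, RuleNode>): string | null {
  const onPath = new Set<string>(); // composites on the current DFS path
  const visit = (id: string, depth: number): string | null => {
    if (onPath.has(id)) return `cycle detected at rule ${id}`;
    const node = rules.get(id);
    if (!node) return `unknown rule ${id}`;
    if (node.kind === 'leaf') return null;
    if (depth > MAX_DEPTH) return `composite nesting exceeds ${MAX_DEPTH} levels`;
    if (node.op === 'NOT' && node.childIds.length !== 1) return `NOT rule ${id} needs exactly one child`;
    onPath.add(id);
    for (const childId of node.childIds) {
      const err = visit(childId, depth + 1);
      if (err) return err;
    }
    onPath.delete(id);
    return null;
  };
  return visit(rootId, 1);
}

// Evaluation over a tree that has already passed validateComposite.
function evaluateComposite(id: string, rules: Map<string, RuleNode>): boolean {
  const node = rules.get(id);
  if (!node) throw new Error(`unknown rule ${id}`);
  if (node.kind === 'leaf') return node.matched;
  const results = node.childIds.map((childId) => evaluateComposite(childId, rules));
  switch (node.op) {
    case 'AND': return results.every(Boolean);
    case 'OR': return results.some(Boolean);
    case 'NOT': return !results[0];
  }
}
```

Tracking only the current DFS path (rather than every visited node) lets a rule be shared by two branches without being misreported as a cycle.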

RATE_VOLUME rule

describe('RateVolumeRuleEvaluator', () => {
  it('triggers when message count exceeds window threshold', async () => { ... });
  it('does not trigger when under threshold', async () => { ... });
  it('respects scope=SENDER_ID independently per sender', async () => { ... });
  it('degrades to FLAG when Redis is unavailable', async () => { ... });
});
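
The per-sender counting and the degrade-to-FLAG behaviour can be illustrated with a fixed-window counter. This is a simplified stand-in: the source uses Redis, whereas `InMemoryCounterStore`, `CounterStore`, `checkRate`, and the key format here are hypothetical, and a Redis-backed store would be asynchronous.

```typescript
type RateOutcome = 'TRIGGERED' | 'OK' | 'FLAG';

// Hypothetical counter store; a Redis-backed version would be async.
interface CounterStore {
  increment(key: string): number; // returns the count after incrementing
}

class InMemoryCounterStore implements CounterStore {
  private counts = new Map<string, number>();
  increment(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

interface RateVolumeConfig { threshold: number; windowMs: number }

function checkRate(
  store: CounterStore,
  senderId: string,
  nowMs: number,
  config: RateVolumeConfig,
): RateOutcome {
  // Fixed-window bucket: messages in the same window share one key per sender,
  // so each sender is counted independently.
  const bucket = Math.floor(nowMs / config.windowMs);
  const key = `rate:${senderId}:${bucket}`;
  try {
    return store.increment(key) > config.threshold ? 'TRIGGERED' : 'OK';
  } catch {
    // Store unavailable: degrade to FLAG rather than failing the evaluation.
    return 'FLAG';
  }
}
```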

DLR_ABUSE rule

describe('DlrAbuseRuleEvaluator', () => {
  it('triggers when failure_rate exceeds threshold', async () => { ... });
  it('does not trigger when sample_size < minSampleSize', async () => { ... });
  it('evaluates correct window (1h vs 24h)', async () => { ... });
});
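
The interaction between the failure-rate threshold and the minimum sample size is easy to get wrong, so a sketch may help. The field names below (`delivered`, `failed`, `failureRateThreshold`) are assumptions; only `minSampleSize` and the "exceeds threshold" wording come from the tests above.

```typescript
interface DlrWindowStats { delivered: number; failed: number }
interface DlrAbuseConfig { failureRateThreshold: number; minSampleSize: number }

function dlrAbuseTriggered(stats: DlrWindowStats, config: DlrAbuseConfig): boolean {
  const sampleSize = stats.delivered + stats.failed;
  // Too few delivery receipts in the window: the rate is not statistically meaningful.
  if (sampleSize < config.minSampleSize) return false;
  // Guard against division by zero when minSampleSize is configured as 0.
  if (sampleSize === 0) return false;
  const failureRate = stats.failed / sampleSize;
  return failureRate > config.failureRateThreshold;
}
```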

GEO_RESTRICTION, TEMPORAL, SENDER_ID, RECIPIENT rules

Each has 10–15 unit tests covering core logic, edge cases, and failure paths.

2.2 Evaluation Engine Tests

describe('ComplianceEvaluationEngine', () => {
  it('returns ALLOW immediately when matching ALLOW rule found (whitelist fast-path)', async () => { ... });
  it('returns BLOCK on first matching BLOCK rule', async () => { ... });
  it('returns HOLD when no BLOCK but HOLD rule matches', async () => { ... });
  it('returns FLAG when only FLAG rules match', async () => { ... });
  it('returns ALLOW when no rules match', async () => { ... });
  it('auto-HOLDs messages for SUSPENDED tenant regardless of rules', async () => { ... });
  it('evaluates rules in priority order (lower number first)', async () => { ... });
  it('collects all FLAG findings even after BLOCK verdict', async () => { ... });
  it('enforces evaluation budget and skips remaining rules on timeout', async () => { ... });
  it('returns cached verdict on repeated fingerprint within dedup window', async () => { ... });
});
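
The verdict precedence these tests describe can be summarised in a small resolver. This sketch operates on rules that have already matched, so it leaves out the early exit, the evaluation budget, and the dedup cache. `resolveVerdict` and its types are hypothetical names, and checking the SUSPENDED state before the ALLOW fast-path is an interpretation of "regardless of rules".

```typescript
type Action = 'ALLOW' | 'BLOCK' | 'HOLD' | 'FLAG';
interface MatchedRule { id: string; priority: number; action: Action }
interface Outcome { verdict: Action; flagFindings: string[] }

function resolveVerdict(matched: MatchedRule[], tenantSuspended: boolean): Outcome {
  // Lower priority number is evaluated first.
  const ordered = [...matched].sort((a, b) => a.priority - b.priority);
  // FLAG findings are collected regardless of the final verdict.
  const flagFindings = ordered.filter((r) => r.action === 'FLAG').map((r) => r.id);

  // SUSPENDED tenants are held regardless of rules.
  if (tenantSuspended) return { verdict: 'HOLD', flagFindings };

  // Whitelist fast-path: a matching ALLOW rule wins over BLOCK and HOLD.
  if (ordered.some((r) => r.action === 'ALLOW')) return { verdict: 'ALLOW', flagFindings: [] };

  // Otherwise the strongest matching action decides: BLOCK, then HOLD, then FLAG.
  for (const verdict of ['BLOCK', 'HOLD', 'FLAG'] as const) {
    if (ordered.some((r) => r.action === verdict)) return { verdict, flagFindings };
  }
  return { verdict: 'ALLOW', flagFindings };
}
```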

2.3 Tenant Scoring Tests

describe('TenantScoringService', () => {
  it('computes CLEAR tier for compliant tenant', async () => { ... });
  it('computes SUSPENDED tier when score < 30', async () => { ... });
  it('detects tier transition from MONITOR to RESTRICTED', async () => { ... });
  it('does not trigger tier transition when score changes within same tier', async () => { ... });
  it('clamps score to 0–100 range', async () => { ... });
  it('applies tenure bonus correctly for new vs established accounts', async () => { ... });
});
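
Clamping, tier mapping, and transition detection could be sketched as below. Only the four tier names and the SUSPENDED cut-off (score < 30) come from the tests above. The CLEAR and MONITOR bounds in `EXAMPLE_BOUNDS` are placeholders chosen for illustration, and the function names are hypothetical.

```typescript
type Tier = 'CLEAR' | 'MONITOR' | 'RESTRICTED' | 'SUSPENDED';

// Inclusive lower bounds per tier. Only restrictedMin = 30 (i.e. SUSPENDED < 30)
// is taken from the tests; clearMin and monitorMin are placeholder values.
interface TierBounds { clearMin: number; monitorMin: number; restrictedMin: number }
const EXAMPLE_BOUNDS: TierBounds = { clearMin: 80, monitorMin: 60, restrictedMin: 30 };

// Keeps the score inside the 0-100 range.
function clampScore(raw: number): number {
  return Math.min(100, Math.max(0, raw));
}

function tierForScore(raw: number, bounds: TierBounds = EXAMPLE_BOUNDS): Tier {
  const score = clampScore(raw);
  if (score >= bounds.clearMin) return 'CLEAR';
  if (score >= bounds.monitorMin) return 'MONITOR';
  if (score >= bounds.restrictedMin) return 'RESTRICTED';
  return 'SUSPENDED';
}

// Returns the transition, or null when the score moved within the same tier.
function tierTransition(
  prevScore: number,
  nextScore: number,
  bounds: TierBounds = EXAMPLE_BOUNDS,
): { from: Tier; to: Tier } | null {
  const from = tierForScore(prevScore, bounds);
  const to = tierForScore(nextScore, bounds);
  return from === to ? null : { from, to };
}
```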

2.4 Property-Based Tests (fast-check)

// Rule evaluation determinism: same input always produces same verdict
fc.assert(fc.asyncProperty(
  fc.record({ body: fc.string(), to: e164Arb, senderId: fc.string() }),
  async (msg) => {
    const result1 = await engine.evaluate(msg);
    const result2 = await engine.evaluate(msg);
    return result1.verdict === result2.verdict;
  }
));

// ALLOW rules always override BLOCK rules
fc.assert(fc.asyncProperty(
  arbRuleSet(),
  async (rules) => {
    // If an ALLOW rule matches, verdict must not be BLOCK or HOLD
    ...
  }
));

3. Integration Tests

Framework: Jest + real PostgreSQL (testcontainers) + real Redis (testcontainers) + mock AI client

3.1 gRPC Handler Integration

describe('EvaluateCompliance gRPC — integration', () => {
  beforeAll(async () => {
    db = await startTestPostgres();
    redis = await startTestRedis();
    await runMigrations(db);
    await seedDefaultRuleSet(db);
    server = await startComplianceEngine({ db, redis });
    client = createGrpcClient('localhost:50052');
  });

  it('returns ALLOW for clean message against default rule set', async () => {
    const response = await client.evaluateCompliance(buildRequest());
    expect(response.verdict).toBe('ALLOW');
    expect(response.findings).toHaveLength(0);
  });

  it('returns BLOCK when body matches active KEYWORD rule', async () => {
    await seedRule(db, { type: 'KEYWORD', action: 'BLOCK', keywords: ['test_block_word'] });
    const response = await client.evaluateCompliance(buildRequest({ body: 'test_block_word present' }));
    expect(response.verdict).toBe('BLOCK');
    expect(response.findings[0].ruleType).toBe('KEYWORD');
  });

  it('inserts hold_queue row when verdict is HOLD', async () => {
    await seedRule(db, { type: 'KEYWORD', action: 'HOLD', keywords: ['test_hold_word'] });
    await client.evaluateCompliance(buildRequest({ body: 'test_hold_word present' }));
    const hold = await db.query('SELECT * FROM compliance.hold_queue WHERE status = $1', ['PENDING']);
    expect(hold.rows).toHaveLength(1);
  });

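  it('returns cached verdict on second call within dedup window', async () => {
    const req = buildRequest();
    await client.evaluateCompliance(req);  // first call populates the dedup cache
    // Assumes the engine runs in-process so ruleEngine can be spied on.
    const spy = jest.spyOn(ruleEngine, 'evaluate');
    try {
      await client.evaluateCompliance(req);
      expect(spy).not.toHaveBeenCalled();  // served from cache
    } finally {
      spy.mockRestore();  // do not leak the spy into later tests
    }
  });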

  it('returns INTERNAL (fail-closed) when DB is unavailable and rule cache is expired', async () => { ... });
});

3.2 Hold Queue Integration

describe('Hold Queue management — integration', () => {
  it('RELEASE action re-publishes message to sms.outbound.request NATS subject', async () => { ... });
  it('REJECT action updates status and does not re-publish', async () => { ... });
  it('auto-expiry job moves PENDING holds past TTL to AUTO_EXPIRED', async () => { ... });
  it('bulk-review endpoint rejects all PENDING holds for a tenant', async () => { ... });
});
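
The auto-expiry behaviour can be pictured as a pure function over hold records. `PENDING` and `AUTO_EXPIRED` appear in the tests above. The `RELEASED` and `REJECTED` status names, the `Hold` shape, and `expireHolds` are assumptions, and the real job presumably runs as a database update rather than in memory.

```typescript
type HoldStatus = 'PENDING' | 'RELEASED' | 'REJECTED' | 'AUTO_EXPIRED';
interface Hold { id: string; status: HoldStatus; createdAtMs: number }

// Moves PENDING holds older than the TTL to AUTO_EXPIRED; reviewed holds are untouched.
function expireHolds(holds: Hold[], nowMs: number, ttlMs: number): Hold[] {
  return holds.map((hold): Hold =>
    hold.status === 'PENDING' && nowMs - hold.createdAtMs > ttlMs
      ? { ...hold, status: 'AUTO_EXPIRED' }
      : hold,
  );
}
```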

3.3 Tenant Scoring Integration

describe('Tenant scoring — integration', () => {
  it('full scoring cycle updates tenant_compliance_scores and score_history', async () => { ... });
  it('publishes compliance.tenant.tier.changed event on tier transition', async () => { ... });
  it('sets Redis tenant:risk cache after scoring', async () => { ... });
});

4. Contract Tests

Verify the gRPC proto contract between compliance-engine and sms-orchestrator:

Pact or gRPC reflection-based contract: sms-orchestrator publishes its expected request/response shapes; compliance-engine verifies it can satisfy those shapes.
Run as part of CI on every PR.

5. Load Tests

Framework: k6 or ghz

5.1 Evaluation throughput

ghz --proto src/proto/compliance.proto \
    --call ghasi.sms.compliance.v1.ComplianceService.EvaluateCompliance \
    --data-file ./test/load/sample_request.json \
    --concurrency 100 \
    --total 50000 \
    localhost:50052

Pass criteria:

P95 latency < 80 ms (gRPC round trip)
P99 latency < 150 ms
Error rate < 0.1%
No memory leak over 10-minute run
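
The latency and error-rate criteria can be checked mechanically once per-request latencies are available. A sketch, assuming nearest-rank percentiles and a hypothetical `LoadResult` shape; the memory-leak criterion needs separate process monitoring and is not covered here.

```typescript
// Hypothetical summary of one load run.
interface LoadResult { latenciesMs: number[]; errorCount: number; totalRequests: number }

// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Applies the pass criteria listed above: P95 < 80 ms, P99 < 150 ms, error rate < 0.1%.
function meetsPassCriteria(result: LoadResult): boolean {
  const errorRate = result.errorCount / result.totalRequests;
  return (
    percentile(result.latenciesMs, 95) < 80 &&
    percentile(result.latenciesMs, 99) < 150 &&
    errorRate < 0.001
  );
}
```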

5.2 AI classification under load

Make 10% of requests carry bodies that trigger AI_CLASSIFICATION rules
Verify AI cache hit rate > 90% for repeated OTP/template bodies
Verify AI fallback fires correctly under simulated AI service degradation

6. Security Tests

ReDoS test: Feed 100 known ReDoS patterns to the rule save API; verify all are rejected.
SQL injection: Feed SQL-injection strings as rule condition values; verify they are stored verbatim as data and never executed as SQL.
Role escalation: Verify platform.compliance.reviewer cannot call POST /compliance/rules.
Cross-tenant hold access: Verify account.admin of tenant A cannot view holds of tenant B.
Audit log immutability: Attempt UPDATE/DELETE on audit_log rows; verify no rows are modified.
