SMS Firewall Service — Testing Strategy

Version: 1.0
Status: Draft
Owner: Trust & Safety + QA
Last Updated: 2026-04-21
Companion: SERVICE_OVERVIEW · FAILURE_MODES · SERVICE_READINESS

1. Test pyramid & coverage targets

                ┌──────────────────────────┐
                │   E2E (5–8 flows)        │  Playwright/k6 + ephemeral cluster
                ├──────────────────────────┤
                │  Performance (5+)         │  k6 / ghz; on every PR + nightly
                ├──────────────────────────┤
                │  Contract (10+)           │  Pact provider + JSON-Schema
                ├──────────────────────────┤
                │  Integration (60+)        │  Real PG + Redis + NATS via testcontainers
                ├──────────────────────────┤
                │   Unit (300+)             │  Per use-case, per rule type, per pipeline stage
                └──────────────────────────┘

Layer	Target
Unit	≥ 90% line, ≥ 85% branch overall; ≥ 95% branch on rule-evaluator
Integration	≥ 80% of HTTP and gRPC handlers exercised end-to-end against real infra
Contract	100% of event subjects + every gRPC RPC consumer × producer pair
E2E	5–8 critical user flows (FW-US-001, 007, 011, 012, 018, 019)
Performance	P95/P99 SLO assertions per CI run

CI gates:

PRs touching src/domain/** or src/application/** block on coverage regression > 1%.
PRs touching src/domain/rule-evaluator/** additionally block on firewall_filter_inbound_p95 > 30 ms in the regression load test.
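The two gates above can be sketched as a small pure function. This is only an illustration: the `CoverageSummary` and `GateInput` shapes and the `evaluateCiGates` name are hypothetical, and the sketch assumes that "regression > 1%" means a drop of more than one percentage point.

```typescript
// Hypothetical shapes for illustration; the real CI tooling may differ.
interface CoverageSummary {
  linePct: number;
  branchPct: number;
}

interface GateInput {
  baseline: CoverageSummary;         // coverage on the target branch
  current: CoverageSummary;          // coverage on the PR
  touchesRuleEvaluator: boolean;     // PR changes src/domain/rule-evaluator/**
  filterInboundP95Ms: number | null; // regression load test result, if it ran
}

// Assumption: "1%" is read as one percentage point, not a relative change.
const MAX_COVERAGE_DROP_POINTS = 1;
const MAX_FILTER_INBOUND_P95_MS = 30;

// Returns the list of gate failures; an empty list means the PR may proceed.
function evaluateCiGates(input: GateInput): string[] {
  const failures: string[] = [];
  const lineDrop = input.baseline.linePct - input.current.linePct;
  const branchDrop = input.baseline.branchPct - input.current.branchPct;

  if (lineDrop > MAX_COVERAGE_DROP_POINTS) {
    failures.push(`line coverage dropped by ${lineDrop.toFixed(2)} points`);
  }
  if (branchDrop > MAX_COVERAGE_DROP_POINTS) {
    failures.push(`branch coverage dropped by ${branchDrop.toFixed(2)} points`);
  }

  // The latency gate applies only to rule-evaluator changes.
  if (input.touchesRuleEvaluator) {
    if (input.filterInboundP95Ms === null) {
      failures.push('regression load test result is missing');
    } else if (input.filterInboundP95Ms > MAX_FILTER_INBOUND_P95_MS) {
      failures.push(`firewall_filter_inbound_p95 is ${input.filterInboundP95Ms} ms`);
    }
  }
  return failures;
}
```

Returning the failure reasons rather than a boolean lets the CI job print every violated gate in one run.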

2. Unit tests (Jest + ts-jest)

2.1 Per rule type

Each rule type has a dedicated test file in src/domain/rule-evaluator/__tests__/:

describe('OriginBlocklistEvaluator', () => {
  it('returns BLOCK with ORIGIN_BLOCKLIST when srcMsisdn matches active entry', async () => {
    const entry = buildBlocklistEntry({ type: 'MSISDN', value: '+93701234567' });
    const ctx = buildMoContext({ srcMsisdn: '+93701234567' });
    const result = await evaluator.evaluate(ctx, [entry]);
    expect(result.verdict).toBe('BLOCK');
    expect(result.blockReason).toBe('ORIGIN_BLOCKLIST');
  });

  it('does not match when entry is inactive', async () => { ... });
  it('matches MSISDN_RANGE prefix entries', async () => { ... });
  it('matches PEER_ASN entries against TransitMtContext', async () => { ... });
});

describe('RegexRuleEvaluator', () => {
  it('compiles against re2 engine', async () => { ... });
  it('rejects pattern > 500 chars at admission', async () => { ... });
  it('does not hang on known ReDoS patterns (terminates within 50ms)', async () => { ... });
  it('auto-disables rule on per-call timeout', async () => { ... });
});

describe('RateVolumeEvaluator', () => {
  it('triggers when ZADD count exceeds threshold (10/1s)', async () => {
    const redis = await testRedis();
    for (let i = 0; i < 10; i++) await evaluator.recordEvent(redis, '+93701234567', 1000+i);
    const v1 = await evaluator.evaluate(redis, '+93701234567', 1010);   // 11th in 1s
    expect(v1.verdict).toBe('BLOCK');
    expect(v1.blockReason).toBe('RATE_EXCEEDED');
  });

  it('uses elevated threshold for rate_overrides entries', async () => { ... });
  it('emits RATE_GOVERNOR_DEGRADED flag when Redis throws', async () => { ... });
});

describe('GeoRestrictionEvaluator', () => {
  it('BLOCKs +1 src over awcc-rx-01 with permittedCountryCodes=[+93]', async () => { ... });
  it('passes +93 src over awcc-rx-01', async () => { ... });
  it('flags NUMINT_UNAVAILABLE on number-intel UNAVAILABLE', async () => { ... });
});

describe('GreyRouteEvaluator', () => {
  it('BLOCKs transit MT to AWCC subscriber from peer ASN 64500 not in peer_mno_routes',
     async () => { ... });
  it('emits firewall.alert.greyroute.heuristic.v1 when peer >30% non-peered', async () => { ... });
});

describe('SenderIdVerifyEvaluator', () => {
  it('BLOCKs when sender-id-registry returns OWNERSHIP_MISMATCH', async () => { ... });
  it('QUARANTINEs when sender-id-registry returns UNKNOWN', async () => { ... });
  it('falls back to local cache when registry UNAVAILABLE', async () => { ... });
});

describe('CompositeEvaluator', () => {
  it('AND: matches only when all children match', async () => { ... });
  it('detects cycles at admission', async () => { ... });
  it('depth limit 4 enforced at runtime', async () => { ... });
});

describe('ClassifierEvaluator', () => {
  it('caps standalone classifier at FLAG (per moderation policy)', async () => { ... });
  it('returns FIREWALL_FAIRNESS_BIAS_DETECTED on per-MNO-block-rate stdev > 0.5', async () => { ... });
  it('respects shadowMode=true (computes but does not affect verdict)', async () => { ... });
});
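For readers who want to see the behaviour the `RateVolumeEvaluator` test pins down, here is a minimal in-memory sketch of a sliding-window counter. It is a stand-in for the Redis sorted-set approach, not the service's implementation: the `InMemoryRateWindow` class is hypothetical, and it assumes that `evaluate` counts the event being evaluated, so the 11th event inside one second exceeds a threshold of 10.

```typescript
type RateVerdict =
  | { verdict: 'ALLOW' }
  | { verdict: 'BLOCK'; blockReason: 'RATE_EXCEEDED' };

// Hypothetical in-memory stand-in for a Redis sorted set keyed by MSISDN.
class InMemoryRateWindow {
  private readonly events = new Map<string, number[]>();
  private readonly threshold: number;
  private readonly windowMs: number;

  constructor(threshold: number, windowMs: number) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // Records one event timestamp (milliseconds) for the key.
  recordEvent(key: string, atMs: number): void {
    const list = this.events.get(key) ?? [];
    list.push(atMs);
    this.events.set(key, list);
  }

  // Records the current event, drops timestamps outside the window,
  // then blocks if the remaining count exceeds the threshold.
  evaluate(key: string, nowMs: number): RateVerdict {
    this.recordEvent(key, nowMs);
    const cutoff = nowMs - this.windowMs;
    const live = (this.events.get(key) ?? []).filter((t) => t > cutoff);
    this.events.set(key, live);
    return live.length > this.threshold
      ? { verdict: 'BLOCK', blockReason: 'RATE_EXCEEDED' }
      : { verdict: 'ALLOW' };
  }
}
```

Trimming on every evaluation keeps the per-key list bounded by the traffic seen inside one window, which is the same effect a sorted-set range removal would have.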

2.2 Verdict pipeline (orchestration)

describe('FirewallEvaluationPipeline', () => {
  it('returns ALLOW immediately when matching ALLOW rule found (whitelist short-circuit)', async () => { ... });
  it('returns BLOCK on first matching BLOCK rule in priority order', async () => { ... });
  it('returns QUARANTINE when no BLOCK but QUARANTINE rule matches', async () => { ... });
  it('returns FLAG when only FLAG rules match', async () => { ... });
  it('serves cached verdict on repeat fingerprint within TTL', async () => { ... });
  it('skips classifier rules in PANIC mode', async () => { ... });
  it('respects maintenance-mode short-circuit (returns ALLOW + MAINTENANCE_MODE flag)', async () => { ... });
  it('writes audit row with hash-chained prev_hash/row_hash', async () => { ... });
  it('encrypts PDU body in quarantine_queue under per-MNO KEK', async () => { ... });
});
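The first four pipeline tests imply a verdict precedence, which can be sketched as a pure function over the rules that matched. This is an assumption-laden illustration: `RuleMatch`, `PipelineResult`, and `resolveVerdict` are hypothetical names, a lower `priority` number is assumed to mean "evaluated first", and ALLOW is assumed as the default when nothing matches. The real pipeline short-circuits during evaluation instead of collecting all matches first.

```typescript
type Verdict = 'ALLOW' | 'BLOCK' | 'QUARANTINE' | 'FLAG';

// Hypothetical shape of a rule that matched the message under evaluation.
interface RuleMatch {
  ruleId: string;
  action: Verdict;
  priority: number;           // assumption: lower number = evaluated first
  blockReason: string | null; // only meaningful for BLOCK rules
}

interface PipelineResult {
  verdict: Verdict;
  ruleId: string | null;
  blockReason: string | null;
}

function resolveVerdict(matches: RuleMatch[]): PipelineResult {
  const byPriority = [...matches].sort((a, b) => a.priority - b.priority);
  const firstOf = (action: Verdict) => byPriority.find((m) => m.action === action);

  // Whitelist short-circuit: any matching ALLOW rule wins outright.
  const allow = firstOf('ALLOW');
  if (allow) return { verdict: 'ALLOW', ruleId: allow.ruleId, blockReason: null };

  // Otherwise the highest-priority BLOCK rule decides.
  const block = firstOf('BLOCK');
  if (block) return { verdict: 'BLOCK', ruleId: block.ruleId, blockReason: block.blockReason };

  // QUARANTINE applies only when no BLOCK rule matched.
  const quarantine = firstOf('QUARANTINE');
  if (quarantine) return { verdict: 'QUARANTINE', ruleId: quarantine.ruleId, blockReason: null };

  // FLAG applies only when nothing stronger matched.
  const flag = firstOf('FLAG');
  if (flag) return { verdict: 'FLAG', ruleId: flag.ruleId, blockReason: null };

  // Assumption: no matching rule means ALLOW.
  return { verdict: 'ALLOW', ruleId: null, blockReason: null };
}
```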

2.3 Hash-chain integrity

describe('AuditHashChain', () => {
  it('first row in partition uses zero prev_hash (genesis)', async () => { ... });
  it('subsequent row prev_hash equals previous row_hash', async () => { ... });
  it('detects tampering when row_hash != recomputed', async () => { ... });
  it('verifier emits chain.break.v1 on synthetic break injection', async () => { ... });
});
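The chain properties listed above can be made concrete with a short sketch. It is not the service's audit writer: the `AuditRow` shape and helper names are hypothetical, and SHA-256 over `prevHash + "\n" + payload` with an all-zero genesis value is an assumed hashing scheme chosen only to illustrate the technique.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical row shape; the real audit table has more columns.
interface AuditRow {
  payload: string;
  prevHash: string;
  rowHash: string;
}

// Assumption: the genesis row links to 64 zero hex digits.
const GENESIS_PREV_HASH = '0'.repeat(64);

// Assumed scheme: SHA-256 over the previous hash, a newline, and the payload.
function computeRowHash(prevHash: string, payload: string): string {
  return createHash('sha256').update(prevHash).update('\n').update(payload).digest('hex');
}

// Returns a new chain with one row appended and linked to the current tail.
function appendRow(chain: readonly AuditRow[], payload: string): AuditRow[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.rowHash : GENESIS_PREV_HASH;
  return [...chain, { payload, prevHash, rowHash: computeRowHash(prevHash, payload) }];
}

// Returns the index of the first row that breaks the chain, or -1 if intact.
function findChainBreak(chain: readonly AuditRow[]): number {
  let expectedPrev = GENESIS_PREV_HASH;
  let index = 0;
  for (const row of chain) {
    const linkOk = row.prevHash === expectedPrev;
    const hashOk = row.rowHash === computeRowHash(row.prevHash, row.payload);
    if (!linkOk || !hashOk) return index;
    expectedPrev = row.rowHash;
    index++;
  }
  return -1;
}
```

A verifier built this way reports where the chain first diverges, which is the information a break event needs to carry.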

2.4 Property-based (fast-check)

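// fc.assert returns a Promise for async properties. Each call below is awaited
// from an async Jest test body; without the await, a failing property may surface
// only as an unhandled rejection instead of a test failure.

// Verdict determinism: the same MoContext and rule-set version always produce the same verdict
await fc.assert(fc.asyncProperty(arbMoContext(), arbRuleSet(), async (ctx, rules) => {
  const v1 = await pipeline.evaluate(ctx, rules);
  const v2 = await pipeline.evaluate(ctx, rules);
  return v1.verdict === v2.verdict && v1.blockReason === v2.blockReason;
}));

// ALLOW rules always override BLOCK rules
await fc.assert(fc.asyncProperty(
  arbRuleSetWithAllowRule(),
  arbMatchingMoContext(),
  async (rules, ctx) => {
    const v = await pipeline.evaluate(ctx, rules);
    return v.verdict === 'ALLOW';
  }
));

// Hash chain stays valid for any sequence of rows. Rows are inserted one at a time
// because each row's prev_hash depends on the row written before it.
await fc.assert(fc.asyncProperty(arbAuditRows(), async (rows) => {
  const inserted = [];
  for (const row of rows) {
    inserted.push(await audit.insert(row));
  }
  return verifyChain(inserted);
}));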

3. Integration tests (Jest + testcontainers)

Spin up real Postgres, Redis, and NATS via testcontainers; federation tests additionally use mock-fraud-feed and mock-mno-blocklist.

3.1 gRPC handler integration

describe('FilterInbound gRPC — integration', () => {
  let grpcClient: SmsFirewallClient;

  beforeAll(async () => {
    db = await startTestPostgres();
    redis = await startTestRedis();
    nats = await startTestNats();
    await runMigrations(db);
    await seedDefaultRulesAndBlocklists(db);
    server = await startFirewallService({ db, redis, nats });
    grpcClient = createMtlsGrpcClient('localhost:50061', testCerts);
  });

  it('returns ALLOW for clean MO PDU against default rule set', async () => {
    const v = await grpcClient.filterInbound(buildValidMoRequest());
    expect(v.verdict).toBe('ALLOW');
  });

  it('returns BLOCK + ORIGIN_BLOCKLIST when srcMsisdn in blocklist', async () => {
    await seedBlocklistEntry(db, { type: 'MSISDN', value: '+93701234567' });
    await rebuildBloom(redis);
    const v = await grpcClient.filterInbound(buildMoRequest({ srcMsisdn: '+93701234567' }));
    expect(v.verdict).toBe('BLOCK');
    expect(v.blockReason).toBe('ORIGIN_BLOCKLIST');
  });

  it('inserts firewall.audit row with valid hash-chain', async () => { ... });
  it('writes outbox row + emits firewall.audit.v1 to NATS', async () => { ... });
  it('rejects caller with non-allowlisted SVID with PERMISSION_DENIED', async () => { ... });
  it('returns QUARANTINE + holdId; encrypts PDU under per-MNO KEK', async () => { ... });
});

3.2 Federation round-trip

describe('Federation — integration', () => {
  it('imports HSM-signed regulator.blocklist.published.v1 idempotently', async () => { ... });
  it('rejects invalid HSM signature; emits federation.signature.invalid.v1', async () => { ... });
  it('exports daily diff signed with PKCS#11 HSM key', async () => {
    await seedFederationCandidates(db, 100);
    await federationExportWorker.runOnce();
    const exportedFile = await minio.getObject('firewall-federation-out', '20260421.jsonl.sig');
    expect(verifyHsmSignature(exportedFile)).toBe(true);
  });
  it('emits firewall.federation.heartbeat.v1 even on zero-diff days', async () => { ... });
});

3.3 Quarantine lifecycle

describe('Quarantine lifecycle — integration', () => {
  it('NOC release re-injects PDU via firewall.quarantine.released.v1', async () => { ... });
  it('NOC reject is terminal; no re-injection', async () => { ... });
  it('auto-expiry marks status AUTO_EXPIRED at expires_at', async () => { ... });
  it('release requires dual approver; single approver returns 412', async () => { ... });
});
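The dual-approver rule in the last test can be sketched as a pure state transition. The names here are illustrative: `QuarantineHold`, `approveRelease`, and every status value except `AUTO_EXPIRED` are hypothetical, and the 409 response for a hold that is no longer pending is an assumption; only the 412 for a single approver comes from the test list above.

```typescript
// Only AUTO_EXPIRED appears in the test list; the other status names are assumed.
type QuarantineStatus = 'HELD' | 'RELEASED' | 'REJECTED' | 'AUTO_EXPIRED';

interface QuarantineHold {
  holdId: string;
  status: QuarantineStatus;
  approvers: string[]; // distinct NOC user ids that approved release so far
}

interface ReleaseOutcome {
  httpStatus: 200 | 409 | 412;
  hold: QuarantineHold;
}

function approveRelease(hold: QuarantineHold, approverId: string): ReleaseOutcome {
  // Assumption: terminal or already-released holds answer 409 Conflict.
  if (hold.status !== 'HELD') {
    return { httpStatus: 409, hold };
  }

  // The same approver approving twice still counts as one approval.
  const approvers = hold.approvers.includes(approverId)
    ? hold.approvers
    : [...hold.approvers, approverId];

  // Fewer than two distinct approvers: precondition not yet met.
  if (approvers.length < 2) {
    return { httpStatus: 412, hold: { ...hold, approvers } };
  }

  return { httpStatus: 200, hold: { ...hold, approvers, status: 'RELEASED' } };
}
```

Keeping the transition pure makes the concurrency test in section 6.4 easier to reason about, since the only shared state is the row being updated.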

3.4 NATS consumer integration

describe('NATS consumers — integration', () => {
  it('consent.dnd.snapshot.v1 → DND projection rebuilt within 60s', async () => { ... });
  it('fraud.detected.simbox.v1 → simbox_signals upsert + Redis cache populated', async () => { ... });
  it('regulator.blocklist.published.v1 with bad signature → no state mutation + alert emitted',
     async () => { ... });
});

4. Contract tests (Pact + JSON-Schema)

4.1 gRPC consumer contracts

Consumer	Provider	Pact file
`smpp-connector-awcc-rx`	`sms-firewall-service`	`pacts/connector-awcc-rx_filter-inbound.json`
`smpp-connector-transit-rx`	`sms-firewall-service`	`pacts/connector-transit-rx_evaluate-transit.json`
`routing-engine`	`sms-firewall-service`	`pacts/routing-engine_check-egress.json`
`cdr-mediation-service`	`sms-firewall-service`	`pacts/cdr_get-verdict.json`
`admin-dashboard`	`sms-firewall-service`	`pacts/admin-dashboard_admin-rest.json`

Pact verifications run in CI on every PR.

4.2 Event-schema conformance

JSON-Schema files for each firewall.*.v1 subject live in proto/firewall/v1/events/. CI verifies:

Every produced event validates against its registered schema (via Apicurio registry CI hook).
Backward compatibility check: any field added to the firewall.audit.v1 schema must be optional (it must not appear in the schema's `required` list).
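The backward-compatibility rule can be illustrated with a small check over two schema versions. This is a sketch, not the Apicurio hook: `ObjectSchema` is a deliberately reduced, hypothetical view of a JSON-Schema object, and the check also treats promoting an existing optional field to required as a violation, which is an assumption beyond the rule as stated.

```typescript
// Reduced, hypothetical view of a JSON-Schema object definition.
interface ObjectSchema {
  properties: Record<string, unknown>;
  required?: string[];
}

// Lists fields that are required in the next version but were not required before.
// New fields must be optional, so any entry returned here is a violation.
function findNewlyRequiredFields(prev: ObjectSchema, next: ObjectSchema): string[] {
  const previouslyRequired = new Set(prev.required ?? []);
  return (next.required ?? []).filter((name) => !previouslyRequired.has(name));
}
```

For example, adding an optional `region` property passes, while adding `region` to `required` in the same change is reported.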

5. E2E tests (Playwright + ephemeral cluster)

Spin up a kind/k3s cluster with: firewall + smpp-connector mock + Postgres + Redis + NATS + mock fraud-intel + mock regulator + mock sender-id-registry.

Flow	User story	Assertion
Inbound MO ALLOW pipeline	FW-US-001	Mock MNO sends `deliver_sm`; smpp-connector calls firewall; verdict = ALLOW; PDU forwarded to channel-router-service
Inbound MO BLOCK on blocklist	FW-US-001, FW-US-003	Same as above with blocklisted srcMsisdn; smpp-connector silently drops; alert event emitted
Transit MT BLOCK on grey-route	FW-US-008	Mock peer sends submit_sm targeting AWCC subscriber from non-peer ASN; verdict = BLOCK; submit_sm_resp = ESME_RSUBMITFAIL
Quarantine release end-to-end	FW-US-011	Suspicious PDU → QUARANTINE; NOC reviews via REST + dual-approver release; PDU re-injected; second eval skips firewall (skipFirewall flag)
Federation import → BLOCK applied	FW-US-012	Mock regulator publishes signed blocklist event; firewall imports; subsequent matching MO is BLOCKED
Daily federation export	FW-US-013	Trigger cron; verify file uploaded to MinIO with valid HSM signature
Auto-PANIC + auto-recover	FW-US-019	Inject artificial 200ms latency in classifier; firewall auto-trips to PANIC within 60s; remove latency; auto-recovers within 5min
Hash-chain integrity	FW-US-018	Insert 10 000 audit rows; run AuditVerifierWorker; verify chain unbroken; tamper one row directly via SQL → verifier reports break

6. Performance / load tests

6.1 Hot-path throughput (`ghz`)

ghz --proto proto/firewall/v1/firewall.proto \
    --call ghasi.sms.firewall.v1.SmsFirewallService.FilterInbound \
    --data-file ./test/load/sample_filter_inbound.json \
    --concurrency 100 \
    --rps 1000 \
    --duration 10m \
    --cert ./test/certs/connector.pem --key ./test/certs/connector-key.pem \
    --cacert ./test/certs/ca.pem \
    localhost:50061

Pass criteria:

P50 ≤ 10 ms, P95 ≤ 30 ms, P99 ≤ 50 ms
Error rate < 0.01% (success-rate target 99.99%)
No memory leak over 10-min run (RSS stable within 10%)
No GC pause > 50 ms

Runs in CI on every PR touching src/domain/**, src/application/**, or src/infrastructure/grpc/**.
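The latency and error-rate criteria can be checked from raw samples with a sketch like the one below. It assumes the nearest-rank percentile definition, which may differ slightly from how `ghz` computes its summary, and the `LoadRun` shape and function names are hypothetical. The RSS and GC-pause criteria need process metrics and are not covered here.

```typescript
// Hypothetical summary of one load run.
interface LoadRun {
  latenciesMs: number[]; // one entry per completed request
  totalRequests: number;
  failedRequests: number;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0 || p <= 0 || p > 100) {
    throw new Error('percentile needs samples and 0 < p <= 100');
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error('rank out of range');
  return value;
}

// Returns the violated criteria; an empty list means the run passes.
function checkHotPathCriteria(run: LoadRun): string[] {
  const failures: string[] = [];
  const limits: Array<[number, number]> = [[50, 10], [95, 30], [99, 50]];
  for (const [p, limitMs] of limits) {
    const observed = percentile(run.latenciesMs, p);
    if (observed > limitMs) failures.push(`P${p} ${observed} ms > ${limitMs} ms`);
  }
  const errorRatePct = (run.failedRequests * 100) / run.totalRequests;
  if (errorRatePct >= 0.01) failures.push(`error rate ${errorRatePct}% >= 0.01%`);
  return failures;
}
```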

6.2 Transit MT load

EvaluateTransit at 200 RPS for 5 min; same pass criteria with P95 ≤ 50 ms.

6.3 Federation import volume

Inject 100 000 entries via a mock regulator event; assert that the import completes in < 30 s, the Bloom rebuild completes in < 60 s, and the counts in the emitted event match the imported entries.

6.4 Quarantine review concurrency

Run 50 concurrent NOC reviewers issuing release/reject requests; assert that there is no row-lock starvation and that dual-approval correctness is preserved.

7. Chaos / failure-injection tests

Run nightly via Litmus / Chaos Mesh on staging.

Chaos	Expected behaviour
Postgres primary down (failover to replica)	Service continues on Redis cache for 60 s; new evaluations after cache miss → INTERNAL → connectors fail-closed (MO WAL, transit RSUBMITFAIL)
Redis cluster down	Bloom + rate governor unavailable; fall through to PG; latency P95 → 60 ms; flags emitted; no data loss
NATS down	Outbox queues locally; resumes on reconnect; alert at backlog > 10 000
Local LLM down	CLASSIFIER rules skip; verdicts proceed; flag `CLASSIFIER_UNAVAILABLE`
Vault Transit down	New QUARANTINE → upgrade to BLOCK with `KEK_UNAVAILABLE` flag
HSM down	Federation export postponed; emit `firewall.federation.export.postponed.v1`
Network partition kbl↔mzr	Both regions continue locally; control-plane writes deferred; replication catches up on heal
Bloom filter rebuild fails mid-run	Previous Bloom retained; alert; manual re-trigger
Catastrophic regex injected via admin REST	ReDoS screen at admission rejects; if it slips through, per-call timeout fires + auto-disable

8. Security tests

Rule-expression sandbox escape: feed 100 known CEL injection payloads; verify all rejected with RULE_UNSAFE_EXPRESSION.
ReDoS pattern admission: feed 50 known ReDoS regexes; verify all rejected.
mTLS bypass: attempt connection without client cert → UNAUTHENTICATED; with mismatched SPIFFE ID → PERMISSION_DENIED.
JWT bypass: invalid signature, expired, wrong audience → all 401 at Kong.
Role escalation: noc cannot call POST /v1/admin/firewall/rules (403); regulator-auditor cannot access /v1/admin/firewall/quarantine/{id} body (403/redacted).
Cross-region data exfiltration: verify mzr replica is read-only; attempted writes → permission denied; verify dxb leaf cannot decrypt audit archive without Afghan-held key.
PII leak scanner: static-analyse all logger.* call sites for pduBody, raw srcMsisdn/dstMsisdn parameters (ESLint rule no-pdu-body-in-logs).
Audit log tamper-evidence: insert tampered row → AuditVerifierWorker reports chain.break within 1h.
Federation signature tampering: corrupt 1 byte of regulator event → service rejects + alert.
Hash-chain replay protection: verify replaying a 6-month-old audit row into current partition fails (prev_hash mismatch).
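As an illustration of the ReDoS admission check, the sketch below combines the 500-character limit from the unit tests with a simple nested-quantifier heuristic. It is not the service's screen: the function name and reason codes are hypothetical, and the heuristic is deliberately crude (it can miss dangerous patterns and can flag harmless ones such as a group containing an escaped `\+`). The fixture file of known ReDoS patterns remains the real acceptance test.

```typescript
const MAX_PATTERN_LENGTH = 500;

// Heuristic only: a group that contains + or * and is itself quantified,
// for example (a+)+ or (\d*)*. Nested groups are not analysed.
const NESTED_QUANTIFIER = /\([^()]*[+*][^()]*\)[+*{]/;

// Hypothetical reason codes for this sketch.
type AdmissionResult =
  | { admitted: true }
  | { admitted: false; reason: 'PATTERN_TOO_LONG' | 'NESTED_QUANTIFIER' };

function screenRegexAtAdmission(pattern: string): AdmissionResult {
  if (pattern.length > MAX_PATTERN_LENGTH) {
    return { admitted: false, reason: 'PATTERN_TOO_LONG' };
  }
  if (NESTED_QUANTIFIER.test(pattern)) {
    return { admitted: false, reason: 'NESTED_QUANTIFIER' };
  }
  return { admitted: true };
}
```

A static screen like this is only the first line of defence; the per-call timeout and auto-disable behaviour described in the chaos table cover patterns that slip through.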

9. Test data

test/fixtures/mo-contexts.json: 100 representative MoContext samples (ALLOW, FLAG, BLOCK, QUARANTINE)
test/fixtures/transit-mt-contexts.json: 100 representative TransitMtContext samples
test/fixtures/blocklist-entries.jsonl: 10 000-entry blocklist for federation testing
test/fixtures/regulator-event.signed.json: HSM-signed regulator event using a test PKCS#11 token
test/fixtures/redos-patterns.txt: 50 known ReDoS patterns for security tests
test/fixtures/cel-injection-payloads.txt: 100 known sandbox-escape attempts

10. CI pipeline

# .github/workflows/firewall.yml
jobs:
  unit:        runs Jest unit tests + coverage gate (>= 90% line, >= 85% branch overall, >= 95% branch on rule-evaluator)
  lint:        eslint + prettier + tsc --noEmit
  contract:    Pact provider verification against published consumer contracts
  integration: testcontainers-backed; PG + Redis + NATS up; full handler coverage
  load:        ghz against deployed test instance; assert P95 ≤ 30 ms
  security:    gitleaks + trivy + osv-scanner + redos-screen + cel-injection-screen
  chaos:       (nightly only) Litmus on staging cluster
  e2e:         (nightly only) Playwright on ephemeral kind cluster

A PR cannot merge without unit + lint + contract + integration + load + security all green.
