SMS Firewall Service — Jira-Ready Epics & User Stories
- Status: populated
- Owner: Trust & Safety
- Last updated: 2026-04-20
- Service prefix: FW
- Scope: National perimeter firewall (inbound MO + transit MT, AIT detection, SIM-box exclusion, grey-route exclusion, DND enforcement, regulator + cross-MNO federation, admin REST + audit log)
- Source of truth: `_sources/sms-firewall-service/user_stories.md`
Epic Summary
| Epic ID | Title | Stories | Points |
|---|---|---|---|
| EP-FW-01 | Inbound MO Firewall (origin, content, rate, geo) | US-FW-001 – US-FW-006 | 36 |
| EP-FW-02 | Transit MT Firewall (peer hygiene, grey-route exclusion) | US-FW-007 – US-FW-011 | 31 |
| EP-FW-03 | National Blocklist Federation (regulator + cross-MNO) | US-FW-012 – US-FW-015 | 18 |
| EP-FW-04 | Firewall Admin REST + Audit Log | US-FW-016 – US-FW-019 | 21 |
| Total | | 19 stories | 106 |
EP-FW-01 · Inbound MO Firewall
Context: Per ADR-0004 §3, every inbound `deliver_sm` PDU from any MNO bind must be evaluated by the firewall before any downstream service (`consent-ledger`, `fraud-intel`, `routing-engine`) sees it. Verdict latency budget is 30 ms P95.
US-FW-001 · Synchronous Inbound MO Verdict gRPC
Type: Feature | Points: 8
Description:
As smpp-connector-{mno}-rx, I need to call sms-firewall-service.FilterInbound(MoContext) over mTLS gRPC and receive a verdict (ALLOW | FLAG | BLOCK | QUARANTINE) before responding to the MNO with deliver_sm_resp.
Acceptance Criteria:
- `FilterInbound` returns within P95 ≤ 30 ms on port `50061`
- `Verdict` enum: `ALLOW | FLAG | BLOCK | QUARANTINE`; `BLOCK` includes `blockReason` (`ORIGIN_BLOCKLIST | CONTENT_FORBIDDEN | RATE_EXCEEDED | GEO_FORBIDDEN | DND_PRESENT | AIT_SIGNATURE | SIMBOX_SIGNATURE | REGULATOR_BLOCK`) and `ruleHit.ruleId`
- `QUARANTINE` returns `holdId` (UUID v4) and inserts `firewall.quarantine_queue` row with `expires_at = now() + 24h`
- Caller SVID validated against `spiffe://ghasi/np-data/smpp-connector-*` allow-list; non-matching → `PERMISSION_DENIED`
- `MAINTENANCE` mode returns `ALLOW + flags=["MAINTENANCE_MODE"]` and audit-logs the bypass
- Integration test: 1000 calls at 200 RPS → P95 ≤ 30 ms with full rule pipeline active
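
The verdict shape and the caller allow-list check described above could be sketched as follows. This is a minimal illustration, not the service's actual types: `VerdictResult`, `caller_allowed`, `quarantine`, and `maintenance_bypass` are hypothetical names, and the glob match is only a stand-in for real SPIFFE ID validation.

```python
import fnmatch
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import List, Optional


class Verdict(Enum):
    ALLOW = "ALLOW"
    FLAG = "FLAG"
    BLOCK = "BLOCK"
    QUARANTINE = "QUARANTINE"


# Allow-list pattern taken from the acceptance criteria.
SVID_ALLOW_LIST = ["spiffe://ghasi/np-data/smpp-connector-*"]


@dataclass
class VerdictResult:  # hypothetical container, not the real protobuf message
    verdict: Verdict
    block_reason: Optional[str] = None
    rule_id: Optional[str] = None
    hold_id: Optional[str] = None
    expires_at: Optional[datetime] = None
    flags: List[str] = field(default_factory=list)


def caller_allowed(svid: str) -> bool:
    # Glob match; note that "*" also matches "/" here, so a production check
    # would likely be stricter about path segments.
    return any(fnmatch.fnmatchcase(svid, pattern) for pattern in SVID_ALLOW_LIST)


def quarantine(now: datetime) -> VerdictResult:
    # holdId is a UUID v4 and the hold expires 24 h after the verdict.
    return VerdictResult(
        Verdict.QUARANTINE,
        hold_id=str(uuid.uuid4()),
        expires_at=now + timedelta(hours=24),
    )


def maintenance_bypass() -> VerdictResult:
    # MAINTENANCE mode allows traffic but marks the bypass for the audit log.
    return VerdictResult(Verdict.ALLOW, flags=["MAINTENANCE_MODE"])
```

The sketch leaves out the `firewall.quarantine_queue` insert and the audit-log write, which belong to the persistence layer.
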
US-FW-002 · Per-Source Sliding-Window Rate Governor
Type: Feature | Points: 5
Description:
As the national perimeter, I need a sliding-window rate limit per srcMsisdn, per source aggregator, and per dstMsisdn so that a single compromised source cannot flood the platform.
Acceptance Criteria:
- Default thresholds: 10/1s, 100/1m, 500/1h per `srcMsisdn`; configurable per scope in `firewall.rate_overrides`
- Implementation: Redis sorted set `fw:rate:src-msisdn:{e164}:{window}` with `ZADD` + `ZREMRANGEBYSCORE` for true sliding behaviour
- Threshold breach → `BLOCK` with `blockReason = RATE_EXCEEDED`
- Redis unavailable → verdict `ALLOW + flag=RATE_GOVERNOR_DEGRADED`; metric `firewall_rate_governor_skip_total`
- Tenant-allowlisted short-codes use elevated thresholds from override table
- Unit test: 1000 events at 100 ms intervals; threshold 10/1s → first 10 ALLOW, remainder BLOCK
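
The sliding-window behaviour can be illustrated with an in-memory analogue of the sorted-set approach. This sketch is an assumption-laden stand-in: a sorted Python list replaces the Redis sorted set, `SlidingWindowGovernor` is a hypothetical name, timestamps are assumed non-decreasing per key, and only admitted events are recorded (the criteria do not say whether blocked events should count).

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

# (window_seconds, max_events) pairs from the default thresholds above.
DEFAULT_LIMITS: List[Tuple[float, int]] = [(1.0, 10), (60.0, 100), (3600.0, 500)]


class SlidingWindowGovernor:
    def __init__(self, limits: List[Tuple[float, int]] = DEFAULT_LIMITS):
        self.limits = limits
        # Sorted timestamps per key; plays the role of the Redis sorted set.
        self.events: Dict[str, List[float]] = defaultdict(list)

    def check(self, key: str, now: float) -> str:
        ts = self.events[key]
        longest = max(window for window, _ in self.limits)
        # Analogue of ZREMRANGEBYSCORE: drop entries older than the longest window.
        del ts[: bisect.bisect_right(ts, now - longest)]
        for window, max_events in self.limits:
            # Count admitted events in the interval (now - window, now].
            in_window = len(ts) - bisect.bisect_right(ts, now - window)
            if in_window >= max_events:
                return "RATE_EXCEEDED"
        # Analogue of ZADD: record the admitted event.
        ts.append(now)
        return "ALLOW"
```

A burst of 15 events at the same instant admits the first 10 and rejects the rest; once the 1 s window has slid past the burst, the same source is admitted again.
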
US-FW-003 · National Blocklist Bloom-Filter Lookup
Type: Feature | Points: 5
Description:
As the firewall evaluator, I need a Redis Bloom filter (BF.EXISTS fw:blocklist:national) for srcMsisdn and senderId lookups, with a Postgres fallthrough for definitive confirmation.
Acceptance Criteria:
- Bloom filter capacity 10M, target false-positive rate 0.01
- `BF.EXISTS = 0` → not blocked, no Postgres read
- `BF.EXISTS = 1` → `SELECT 1 FROM firewall.blocklist_entries WHERE entry = $1 AND active = TRUE` for definitive verdict
- BF read latency P99 ≤ 0.5 ms
- Hot reload: `firewall.blocklist.changed.v1` event → all replicas refresh within 5 s
- Redis unavailable → direct Postgres query + `firewall_bloom_unavailable_total` increment
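
The two-step lookup (probabilistic pre-check, then definitive read) can be sketched in isolation. The code below is illustrative only: `TinyBloom` is a toy filter standing in for the Redis Bloom module, a Python set stands in for `firewall.blocklist_entries`, and `BlocklistLookup` is a hypothetical name.

```python
import hashlib
from typing import Iterator, Set


class TinyBloom:
    """Toy Bloom filter used only to illustrate the lookup flow."""

    def __init__(self, size_bits: int = 1 << 16, hashes: int = 7):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting the item with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


class BlocklistLookup:
    def __init__(self, bloom: TinyBloom, definitive: Set[str]):
        self.bloom = bloom
        self.definitive = definitive  # stand-in for the Postgres table
        self.definitive_reads = 0

    def is_blocked(self, entry: str) -> bool:
        if not self.bloom.might_contain(entry):
            return False  # filter says "absent": no definitive read needed
        self.definitive_reads += 1  # filter says "maybe": confirm definitively
        return entry in self.definitive
```

An entry that is still in the filter but no longer active in the definitive store (for example after a soft delete) is correctly reported as not blocked, at the cost of one definitive read.
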
US-FW-004 · Content-Class Rule Evaluation (CEL Expressions)
Type: Feature | Points: 8
Description:
As a Trust & Safety lead, I need to author content-class rules using sandboxed CEL-style expressions referencing typed inputs (pdu.body, pdu.coding, src.msisdn, mno.id, peer.asn, consent.dndPresent).
Acceptance Criteria:
- CEL grammar with allowed functions: `matches(re)`, `contains(s)`, `startsWith(s)`, `endsWith(s)`, `len()`, comparison + boolean operators
- Typed inputs validated at admission; `pdu.foo` → HTTP 400 `RULE_INVALID_INPUT_REF`
- Unsafe expressions (`os.system`, file IO, network) → HTTP 422 `RULE_UNSAFE_EXPRESSION`
- Per-rule eval timeout 50 ms; exceeded → auto-disable and `firewall.rule.degraded.v1` with `reason = REGEX_TIMEOUT`
- Rule publish → `firewall.rule.changed.v1` → all replicas hot-reload within 5 s
- Match fires `BLOCK | QUARANTINE | FLAG` per `action`; `ruleHit.ruleId` populated
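
A lexical pre-check at rule admission might look like the sketch below. It is an assumption, not the real validator: a production sandbox would parse the expression with a CEL implementation instead of pattern matching, `admit_rule` and `UNSAFE_ROOTS` are hypothetical, and the string-literal stripping does not handle escaped quotes.

```python
import re
from typing import Optional, Tuple

# Typed inputs listed in the story description.
ALLOWED_INPUTS = {
    "pdu.body", "pdu.coding", "src.msisdn", "mno.id", "peer.asn", "consent.dndPresent",
}
# Hypothetical deny-list of identifier roots that suggest host access.
UNSAFE_ROOTS = {"os", "sys", "subprocess", "socket"}

_STRING_LITERAL = re.compile(r'"[^"]*"|' + r"'[^']*'")
_DOTTED = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")


def admit_rule(expr: str) -> Optional[Tuple[int, str]]:
    """Return None if the expression passes, else (http_status, error_code)."""
    # Blank out string literals so their contents are not mistaken for references.
    stripped = _STRING_LITERAL.sub('""', expr)
    for token in _DOTTED.findall(stripped):
        parts = token.split(".")
        if parts[0] in UNSAFE_ROOTS:
            return 422, "RULE_UNSAFE_EXPRESSION"
        # Only the first two segments name the input; later ones are method calls.
        if ".".join(parts[:2]) not in ALLOWED_INPUTS:
            return 400, "RULE_INVALID_INPUT_REF"
    return None
```

For example, `admit_rule('pdu.foo == "x"')` yields the 400 error from the criteria, while `admit_rule('os.system("reboot")')` yields the 422 error.
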
US-FW-005 · Geo / Origin Mismatch Detection
Type: Feature | Points: 5
Description:
As the national perimeter, I need to reject inbound MO PDUs whose srcMsisdn country code does not match the permitted MCC/MNC for the originating MNO bind.
Acceptance Criteria:
- `firewall.mno_bind_registry` row with `permittedCountryCodes` JSONB array consulted per call
- Mismatch (`+1...` over `permittedCountryCodes = ['+93']`) → `BLOCK` with `blockReason = GEO_FORBIDDEN`
- `number-intelligence-service.Lookup` cross-check used when available; `UNAVAILABLE` → fall back to MCC/MNC table + `firewall_numint_unavailable_total`
- Per-bind override via `PATCH /v1/admin/firewall/mno-binds/{id}` audit-logged
- Integration test: spoofed +1 source over AWCC bind → BLOCK + correct audit row
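
The mismatch check and its degraded fallback can be sketched as below. Several parts are assumptions for illustration: `geo_verdict` is a hypothetical function, the `lookup` callable stands in for `number-intelligence-service.Lookup` and is assumed to return a country code, prefix matching stands in for the MCC/MNC table, and the `NUMINT_UNAVAILABLE` flag is invented.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def geo_verdict(
    src_msisdn: str,
    permitted_country_codes: Iterable[str],
    lookup: Optional[Callable[[str], str]] = None,
) -> Tuple[str, Optional[str], List[str]]:
    """Return (verdict, blockReason, flags) for one inbound MO source."""
    permitted = list(permitted_country_codes)
    flags: List[str] = []
    country_code: Optional[str] = None
    if lookup is not None:
        try:
            country_code = lookup(src_msisdn)
        except ConnectionError:
            # Cross-check unavailable: note it and fall back to the local check.
            flags.append("NUMINT_UNAVAILABLE")
    if country_code is not None:
        ok = country_code in permitted
    else:
        # Fallback: E.164 prefix match against the bind's permitted codes.
        ok = any(src_msisdn.startswith(cc) for cc in permitted)
    if ok:
        return ("ALLOW", None, flags)
    return ("BLOCK", "GEO_FORBIDDEN", flags)
```

With `permitted_country_codes = ["+93"]`, a `+1...` source is blocked with `GEO_FORBIDDEN`, matching the mismatch example in the criteria.
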
US-FW-006 · DND Snapshot Materialisation from Consent Ledger
Type: Feature | Points: 5
Description:
As the firewall data-plane, I need to consume the hourly consent.dnd.snapshot.v1 event from consent-ledger-service and materialise the national DND list into Redis Bloom + Postgres firewall.dnd_snapshot.
Acceptance Criteria:
- Event payload includes `snapshotUrl` (MinIO presigned, 24 h validity), `entryCount`, `snapshotSha256`
- Snapshot rebuild completes within 60 s of event ACK; failure → previous snapshot remains active + staleness metric
- DND check: `BF.EXISTS fw:dnd:bloom` then Postgres `firewall.dnd_snapshot` for definitive
- Snapshot age > 6 h → `firewall.alert.dnd.snapshot.stale.v1`
- Bloom auto-resize when `entryCount > current_capacity × 0.8`
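
The integrity, resize, and staleness checks above reduce to three small predicates. The function names are hypothetical, and downloading the snapshot from `snapshotUrl` is out of scope for this sketch.

```python
import hashlib


def snapshot_is_valid(payload: bytes, expected_sha256: str) -> bool:
    # Compare the downloaded snapshot against the digest carried in the event.
    return hashlib.sha256(payload).hexdigest() == expected_sha256.lower()


def needs_resize(entry_count: int, current_capacity: int) -> bool:
    # entryCount > current_capacity × 0.8, in integer arithmetic to avoid
    # floating-point rounding at the boundary.
    return entry_count * 10 > current_capacity * 8


def is_stale(age_seconds: float) -> bool:
    # A snapshot older than 6 h should raise the staleness alert.
    return age_seconds > 6 * 3600
```

A snapshot that fails `snapshot_is_valid` would leave the previous snapshot active, as the second criterion requires.
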
EP-FW-02 · Transit MT Firewall
Context: Inbound `submit_sm` from peer aggregators must be evaluated for ASN hygiene, sender-ID spoofing, and grey-route signature before reaching `routing-engine`.
US-FW-007 · EvaluateTransit gRPC for Peer-Aggregator MT
Type: Feature | Points: 8
Description:
As smpp-connector-transit-rx, I need to call sms-firewall-service.EvaluateTransit(TransitMtContext) for every inbound submit_sm from a peer aggregator.
Acceptance Criteria:
- gRPC `EvaluateTransit` returns within P95 ≤ 50 ms
- Inputs: `peerAsn`, `peerSystemId`, `srcAddr`, `dstMsisdn`, `senderId`, `pduBody`, `pduTon`, `pduNpi`
- Unknown `peerAsn` → `BLOCK` with `blockReason = PEER_ASN_UNKNOWN`
- Sender-ID not in peer's allowlist → `BLOCK` with `blockReason = SENDER_ID_SPOOFED`
- BLOCK verdict → connector returns `submit_sm_resp` `command_status = ESME_RSUBMITFAIL`
- Firewall unavailable → connector fail-closes (`ESME_RSUBMITFAIL` + `firewall.transit.unavailable.v1`)
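
The connector-side fail-closed behaviour could be sketched as follows. The `handle_submit_sm` function and its `evaluate` callable (standing in for the gRPC stub) are hypothetical. How the connector answers `FLAG` or `QUARANTINE` is not stated in the criteria, so returning `ESME_ROK` for them is an assumption.

```python
from typing import Callable, List

ESME_ROK = 0x00000000
ESME_RSUBMITFAIL = 0x00000045  # SMPP v3.4 command_status: submit_sm failed


def handle_submit_sm(evaluate: Callable[[dict], str], ctx: dict, events: List[str]) -> int:
    """Map a firewall verdict to the command_status of submit_sm_resp."""
    try:
        verdict = evaluate(ctx)
    except (ConnectionError, TimeoutError):
        # Fail closed: the MT is rejected when no verdict can be obtained.
        events.append("firewall.transit.unavailable.v1")
        return ESME_RSUBMITFAIL
    if verdict == "BLOCK":
        return ESME_RSUBMITFAIL
    # Assumption: every non-BLOCK verdict is acknowledged to the peer.
    return ESME_ROK
```

Note the contrast with the inbound MO rate governor, which degrades to `ALLOW` when Redis is unavailable; the transit path rejects instead.
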
US-FW-008 · Grey-Route Signature Detection (HLR + ASN Heuristic)
Type: Feature | Points: 8
Description:
As the national perimeter, I need to detect grey-route arbitrage by cross-checking dstMsisdn HLR/MNP resolution against the peer's ASN.
Acceptance Criteria:
- `dstMsisdn` resolves via `number-intelligence-service.Lookup` to `homeMnoId`; mismatch with peer's registered `peer_mno_routes` → `BLOCK` with `blockReason = GREY_ROUTE`
- Heuristic: peer with > 30% MT to non-peered MNO in last 1000 submissions → `firewall.alert.greyroute.heuristic.v1` (warning only)
- Consume `fraud.detected.greyroute.v1`: implicated peer added to `firewall.peer_quarantine`; subsequent MT auto-`QUARANTINE`
- Heuristic disable via admin override → `firewall.audit.policy.changed.v1` with operator identity
- Integration test: peer ASN 64500 attempts MT to AWCC subscriber → BLOCK with `blockReason = GREY_ROUTE`
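
The rolling 30% heuristic can be sketched with a fixed-size window per peer. `GreyRouteHeuristic` is a hypothetical name, the MNO identifiers in the usage note are placeholders, and waiting for a full window before alerting is an assumption the criteria do not spell out.

```python
from collections import deque
from typing import Deque, Set


class GreyRouteHeuristic:
    """Tracks one peer's most recent MT submissions."""

    def __init__(self, window: int = 1000, threshold_pct: int = 30):
        self.window = window
        self.threshold_pct = threshold_pct
        # True means the MT resolved to an MNO the peer is not routed to.
        self.samples: Deque[bool] = deque(maxlen=window)

    def observe(self, home_mno_id: str, peered_mnos: Set[str]) -> bool:
        """Record one submission; return True when the warning should fire."""
        self.samples.append(home_mno_id not in peered_mnos)
        if len(self.samples) < self.window:
            return False  # assumption: no alert until the window is full
        # Integer comparison for "strictly more than threshold_pct percent".
        return sum(self.samples) * 100 > self.threshold_pct * len(self.samples)
```

With 700 submissions to a peered MNO followed by 300 to a non-peered one, the share is exactly 30% and no warning fires; one more non-peered submission pushes it to 30.1% and `observe` returns `True`.
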
US-FW-009 · Peer-Aggregator Hygiene Score
Type: Feature | Points: 5
Description: As a Trust & Safety analyst, I need a rolling 24 h hygiene score per peer aggregator combining sender-ID conformance, grey-route hits, and content violations.
Acceptance Criteria:
- 5-minute rollup writes `firewall.peer_hygiene_scores` (`peer_id`, `score 0-100`, `windowStart`, `windowEnd`, `sampleCount`)
- `score < 60` for 3 consecutive windows → `firewall.alert.peer.degraded.v1`
- `score < 30` → auto-`QUARANTINE` peer; `firewall.peer_quarantine.entered.v1`
- Auto-quarantine release requires dual-approval `POST /v1/admin/firewall/peers/{id}/release`
- Score formula documented and reproducible from raw audit log
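
The two score thresholds can be expressed as a small decision function over the most recent windows. `hygiene_actions` is a hypothetical helper; the score formula itself is not given in this document, so the sketch takes scores as input.

```python
from typing import List


def hygiene_actions(scores: List[int]) -> List[str]:
    """scores: per-window hygiene scores (0-100), oldest first, newest last."""
    actions: List[str] = []
    if scores and scores[-1] < 30:
        # Latest window below 30: the peer is quarantined automatically.
        actions.append("firewall.peer_quarantine.entered.v1")
    if len(scores) >= 3 and all(s < 60 for s in scores[-3:]):
        # Three consecutive windows below 60: the degraded alert is raised.
        actions.append("firewall.alert.peer.degraded.v1")
    return actions
```

Both events can fire for the same window, for instance when the last three scores are 50, 40, and 20.
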
US-FW-010 · Sender-ID Origin Verification on Transit MT
Type: Feature | Points: 5
Description:
As the national perimeter, I need to verify each transit MT senderId against sender-id-registry-service for ownership and status.
Acceptance Criteria:
- gRPC `sender-id-registry-service.Verify(senderId, peerId)` called per transit MT
- Ownership mismatch → `BLOCK` with `blockReason = SENDER_ID_SPOOFED`
- Suspended sender-ID → `BLOCK` with `blockReason = SENDER_ID_SUSPENDED`
- Unknown sender-ID (`status = UNKNOWN`) → `QUARANTINE` (NOC review)
- Registry unavailable → fall back to local hourly cache `firewall.peer_senderid_allowlist`
US-FW-011 · Quarantine Queue with NOC Manual Release
Type: Feature | Points: 5
Description: As an NOC operator, I need to view the quarantine queue, inspect quarantined PDUs, and either release (re-evaluate) or permanently reject.
Acceptance Criteria:
- `GET /v1/admin/firewall/quarantine?status=PENDING&page=1&pageSize=50` → paginated list
- `GET /v1/admin/firewall/quarantine/{holdId}` → full `MoContext` or `TransitMtContext` (PDU body redacted in logs)
- `POST /v1/admin/firewall/quarantine/{holdId}/release` → re-evaluate; if ALLOW, re-inject via `firewall.quarantine.released.v1`
- `POST /v1/admin/firewall/quarantine/{holdId}/reject {reason}` → `firewall.quarantine.rejected.v1`; 7-year cold retention
- Auto-expiry at `expires_at` → `firewall.quarantine.expired.v1`
EP-FW-03 · National Blocklist Federation
Context: Per ADR-0004, the platform must consume regulator-issued blocklist updates and publish a daily signed diff back to peer Afghan MNOs.
US-FW-012 · National Blocklist Federation Import
Type: Feature | Points: 5
Description:
As the national perimeter, I need to consume regulator.blocklist.published.v1 events from regulator-portal-service and import the regulator's mandated additions/removals.
Acceptance Criteria:
- Event payload: `entries: [{ type: 'MSISDN'|'SENDER_ID'|'KEYWORD', value, action: 'ADD'|'REMOVE', issuedBy, issuedAt, regulatorRef }]`
- Upsert into `firewall.blocklist_entries` with `source = 'REGULATOR'`, `regulator_ref`
- `action = REMOVE` → `active = FALSE` (soft delete; audit retained)
- Idempotent on `(source, regulator_ref, type, value)` unique constraint
- HSM signature validation; invalid → `firewall.alert.federation.signature.invalid.v1` (PagerDuty)
- Post-import: rebuild Bloom + emit `firewall.blocklist.federated.v1` with counts
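
The idempotent upsert keyed on `(source, regulator_ref, type, value)` can be illustrated with a dictionary standing in for `firewall.blocklist_entries`. This is a sketch under assumptions: `apply_regulator_entries` and the count names are hypothetical, signature validation is omitted, and a `REMOVE` for an unknown key is stored as an inactive row.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, str, str]  # (source, regulator_ref, type, value)


def apply_regulator_entries(table: Dict[Key, dict], entries: List[dict]) -> Dict[str, int]:
    """Apply one event's entries; replaying the same event changes nothing."""
    counts = {"added": 0, "removed": 0, "unchanged": 0}
    for e in entries:
        key: Key = ("REGULATOR", e["regulatorRef"], e["type"], e["value"])
        want_active = e["action"] == "ADD"
        row = table.get(key)
        if row is not None and row["active"] == want_active:
            counts["unchanged"] += 1  # duplicate delivery: no state change
            continue
        # REMOVE is a soft delete: the row stays, only `active` flips.
        table[key] = {
            "active": want_active,
            "issuedBy": e["issuedBy"],
            "issuedAt": e["issuedAt"],
        }
        counts["added" if want_active else "removed"] += 1
    return counts
```

The returned counts are the kind of summary the post-import `firewall.blocklist.federated.v1` event could carry.
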
US-FW-013 · Cross-MNO Blocklist Federation Export
Type: Feature | Points: 5
Description:
As carrier relations, I need a daily HSM-signed diff of firewall.blocklist_entries (with share_with_peers = TRUE) published to peer Afghan MNOs.
Acceptance Criteria:
- Cron 02:00 Asia/Kabul; output `firewall-federation-out/{yyyymmdd}.jsonl.sig` (JSON Lines, HSM PKCS#11 signature)
- `firewall.federation.exported.v1` published with SHA-256, signature, presigned URL (24 h)
- Mirror to regulator-mediated SFTP within 5 minutes of upload
- Heartbeat `firewall.federation.heartbeat.v1` even on zero-diff days
- Integration test: signed file verifies against published HSM public key
US-FW-014 · Federated Entry Reputation and Confidence Scoring
Type: Feature | Points: 5
Description: As a Trust & Safety lead, I need per-entry source attribution and a confidence score so that single-source entries enter probation while multi-source entries auto-apply.
Acceptance Criteria:
- `firewall.blocklist_entries.sources` JSONB: `[{ sourceId, sourceType, reportedAt }]`
- `confidence_score = clamp(regulator_count*1.0 + peer_count*0.5 + internal_count*0.7, 0, 1)`
- `score >= 0.8` → `auto_apply = TRUE`
- `score < 0.8` → 24 h probation: matches → `QUARANTINE` not `BLOCK`
- `score < 0.4` → auto-deactivate + `firewall.blocklist.entry.deactivated.v1`
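
The scoring formula and its thresholds translate directly into code. In this sketch the action labels are hypothetical, and giving the `< 0.4` deactivation rule precedence over the overlapping `< 0.8` probation rule is an assumption.

```python
def confidence_score(regulator_count: int, peer_count: int, internal_count: int) -> float:
    # Weighted sum of source counts, clamped to [0, 1] as in the criteria.
    raw = regulator_count * 1.0 + peer_count * 0.5 + internal_count * 0.7
    return max(0.0, min(1.0, raw))


def entry_action(score: float) -> str:
    if score >= 0.8:
        return "AUTO_APPLY"
    if score < 0.4:
        return "DEACTIVATE"
    return "PROBATION"  # matches yield QUARANTINE rather than BLOCK
```

A single peer report scores 0.5 and a single internal report 0.7, so both enter probation, while any regulator report or two peer reports reach 1.0 and auto-apply. With these weights, an entry falls below 0.4 only when it has no remaining sources.
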
US-FW-015 · Per-Entry Audit Trail for Blocklist Changes
Type: Feature | Points: 3
Description: As a regulator auditor, I need the full chain of additions, removals, source attributions, and operator decisions for any blocklist entry.
Acceptance Criteria:
- `GET /v1/admin/firewall/blocklist/{entryId}/history` → chronological events: `created`, `source_added`, `source_removed`, `confidence_changed`, `manually_overridden`, `deactivated`, `reactivated`
- Each event: `actor` (operator ID / `SYSTEM` / regulator ref), `timestamp` (UTC µs), `reason`
- `GET /v1/internal/firewall/blocklist/export?since={iso8601}` → JSON Lines + HSM signature (regulator-only mTLS)
- `firewall.blocklist_audit` append-only enforced by Postgres trigger blocking `UPDATE`/`DELETE`
EP-FW-04 · Firewall Admin REST + Audit Log
US-FW-016 · Admin REST: Rule CRUD with Versioning
Type: Feature | Points: 5
Description: As a Trust & Safety admin, I need to create, list, fetch, update, deactivate, and version firewall rules via authenticated REST.
Acceptance Criteria:
- `POST /v1/admin/firewall/rules` (role `tns-admin`) → 201 with `ruleId`, `version: 1`
- `PUT /v1/admin/firewall/rules/{ruleId}` → new immutable version row; `version` bumped
- `GET /v1/admin/firewall/rules?scope=MO&enabled=true` → paginated
- `enabled = false` → excluded from runtime
- Non-admin token → 403
- Pact contract test against `admin-dashboard` consumer
US-FW-017 · Admin REST: MNO Bind Registry
Type: Feature | Points: 3
Description:
As carrier relations, I need to register mnoBindId ↔ mnoId ↔ permitted-country-codes ↔ permitted-sender-IDs mappings.
Acceptance Criteria:
- `POST /v1/admin/firewall/mno-binds` with `{ mnoBindId, mnoId, direction, permittedCountryCodes[], permittedSenderIds[], notes }` → 201
- `GET /v1/admin/firewall/mno-binds` → list (no secrets)
- `DELETE /v1/admin/firewall/mno-binds/{id}` → soft delete + `firewall.mno_bind.deactivated.v1`
- `POST /v1/internal/firewall/mno-binds/{id}/heartbeat` from connector pods; missing > 60 s → `firewall.alert.bind.missing.v1`
US-FW-018 · Append-Only Audit Log to NATS + Postgres + Cold Archive
Type: Feature | Points: 8
Description:
As the regulator-grade evidence pipeline, I need every verdict to produce an append-only firewall.audit.v1 event mirrored to Postgres and to MinIO WORM cold archive.
Acceptance Criteria:
- NATS JetStream stream `FIREWALL_AUDIT` receives every verdict event with `traceId`, `verdict`, `verdictAt`, `evaluatedRuleIds[]`, `srcMsisdn`, `dstMsisdn`, `mnoBindId`, `peerAsn?`, `holdId?`
- Postgres `firewall.audit_log` partitioned `PARTITION BY RANGE (verdict_at)` monthly
- Daily archive job 03:00 Asia/Kabul exports yesterday's partition to MinIO `firewall-audit-archive/{yyyymmdd}.parquet.zst.sig` (HSM-signed, Object Lock Compliance, 7-year retention)
- `FIREWALL_AUDIT` mirrored to `mzr` and to `dxb` leaf
- Postgres trigger blocks `UPDATE`/`DELETE` on `firewall.audit_log`
US-FW-019 · Operating-Mode Switch (NORMAL ↔ DEGRADED ↔ PANIC ↔ MAINTENANCE)
Type: Feature | Points: 5
Description: As NOC + Trust & Safety lead, I need to switch operating mode via dual-approval REST with full audit and auto-trip on latency breach.
Acceptance Criteria:
- `POST /v1/admin/firewall/mode { targetMode, reason, secondApproverToken }` requires two distinct admin tokens within 60 s
- Single approver → 412 `DUAL_APPROVAL_REQUIRED`
- `targetMode = PANIC` → disables `type IN ('REGEX','CLASSIFIER')` rules at runtime; `firewall_mode_panic_active = 1`
- Auto-trip: `firewall_rule_eval_seconds{quantile="0.95"} > 100ms` for 60 s → auto-PANIC + `firewall.alert.mode.auto_panic.v1` (PagerDuty)
- Auto-recovery: < 30 ms P95 sustained 5 m → auto-restore to NORMAL + event
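
The auto-trip and auto-recovery rules form a small state machine driven by P95 samples. The sketch below covers only the automatic `NORMAL` ↔ `PANIC` transitions. `ModeController` is a hypothetical name, samples are assumed to arrive with non-decreasing timestamps in seconds, and dual-approval switching and the other modes are left out.

```python
from typing import Optional


class ModeController:
    TRIP_MS = 100.0        # P95 above this for TRIP_HOLD_S trips PANIC
    TRIP_HOLD_S = 60.0
    RECOVER_MS = 30.0      # P95 below this for RECOVER_HOLD_S restores NORMAL
    RECOVER_HOLD_S = 300.0

    def __init__(self) -> None:
        self.mode = "NORMAL"
        self._breach_since: Optional[float] = None
        self._healthy_since: Optional[float] = None

    def observe(self, now: float, p95_ms: float) -> str:
        """Feed one P95 sample; return the mode after applying it."""
        if self.mode == "NORMAL":
            if p95_ms > self.TRIP_MS:
                if self._breach_since is None:
                    self._breach_since = now
                if now - self._breach_since >= self.TRIP_HOLD_S:
                    self.mode = "PANIC"
                    self._healthy_since = None
            else:
                self._breach_since = None  # breach must be continuous
        elif self.mode == "PANIC":
            if p95_ms < self.RECOVER_MS:
                if self._healthy_since is None:
                    self._healthy_since = now
                if now - self._healthy_since >= self.RECOVER_HOLD_S:
                    self.mode = "NORMAL"
                    self._breach_since = None
            else:
                self._healthy_since = None  # recovery must be sustained
        return self.mode
```

Any sample outside the threshold resets the corresponding timer, so a brief dip or spike neither trips `PANIC` early nor restores `NORMAL` early.
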