Number Intelligence Service — Jira-Ready Epics & User Stories

Status: populated | Owner: Messaging Core | Last updated: 2026-04-20 | Service prefix: NI

Scope: HLR/HSS lookup with layered cache, MNP registry and daily MNO reconciliation, EIR/CEIR cross-check, public tenant-callable Lookup API.

Derived from docs/07-epics-and-user-stories.md §6.2 and ADR-0004 §3.


Epic Summary

| Epic ID | Title | Stories | Points |
|---|---|---|---|
| EP-NI-01 | HLR/HSS Lookup with Cache (MSISDN → MNO, line-type, country) | US-NI-001 – US-NI-006 | 32 |
| EP-NI-02 | MNP (Mobile Number Portability) Registry & Daily Reconciliation with MNOs | US-NI-007 – US-NI-012 | 25 |
| EP-NI-03 | EIR/CEIR Cross-Check (IMEI, stolen-device exclusion) | US-NI-013 – US-NI-016 | 16 |
| EP-NI-04 | Public Lookup API (tenant-callable, billable, cached) | US-NI-017 – US-NI-021 | 19 |
| Total | | 21 stories | 92 |

EP-NI-01 · HLR/HSS Lookup with Cache

Context: Authoritative resolution of MSISDN → (MNO, line_type, country, is_ported, VLR) via a four-tier cascade (in-process LRU → Redis → Postgres → live MAP/SS7 or REST HLR). Consumed in the hot path by routing-engine, sms-firewall-service, channel-router-service, and fraud-intel-service.

US-NI-001 · Resolve MSISDN via cache cascade

Type: Feature | Points: 8

Description: As routing-engine, I need ResolveMsisdn(e164, opts) to return typed attribution in P95 ≤ 15 ms so I can pick the correct per-MNO SMPP connector without a paid SS7 lookup on every call.

Acceptance Criteria:

  • LRU hit returns in P95 ≤ 1 ms with source = "lru"
  • Redis hit returns in P95 ≤ 4 ms with source = "redis", populates LRU
  • Postgres hit returns in P95 ≤ 15 ms with source = "postgres", populates Redis + LRU
  • Full miss triggers live HLR through ni-hlr-gateway; result written to all tiers; source = "live_hlr"
  • Live-HLR failure/timeout returns persisted answer with confidence = "low" rather than erroring
  • Malformed E.164 (regex ^\+[1-9]\d{6,14}$) returns gRPC INVALID_ARGUMENT
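
The cascade order and tier back-filling described above can be sketched as follows. This is a minimal illustration, not the service's implementation: the class `CascadeResolver`, the `Attribution` type, and the use of plain dicts in place of the LRU, Redis, and Postgres tiers are all assumptions made for the example, and staleness handling and the low-confidence fallback are left out.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict

# Regex taken from the acceptance criteria.
E164_RE = re.compile(r"^\+[1-9]\d{6,14}$")


@dataclass
class Attribution:
    mno: str
    source: str  # "lru" | "redis" | "postgres" | "live_hlr"


class CascadeResolver:
    """Toy four-tier cascade; each dict stands in for a real store (assumption)."""

    def __init__(self, live_hlr: Callable[[str], str]):
        self.lru: Dict[str, str] = {}
        self.redis: Dict[str, str] = {}
        self.postgres: Dict[str, str] = {}
        self.live_hlr = live_hlr  # hypothetical stand-in for ni-hlr-gateway

    def resolve(self, e164: str) -> Attribution:
        if not E164_RE.fullmatch(e164):
            # The real RPC would return gRPC INVALID_ARGUMENT.
            raise ValueError("INVALID_ARGUMENT: malformed E.164")
        if e164 in self.lru:
            return Attribution(self.lru[e164], "lru")
        if e164 in self.redis:
            self.lru[e164] = self.redis[e164]  # Redis hit populates LRU
            return Attribution(self.redis[e164], "redis")
        if e164 in self.postgres:
            mno = self.postgres[e164]
            self.redis[e164] = mno  # Postgres hit populates Redis + LRU
            self.lru[e164] = mno
            return Attribution(mno, "postgres")
        mno = self.live_hlr(e164)  # full miss: live lookup, then write all tiers
        self.postgres[e164] = mno
        self.redis[e164] = mno
        self.lru[e164] = mno
        return Attribution(mno, "live_hlr")
```

A caller sees `source` move from `live_hlr` to `lru` on the second call for the same MSISDN, which is the behaviour the latency targets rely on.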

US-NI-002 · Batch MSISDN resolution

Type: Feature | Points: 5

Description: As sms-orchestrator's bulk-submit pipeline, I need ResolveBatch to resolve up to 1000 MSISDNs per call so I avoid 1000 round trips on bulk campaigns.

Acceptance Criteria:

  • 500-entry warm-cache batch returns in P95 ≤ 80 ms via streaming gRPC response
  • > 1000 entries → RESOURCE_EXHAUSTED
  • Per-entry timeouts do not fail the whole batch
  • Duplicate MSISDNs are deduplicated; each unique MSISDN hits the cascade exactly once
  • Live-HLR fan-out bounded by per-MNO TPS governor
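
A rough sketch of the batch semantics listed above (size cap, deduplication, per-entry timeout isolation). The function name `resolve_batch`, the `resolve_one` callback, and the `"TIMEOUT"` placeholder result are hypothetical; the real RPC streams typed gRPC responses rather than returning a list.

```python
from typing import Callable, Dict, List

MAX_BATCH = 1000  # limit from the acceptance criteria


def resolve_batch(msisdns: List[str], resolve_one: Callable[[str], str]) -> List[str]:
    """Resolve each unique MSISDN once and return results in input order."""
    if len(msisdns) > MAX_BATCH:
        # The real RPC would return gRPC RESOURCE_EXHAUSTED.
        raise OverflowError("RESOURCE_EXHAUSTED: more than 1000 entries")
    results: Dict[str, str] = {}
    for msisdn in msisdns:
        if msisdn in results:
            continue  # duplicate: already sent through the cascade
        try:
            results[msisdn] = resolve_one(msisdn)
        except TimeoutError:
            # A per-entry timeout is recorded for that entry only.
            results[msisdn] = "TIMEOUT"
    return [results[m] for m in msisdns]
```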

US-NI-003 · Per-MNO TPS governor for live HLR

Type: Feature | Points: 5

Description: As platform SRE, I need Redis-backed token buckets per MNO so outbound MAP SendRoutingInfoForSM never exceeds contracted SS7 quota.

Acceptance Criteria:

  • Redis bucket ni:tps:hlr:{mno} with capacity and refill from operator-management-service
  • Denied calls either wait up to tpsWaitMs or return stale-throttled answer
  • Bucket config refreshed on operator.config.changed.v1
  • /metrics exposes ni_hlr_tps_admitted_total and ni_hlr_tps_denied_total per MNO
  • Alert NIHLRThrottling when deny rate > 5 % over 5 min
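
To illustrate the admit/deny decision behind the governor, here is an in-memory token bucket with an injectable clock. It is only a sketch of the algorithm: the story calls for a Redis-backed bucket, and the class name `TokenBucket` and its parameters are assumptions for the example.

```python
class TokenBucket:
    """In-memory token bucket; the real bucket would live in Redis (assumption)."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full
        self.last = now

    def try_acquire(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # would increment ni_hlr_tps_admitted_total
        return False      # would increment ni_hlr_tps_denied_total
```

A denied call would then either wait up to `tpsWaitMs` and retry, or fall back to the stale-throttled answer.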

US-NI-004 · Sovereign LRU & Redis cache classes

Type: Feature | Points: 3

Description: As the cache implementer, I need TTLs that vary by data class so stable attributes survive days while volatile attributes refresh in minutes.

Acceptance Criteria:

  • Redis namespace ni:hlr:{e164} with hash fields and per-class EXPIREAT
  • Class TTLs: LINE_TYPE 30 d, MNO 24 h, MNP 24 h, VLR/IMSI 5 min, EIR 24 h
  • LRU default 100 000 entries per pod, exact-LRU eviction
  • Per-region Redis only — no cross-region replication
  • /metrics cache-tier hit counters
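
The per-class TTLs above translate into expiry timestamps along these lines. The dictionary keys (in particular `VLR_IMSI`) and the helper names are illustrative assumptions; only the durations come from the acceptance criteria.

```python
from datetime import datetime, timedelta, timezone

# Durations from the acceptance criteria; key names are assumptions.
CLASS_TTL = {
    "LINE_TYPE": timedelta(days=30),
    "MNO": timedelta(hours=24),
    "MNP": timedelta(hours=24),
    "VLR_IMSI": timedelta(minutes=5),
    "EIR": timedelta(hours=24),
}


def expire_at(data_class: str, written_at: datetime) -> int:
    """Unix timestamp at which a value of this class should expire."""
    return int((written_at + CLASS_TTL[data_class]).timestamp())


def is_fresh(data_class: str, written_at: datetime, now: datetime) -> bool:
    """True while the value is still inside its class TTL."""
    return now < written_at + CLASS_TTL[data_class]
```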

US-NI-005 · HLR/HSS gateway with MAP and REST adapters

Type: Feature | Points: 8

Description: As the ni-hlr-gateway operator, I need both a SIGTRAN MAP adapter and an HTTPS REST adapter per MNO so each operator can be reached on its preferred protocol.

Acceptance Criteria:

  • SIGTRAN M3UA/SCTP association per MNO; MAP context shortMsgGatewayContext-v3
  • REST adapter: POST /v1/hlr/lookup with Bearer JWT; returns { imsi, vlr, line_type, mno_id }
  • Internal RPC LiveLookup(e164, mno_hint) with SPIFFE spiffe://ghasi.platform/ns/ni/sa/hlr-gateway
  • MAP timeout 1500 ms, REST timeout 800 ms → DEADLINE_EXCEEDED on breach
  • 0.1 % MAP pcap sampled to MinIO encrypted with ni-pcap-kek

US-NI-006 · Authoritative attribution write-through

Type: Feature | Points: 3

Description: As Postgres writer, I need every live HLR success UPSERTed into ni.msisdn_attribution so later callers benefit.

Acceptance Criteria:

  • INSERT … ON CONFLICT (e164) DO UPDATE updates mno/line_type/vlr/imsi_prefix/last_seen
  • Write is non-blocking; caller never waits
  • Write failure → retry via ni.attribution_outbox with exp back-off, max 6 attempts
  • ni.attribution.changed.v1 emitted only when mno or is_ported actually changed
  • Monthly partitioning on last_seen; detach partitions > 24 months to ni_archive
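
Two of the rules above lend themselves to a short sketch: the retry schedule for the outbox and the condition for emitting `ni.attribution.changed.v1`. The base delay of 1 s and the doubling factor are assumptions (the story only fixes exponential back-off and a maximum of 6 attempts), as are the function names.

```python
from typing import Dict, List


def backoff_delays(base_s: float = 1.0, factor: float = 2.0, max_attempts: int = 6) -> List[float]:
    """Delay before each retry attempt; base and factor are assumed values."""
    return [base_s * factor ** i for i in range(max_attempts)]


def should_emit_changed(old: Dict[str, object], new: Dict[str, object]) -> bool:
    """Emit only when mno or is_ported differs; other fields (vlr, last_seen) are ignored."""
    return old["mno"] != new["mno"] or old["is_ported"] != new["is_ported"]
```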

EP-NI-02 · MNP Registry & Daily Reconciliation with MNOs

Context: Daily authoritative refresh of Mobile Number Portability state from each MNO's published MNP delta file. Governs the ni.mnp_registry table and the LookupPorting fast-path.

US-NI-007 · Daily MNP file ingest

Type: Feature | Points: 8

Description: As reconciliation operator, I need a nightly CronJob that pulls and loads each MNO's MNP delta file.

Acceptance Criteria:

  • CronJob mnp-recon at 02:30 Asia/Kabul in kbl region
  • SFTP pull of {mno-sftp}/mnp/YYYY-MM-DD.csv per configured MNO
  • Files landed to MinIO ni-mnp-raw/{mno}/{yyyy}/{mm}/{dd}.csv with sha256 tag
  • INSERT … ON CONFLICT (msisdn) DO UPDATE when port_date > existing.port_date
  • Per-MNO ni.mnp.reconciled.v1 event with counts
  • Retries hourly until 23:00 same day; escalates to P1 afterwards
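
The conditional upsert in the criteria above (update only when the incoming `port_date` is newer) can be tried out locally. This sketch uses SQLite from the Python standard library as a stand-in for Postgres; the table layout and the `current_mno` column name are assumptions, and dates are ISO-8601 strings so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mnp_registry (msisdn TEXT PRIMARY KEY, current_mno TEXT, port_date TEXT)"
)

# Overwrite an existing row only if the incoming port_date is strictly newer.
UPSERT = """
INSERT INTO mnp_registry (msisdn, current_mno, port_date)
VALUES (?, ?, ?)
ON CONFLICT (msisdn) DO UPDATE
SET current_mno = excluded.current_mno,
    port_date = excluded.port_date
WHERE excluded.port_date > mnp_registry.port_date
"""


def apply_row(msisdn: str, mno: str, port_date: str) -> None:
    """Apply one row of an MNP delta file."""
    conn.execute(UPSERT, (msisdn, mno, port_date))


def current(msisdn: str):
    """Return (current_mno, port_date) for an MSISDN, or None if absent."""
    return conn.execute(
        "SELECT current_mno, port_date FROM mnp_registry WHERE msisdn = ?", (msisdn,)
    ).fetchone()
```

Replaying an older file after a newer one leaves the newer row in place, which keeps the nightly job safe to re-run.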

US-NI-008 · On-demand MNP refresh

Type: Feature | Points: 3

Description: As platform operator, I need an admin endpoint to trigger MNP ingest outside the nightly window for emergency corrections.

Acceptance Criteria:

  • POST /v1/admin/mnp/recon requires admin JWT with platform-admin
  • Publishes ni.recon.mnp.requested.v1, returns 202 with jobId
  • GET /v1/admin/mnp/recon/{jobId} returns { status, accepted, rejected, errors }
  • All admin operations emit audit.admin.action.v1

US-NI-009 · LookupPorting fast-path RPC

Type: Feature | Points: 3

Description: As routing-engine, I need LookupPorting(e164) to return { isPorted, currentMno, originalMno, portDate, source }.

Acceptance Criteria:

  • Cascade Redis → Postgres → default-unported in P95 ≤ 8 ms
  • MNP entry takes precedence over HLR observation
  • source ∈ { mnp_registry, hlr_observation, default_unported }
  • Divergence between MNP and HLR emits ni.attribution.divergence.v1
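
The precedence rule above (MNP registry first, then HLR observation, then the unported default) can be sketched as a small pure function. The name `pick_porting_source` and its return shape are hypothetical; the real RPC returns the full `{ isPorted, currentMno, originalMno, portDate, source }` structure.

```python
from typing import Optional, Tuple


def pick_porting_source(
    mnp_mno: Optional[str], hlr_mno: Optional[str]
) -> Tuple[str, Optional[str], bool]:
    """Return (source, current_mno, divergent) for one MSISDN."""
    if mnp_mno is not None:
        # The MNP entry wins; flag divergence if HLR observed a different MNO.
        divergent = hlr_mno is not None and hlr_mno != mnp_mno
        return "mnp_registry", mnp_mno, divergent
    if hlr_mno is not None:
        return "hlr_observation", hlr_mno, False
    return "default_unported", None, False
```

When `divergent` is true, the caller would publish `ni.attribution.divergence.v1`.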

US-NI-010 · MNP discrepancy reconciliation

Type: Feature | Points: 5

Description: As trust & safety analyst, I need to see MSISDNs where live HLR and MNP file disagree so I can investigate possible SIM-swap fraud.

Acceptance Criteria:

  • Nightly discrepancy job writes to ni.mnp_discrepancies
  • Severity HIGH when port_date < now() - '7 days'
  • Daily summary email to trust-safety@ghasi.local
  • ni.attribution.divergence.v1 consumed by fraud-intel-service

US-NI-011 · MNP registry hash chain integrity

Type: Feature | Points: 3

Description: As compliance auditor, I need cryptographic evidence that the MNP registry has not been mutated outside the ingest job.

Acceptance Criteria:

  • ni.mnp_recon_log rows include prev_chain_hash, this_chain_hash
  • Daily verification job recomputes chain end-to-end
  • Alert MNPChainBroken on mismatch
  • GET /v1/admin/mnp/chain/verify returns on-demand verification
  • Application role has no UPDATE/DELETE on ni.mnp_registry
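
One way to read the hash-chain requirement above: each log row stores the previous row's hash and a hash over that value plus its own payload, so any later mutation breaks every subsequent link. The sketch below assumes SHA-256, an all-zero genesis value, and a string `payload` field; none of these are stated in the story.

```python
import hashlib
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for the first row


def chain_hash(prev_hash: str, payload: str) -> str:
    """Hash of the previous link concatenated with this row's payload."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_row(log: List[Dict[str, str]], payload: str) -> None:
    prev = log[-1]["this_chain_hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev_chain_hash": prev,
        "this_chain_hash": chain_hash(prev, payload),
    })


def verify_chain(log: List[Dict[str, str]]) -> bool:
    """Recompute the chain end-to-end; False on the first mismatch."""
    prev = GENESIS
    for row in log:
        if row["prev_chain_hash"] != prev:
            return False
        if row["this_chain_hash"] != chain_hash(prev, row["payload"]):
            return False
        prev = row["this_chain_hash"]
    return True
```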

US-NI-012 · MNP webhook to subscribed services

Type: Feature | Points: 3

Description: As routing-engine and sms-firewall-service, I need ni.attribution.changed.v1 so I can proactively warm or invalidate caches.

Acceptance Criteria:

  • Event payload { e164Hash, oldMno, newMno, isPorted, portDate, eventTime, source }
  • JetStream subject with interest retention, MaxAge 7 d
  • Durable consumer names re-ni-warmer and fw-ni-warmer; P95 handle ≤ 5 s
  • Schema registered under ni.attribution.changed.v1

EP-NI-03 · EIR/CEIR Cross-Check

Context: Daily refresh of blacklisted / greylisted IMEIs from ATRA and per-MNO CEIR feeds; opportunistic observation of MSISDN↔IMEI links through MAP responses; enrichment only (no blocking).

US-NI-013 · Daily EIR/CEIR file ingest

Type: Feature | Points: 5

Description: As reconciliation operator, I need daily ATRA + per-MNO CEIR files pulled and loaded into ni.eir_status.

Acceptance Criteria:

  • CronJob eir-recon at 03:30 Asia/Kabul
  • SFTP pull from ATRA and per-MNO CEIR endpoints
  • Luhn-validated IMEI, status enum BLACKLIST | GREYLIST | WHITELIST
  • Most-restrictive status wins across feeds
  • ni.eir.flagged.v1 emitted on transition into BLACKLIST
  • Raw files retained in MinIO ni-eir-raw for 24 months
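
The validation and merge rules above can be sketched as follows: a Luhn check over a 15-digit IMEI, a most-restrictive-wins merge across feeds, and the transition test that would trigger `ni.eir.flagged.v1`. Function names and the numeric ranking are assumptions for the example.

```python
from typing import Iterable, Optional

# Higher number = more restrictive (assumed ranking of the status enum).
RESTRICTIVENESS = {"WHITELIST": 0, "GREYLIST": 1, "BLACKLIST": 2}


def luhn_valid_imei(imei: str) -> bool:
    """True for a 15-digit IMEI whose Luhn checksum is valid."""
    if len(imei) != 15 or not (imei.isascii() and imei.isdigit()):
        return False
    total = 0
    for i, ch in enumerate(reversed(imei)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def merge_status(statuses: Iterable[str]) -> str:
    """Most-restrictive status wins across feeds."""
    return max(statuses, key=lambda s: RESTRICTIVENESS[s])


def entered_blacklist(old: Optional[str], new: str) -> bool:
    """True only on a transition into BLACKLIST."""
    return new == "BLACKLIST" and old != "BLACKLIST"
```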

US-NI-014 · LookupEir(imei) RPC

Type: Feature | Points: 3

Description: As sms-firewall-service and fraud-intel-service, I need LookupEir(imei) for device-stolen enrichment.

Acceptance Criteria:

  • Cascade Redis → Postgres in P95 ≤ 8 ms
  • Response { status, reasonCode, reportedBy[], lastUpdated, source }
  • Unknown IMEI → status = "UNKNOWN", not an error
  • Invalid Luhn → INVALID_ARGUMENT

US-NI-015 · MSISDN↔IMEI observation link

Type: Feature | Points: 5

Description: As trust & safety analyst, I need an observational link between MSISDN and IMEI when MAP responses include IMEI.

Acceptance Criteria:

  • ni.msisdn_imei_observed upsert with (msisdn, imei, last_seen, observation_count)
  • No inference when MNO withholds IMEI
  • LookupMsisdnImei(msisdn) returns most-recent observation in P95 ≤ 8 ms
  • Annotated confidence = "observed", never sole basis for block

US-NI-016 · EIR-driven MT block recommendation

Type: Feature | Points: 3

Description: As sms-firewall-service, I need a recommendation when recipient MSISDN was observed on a blacklisted IMEI.

Acceptance Criteria:

  • ResolveMsisdn response carries optional eir_observation: { imei, status, observed_at }
  • sms-firewall-service rule pipeline consumes the field
  • audit.lookup.v1 records the presence of the observation
  • NI never auto-blocks — only enriches

EP-NI-04 · Public Lookup API

Context: Tenant-callable, billable Lookup API exposed through Kong. Returns attribution + staleness with per-tenant quotas and per-call metering.

US-NI-017 · Public Lookup API single endpoint

Type: Feature | Points: 5

Description: As tenant developer, I need GET /v1/lookup/{msisdn} so I can validate input and pre-route my own traffic.

Acceptance Criteria:

  • Requires Bearer JWT and X-Tenant-Id
  • Response { msisdn, country, mno, lineType, isPorted, originalMno, fetchedAt, stalenessSeconds, confidence }
  • ?maxStaleness=300 triggers fresh-tier billing
  • Invalid E.164 → 400 INVALID_MSISDN; quota exceeded → 429 QUOTA_EXCEEDED
  • P95 ≤ 200 ms; P99 ≤ 500 ms

US-NI-018 · Public Lookup batch endpoint

Type: Feature | Points: 3

Description: As tenant developer, I need POST /v1/lookup/batch for up to 100 MSISDNs in one call.

Acceptance Criteria:

  • Max 100 entries, 413 on overflow
  • Results preserve input order
  • Each success metered as one billable lookup
  • Warm-cache P95 ≤ 800 ms for 100-entry batch

US-NI-019 · Per-call billing metering event

Type: Feature | Points: 3

Description: As billing-service, I need one billing.metering.recorded.v1 per chargeable lookup.

Acceptance Criteria:

  • Payload { tenantId, sku, quantity, occurredAt, requestId, msisdnHash }
  • Nats-Msg-Id = requestId for idempotent consumption
  • SKU lookup.fresh.v1 when live HLR was triggered; lookup.v1 otherwise
  • Publish failure sets X-Metering-Status: degraded; outbox retries
  • Internal callers are NOT metered
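
A sketch of how the metering rules above might combine: SKU selection, the `Nats-Msg-Id` header for idempotent consumption, and the exemption for internal callers. The function name `build_metering`, its parameters, and the fixed `quantity` of 1 per lookup are assumptions; the payload field names come from the criteria.

```python
from typing import Dict, Optional, Tuple


def build_metering(
    tenant_id: str,
    request_id: str,
    msisdn_hash: str,
    occurred_at: str,
    live_hlr_triggered: bool,
    internal_caller: bool,
) -> Optional[Tuple[Dict[str, str], Dict[str, object]]]:
    """Return (headers, payload) for billing.metering.recorded.v1, or None if not billable."""
    if internal_caller:
        return None  # internal callers are not metered
    sku = "lookup.fresh.v1" if live_hlr_triggered else "lookup.v1"
    headers = {"Nats-Msg-Id": request_id}  # lets JetStream drop duplicate publishes
    payload = {
        "tenantId": tenant_id,
        "sku": sku,
        "quantity": 1,  # assumed: one unit per chargeable lookup
        "occurredAt": occurred_at,
        "requestId": request_id,
        "msisdnHash": msisdn_hash,
    }
    return headers, payload
```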

US-NI-020 · Per-tenant Lookup quota and rate limiting

Type: Feature | Points: 5

Description: As platform operator, I need per-tenant RPS and per-month quotas.

Acceptance Criteria:

  • Redis token bucket ni:tps:lookup:{tenantId} default 10/s
  • Month counter ni:quota:lookup:{tenantId}:{yyyymm} resets 1st 00:00 Asia/Kabul
  • 429 body { code: "QUOTA_EXCEEDED", scope, retryAfter }
  • audit.lookup.quota_exceeded.v1 emitted on 429
  • billing.tenant.plan.changed.v1 propagates new bucket within 60 s
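
Because the month counter resets at 00:00 Asia/Kabul rather than UTC, the key has to be derived from local time. The sketch below uses a fixed UTC+04:30 offset as a stand-in for the `Asia/Kabul` zone (Afghanistan observes no daylight saving time); the helper name `quota_key` is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for Asia/Kabul (UTC+04:30, no DST).
KABUL = timezone(timedelta(hours=4, minutes=30))


def quota_key(tenant_id: str, now_utc: datetime) -> str:
    """Monthly quota counter key, with the month taken in Kabul local time."""
    local = now_utc.astimezone(KABUL)
    return f"ni:quota:lookup:{tenant_id}:{local:%Y%m}"
```

A call at 19:45 UTC on 30 April already lands in the May counter, since it is 00:15 on 1 May in Kabul.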

US-NI-021 · Public Lookup audit log

Type: Feature | Points: 3

Description: As compliance officer, I need an immutable audit trail of every Public Lookup call.

Acceptance Criteria:

  • audit.lookup.v1 event per call with { tenantId, actorSub, msisdnHash, resultClass, resultMno, stalenessSeconds, ipAddress, userAgent, requestId, occurredAt }
  • Persisted by platform audit pipeline with WORM storage
  • MSISDN never published in cleartext to audit channel; only sha256(msisdn || tenantSalt)
  • Schema registered; breaking change → v2
  • Regulator query "all lookups by tenant T against +93... in last 90 d" runnable in ≤ 5 min
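
The `sha256(msisdn || tenantSalt)` rule above might look like this in practice. UTF-8 encoding and hex output are assumptions, as is the helper name; the story only fixes the hash function and the concatenation order.

```python
import hashlib


def msisdn_hash(msisdn: str, tenant_salt: str) -> str:
    """Hex SHA-256 over the MSISDN followed by the per-tenant salt."""
    return hashlib.sha256((msisdn + tenant_salt).encode("utf-8")).hexdigest()
```

Because the salt is per tenant, the same MSISDN hashes differently for different tenants, so a regulator query has to recompute the hash with tenant T's salt before filtering the audit log.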