
Number Intelligence Service (number-intelligence-service) — Service Overview

Status: populated | Version: 1.0 | Owner: Messaging Core | Last updated: 2026-04-20 | Companion: DOMAIN_MODEL · API_CONTRACTS · EVENT_SCHEMAS · DATA_MODEL


1. Purpose

The Number Intelligence Service is the platform's authoritative source of truth for MSISDN attribution on the Ghasi-SMS-Gateway national backbone. For any Afghan or foreign MSISDN presented to the platform (originating tenant request, MO inbound, CDR enrichment, fraud lookup, regulator query) it answers — within strict latency budgets and with sovereign data residency — three primary questions:

  1. Which MNO owns this number today? Resolved through a layered cascade: in-process LRU cache → Redis (ni:hlr:{e164}) → Postgres (ni.msisdn_attribution) → live HLR/HSS query via the per-MNO MAP/SS7 gateway or REST adapter.
  2. Has this number been ported (MNP) and to whom? Resolved against the platform's MNP registry (ni.mnp_registry) which is reconciled against MNO MNP files daily and on demand.
  3. Is the device tied to this MSISDN flagged in EIR/CEIR? Resolved against the EIR/CEIR cross-check store (ni.eir_status) populated by daily MNO and ATRA stolen-device feeds.

The service exposes three caller surfaces with very different SLOs:

| Caller surface | Protocol | Caller | Latency SLO | Cache class |
|---|---|---|---|---|
| Internal lookup (routing fast-path) | gRPC (mTLS+SPIFFE) | routing-engine, sms-firewall-service, channel-router-service, fraud-intel-service | P95 ≤ 15 ms | LRU + Redis hot |
| Public Lookup API (tenant-billable) | HTTPS REST through Kong | Tenant client (JWT + API key) | P95 ≤ 200 ms | Redis cached, billed per chargeable lookup |
| Bulk reconciliation jobs | NATS JetStream (ni.recon.*) | scheduler, cdr-mediation-service, regulator-portal-service | Bulk; P99 ≤ 4 h end-to-end | Postgres direct |

Number intelligence is read-heavy (peak ≥ 25 k QPS aggregate from routing-engine alone during OTP storms). Write traffic comes only from reconciliation jobs and operator-management pushes, never from tenant traffic.


2. Bounded Context

| Dimension | Value |
|---|---|
| Domain | Telecom Numbering & Subscriber Attribution |
| Owner squad | Messaging Core |
| Deployment unit | Kubernetes Deployment number-intelligence-service (control plane) + DaemonSet ni-hlr-gateway (data plane, per region) |
| Communication style | Inbound: gRPC (mTLS), HTTPS REST · Outbound: SS7/MAP via per-MNO HLR gateway, REST/SFTP to MNO MNP feeds, NATS JetStream |
| Storage | PostgreSQL schema ni (Patroni HA) · Redis (hot cache, per-region) · MinIO (raw MNP / EIR file landing) |
| Failure mode | Fail-degraded: stale-but-typed responses are preferred over errors. A staleness_seconds field is always returned. |
| Region pinning | Active-active in kbl and mzr (per ADR-0004 §2). Reconciliation jobs run in kbl only. |
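
The fail-degraded behaviour can be illustrated with a minimal sketch. It shows only the failure branch: when a live refresh fails, the last known value is returned with its age and a lowered confidence instead of an error. The function name, the cache shape, the exception types, the "high" confidence label, and the use of None for staleness when nothing is cached are all assumptions made for illustration; only staleness_seconds and the confidence values "low" and "unknown" appear in this document.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def resolve_fail_degraded(
    e164: str,
    cache: Dict[str, Tuple[str, float]],   # e164 -> (mno, fetched_at epoch seconds)
    refresh: Callable[[str], str],          # hypothetical live lookup; may raise
    now: Optional[float] = None,
) -> dict:
    """Prefer a stale, typed answer over an error when the refresh fails."""
    now = time.time() if now is None else now
    try:
        mno = refresh(e164)
    except (ConnectionError, TimeoutError):
        if e164 in cache:
            mno, fetched_at = cache[e164]
            # Stale answer: disclose its age and lower the confidence.
            return {"mno": mno,
                    "staleness_seconds": int(now - fetched_at),
                    "confidence": "low"}
        # Nothing cached: still a typed response, never an exception.
        return {"mno": None, "staleness_seconds": None, "confidence": "unknown"}
    cache[e164] = (mno, now)
    return {"mno": mno, "staleness_seconds": 0, "confidence": "high"}
```

A caller that receives confidence "unknown" is expected to apply its own fallback, as described under Cross-Service Invariants.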

This service is the canonical home for the bounded context "Numbering Attribution and Identity of an MSISDN/IMEI". It is not the home for:

  • Sender-ID alphanumeric registration → sender-id-registry-service
  • Short-code / MSISDN inventory leasing → numbering-service
  • Per-MNO routing weights and TPS budgets → routing-engine + operator-management-service

3. Responsibilities

| # | Responsibility |
|---|---|
| R1 | Resolve (e164) → { mno, line_type, country, isPorted, originalMno, hlrAvailable, source, fetchedAt, ttl } via the four-tier cascade |
| R2 | Maintain ni.msisdn_attribution Postgres table (≈ 200 M rows at national scale) with monthly partitioning by last_seen |
| R3 | Maintain ni.mnp_registry and reconcile with MNO MNP feeds daily by 03:00 Asia/Kabul |
| R4 | Maintain ni.eir_status and reconcile against ATRA + per-MNO CEIR feeds daily by 04:00 Asia/Kabul |
| R5 | Operate the per-MNO HLR gateway (MAP SendRoutingInfoForSM over SIGTRAN, or a REST adapter where the MNO exposes one) with a per-MNO TPS governor backed by a Redis token bucket |
| R6 | Expose ResolveMsisdn, ResolveBatch, LookupPorting, LookupEir, WarmCache gRPC RPCs to internal services |
| R7 | Expose GET /v1/lookup/{msisdn} and POST /v1/lookup/batch HTTPS endpoints for tenants, with billing metering events to billing-service |
| R8 | Publish ni.attribution.changed.v1, ni.mnp.reconciled.v1, ni.eir.flagged.v1 NATS JetStream events |
| R9 | Provide staleness_seconds and confidence on every response; never return arbitrarily old data without disclosing its age |
| R10 | Enforce sovereign data residency: MSISDN, IMEI, and reconciliation files never leave Afghan regions in plaintext |
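
The token bucket behind R5's per-MNO TPS governor can be sketched in memory as follows. This is not the service's implementation: the production governor is stated to be Redis-backed, and the class name, parameters, and the choice to inject the clock are assumptions for illustration. One bucket would exist per (mno, op) pair.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket; a stand-in for the Redis-backed governor."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 now: Optional[float] = None):
        self.rate_per_sec = rate_per_sec   # contracted TPS for one (mno, op)
        self.capacity = capacity           # largest burst allowed
        self.tokens = capacity             # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Take one token if available; return False when the cap is reached."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical limits keyed by (mno, op); names and numbers are placeholders.
buckets = {("MNO_A", "SRI_SM"): TokenBucket(rate_per_sec=50.0, capacity=50.0)}
```

In a shared Redis deployment the refill and take steps would need to run atomically (for example in a server-side script) so that concurrent gateway pods cannot overspend the same bucket.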

4. Non-Responsibilities

  • Does not decide which carrier to dispatch on (routing-engine does, using NI as input).
  • Does not maintain MO/MT firewall rules (sms-firewall-service does).
  • Does not charge tenants or maintain billing balances (billing-service does — NI emits metered events only).
  • Does not authenticate end users (auth-service does); it trusts SPIFFE workload identities for internal calls.
  • Does not resolve sender alphanumeric identities (sender-id-registry-service).
  • Does not perform deep packet inspection on SMS payloads — content classification is owned by compliance-engine.

5. Upstream / Downstream Dependencies

| Direction | System | Protocol | Purpose |
|---|---|---|---|
| Inbound caller | routing-engine | gRPC (mTLS+SPIFFE) | Per-MT carrier resolution before SMPP dispatch |
| Inbound caller | sms-firewall-service | gRPC (mTLS+SPIFFE) | Origin attribution and MNP check on inbound MO |
| Inbound caller | channel-router-service | gRPC (mTLS+SPIFFE) | Capability check (line_type=MOBILE for SMS, etc.) |
| Inbound caller | fraud-intel-service | gRPC (mTLS+SPIFFE) | AIT and SIM-box graph enrichment |
| Inbound caller | Tenant via Kong | HTTPS REST | Public Lookup API (billable) |
| Outbound | Per-MNO HLR/HSS | SIGTRAN MAP SendRoutingInfoForSM (preferred) or HTTPS REST adapter | Live MSISDN attribution |
| Outbound | MNO MNP feed | SFTP (CSV/XML) or HTTPS pull | Daily MNP delta reconciliation |
| Outbound | ATRA / per-MNO CEIR feed | SFTP (CSV) | Daily EIR/stolen-device list |
| Outbound | PostgreSQL ni schema | TCP (pg driver, Patroni) | Persistence, partitioned by month |
| Outbound | Redis (per-region cluster) | TCP (RESP) | Hot cache (ni:hlr:*, ni:mnp:*, ni:eir:*, TPS token buckets) |
| Outbound | NATS JetStream | TCP | Domain events (ni.attribution.changed.v1, ni.mnp.reconciled.v1, ni.eir.flagged.v1) |
| Outbound | billing-service | NATS event billing.metering.recorded.v1 | One metering event per chargeable Public Lookup API call |
| Outbound | audit-service (logical, via NATS) | NATS event audit.lookup.v1 | Auditable record of every Public Lookup API call (tenant, msisdn-hash, result class) |
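
The msisdn-hash carried in audit.lookup.v1 is defined in Key Design Decisions as sha256(msisdn + tenantPepper). A minimal sketch of that derivation follows; the UTF-8 encoding, the hex output, and the function name are assumptions, since the document specifies only the hash function and its inputs.

```python
import hashlib


def msisdn_audit_hash(msisdn: str, tenant_pepper: str) -> str:
    """Return sha256(msisdn + tenantPepper) as a hex string.

    Assumption: inputs are concatenated as text and UTF-8 encoded.
    """
    material = (msisdn + tenant_pepper).encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

Because the pepper is per tenant, the same MSISDN hashes to different values for different tenants, so audit records cannot be joined across tenants on the hash alone.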

6. Runtime Topology

| Component | Replicas (per region) | Notes |
|---|---|---|
| number-intelligence-service (gRPC + REST) | 6 in kbl, 4 in mzr (HPA min) | Stateless. Scales on grpc_inflight_requests + cpu. |
| ni-hlr-gateway DaemonSet | 1 pod per data-plane node | Holds SIGTRAN sockets and per-MNO REST connections; owns (MNO × bind) affinity. |
| Reconciliation Jobs | kbl only, cron | mnp-recon 02:30 daily, eir-recon 03:30 daily, attribution-decay hourly. |
| Postgres ni cluster | Patroni 1 primary + 2 sync standbys per region; logical replication kbl→dxb audit-only | RPO ≤ 5 s intra-region. |
| Redis ni cluster | 6-node Sentinel per region | Hot cache; no cross-region replication. |

7. High-Level Resolution Flow
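
The four-tier cascade described in Purpose (in-process LRU → Redis → Postgres → live HLR) can be sketched as a simple fall-through lookup. This is an illustration under stated assumptions, not the service's code: plain dicts stand in for the three cache tiers, query_hlr is a hypothetical callable for the gateway, and writing a hit back into the faster tiers that missed is an assumed behaviour that this document does not specify.

```python
from typing import Callable, Dict, List, Optional, Tuple


def resolve_mno(
    e164: str,
    cache_tiers: List[Tuple[str, Dict[str, str]]],   # ordered fastest first
    query_hlr: Callable[[str], Optional[str]],        # hypothetical live lookup
) -> Optional[dict]:
    """Walk the tiers in order and report which one answered."""
    missed: List[Dict[str, str]] = []
    for name, store in cache_tiers:
        mno = store.get(e164)
        if mno is not None:
            # Assumed write-back: warm the faster tiers that missed.
            for earlier in missed:
                earlier[e164] = mno
            return {"mno": mno, "source": name}
        missed.append(store)

    # All cache tiers missed: fall through to the live HLR query.
    mno = query_hlr(e164)
    if mno is None:
        return None
    for store in missed:
        store[e164] = mno
    return {"mno": mno, "source": "hlr"}
```

With tiers ordered as lru, redis, postgres, a value found in Postgres is returned with source "postgres", and a repeat lookup for the same number is then answered by the LRU tier.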


8. Position in the Platform


9. Key Design Decisions

| Decision | Rationale |
|---|---|
| Four-tier cache cascade (LRU → Redis → Postgres → live HLR) | OTP traffic produces extreme repeat-MSISDN frequency. Live HLR queries are expensive (SS7 charges) and rate-limited by MNOs. Tiered caching brings the effective hit ratio to ≥ 99.5 %. |
| TTL-by-class, not single TTL | LINE_TYPE is essentially permanent (TTL 30 d). MNP changes daily for ≤ 0.1 % of the base (TTL 24 h). Live HLR/VLR state is volatile (TTL 5 min). A single TTL would either waste MNO quota or risk staleness. |
| staleness_seconds always returned | Callers can choose policy: routing-engine accepts up to 24 h staleness for MNO; fraud-intel-service requires fresh data (forces refresh on > 60 s). |
| Per-MNO TPS governor in Redis | MNOs cap our SS7 traffic. A token bucket per (mno, op) ensures we never exceed contracted TPS even under burst load. |
| Reconciliation is daily and authoritative | MNP and EIR data are released daily by MNOs/ATRA. Live HLR cannot tell us that a number was ported yesterday; only the MNP feed can. Daily recon at 03:00 Kabul is the contract. |
| Public Lookup API is billable per call, not per byte | Aligns with industry pricing (Twilio Lookup, Telnyx LRN). One metering event per chargeable lookup. |
| MSISDN never leaves Afghan regions in plaintext | Sovereign data residency. dxb cold-DR holds only AES-GCM-wrapped Postgres backups; the unwrap key never leaves the kbl HSM. |
| Fail-degraded, not fail-closed | Unlike compliance-engine, NI is an enrichment service. If it returns no answer, downstream callers (routing-engine) have a fallback (default MNO from MSISDN prefix table). Returning a stale answer with confidence: low is always safer than returning none. |
| Hash MSISDN in audit logs and metrics | sha256(msisdn + tenantPepper) is logged; raw MSISDN stays in Postgres only. |
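
The TTL-by-class and caller-policy decisions above reduce to two small lookup tables. The sketch below uses the TTL values and staleness limits stated in the table; the dictionary keys and function names are hypothetical.

```python
from datetime import timedelta

# Cache TTL per data class, as listed in the decision table.
TTL_BY_CLASS = {
    "line_type": timedelta(days=30),    # essentially permanent
    "mnp": timedelta(hours=24),         # changes daily for a small share of numbers
    "hlr_live": timedelta(minutes=5),   # volatile live state
}

# Maximum staleness each caller accepts before forcing a refresh.
MAX_STALENESS = {
    "routing-engine": timedelta(hours=24),
    "fraud-intel-service": timedelta(seconds=60),
}


def is_expired(data_class: str, age_seconds: float) -> bool:
    """True when a cached entry of this class has outlived its TTL."""
    return age_seconds > TTL_BY_CLASS[data_class].total_seconds()


def must_refresh(caller: str, staleness_seconds: float) -> bool:
    """True when the response is older than the caller's policy allows."""
    return staleness_seconds > MAX_STALENESS[caller].total_seconds()
```

The two tables are deliberately separate: the TTL governs when NI itself re-fetches, while the staleness limit is a per-caller decision made from the staleness_seconds field in the response.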

10. Latency and Throughput Budget

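| Path | Cache class | Target P50 | Target P95 | Target P99 |
|---|---|---|---|---|
| LRU hit | in-process | 0.2 ms | 0.5 ms | 1 ms |
| Redis hit | hot | 1.5 ms | 4 ms | 8 ms |
| Postgres hit | warm | 6 ms | 15 ms | 30 ms |
| Live HLR (MAP) | cold | 250 ms | 600 ms | 1200 ms |
| Live HLR (REST adapter) | cold | 80 ms | 200 ms | 500 ms |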

Aggregate ResolveMsisdn SLO: P95 ≤ 15 ms (assuming ≥ 99 % cache hit rate). Public Lookup API SLO: P95 ≤ 200 ms (allows up to 1 cold-tier miss per request).
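
How the per-tier targets combine into the aggregate P95 can be checked with a conservative lower bound: the aggregate P95 is at most 15 ms whenever at least 95 % of all requests finish within 15 ms. The sketch derives the per-tier fractions from the table above; the traffic mix passed in is hypothetical, and live HLR queries are assumed never to finish within 15 ms.

```python
# Lower bound on the share of each tier's requests finishing within 15 ms:
# LRU and Redis P99 targets (1 ms, 8 ms) are below 15 ms, so at least 99 %;
# Postgres P95 target is 15 ms, so at least 95 %; live HLR assumed 0 %.
WITHIN_15MS = {"lru": 0.99, "redis": 0.99, "postgres": 0.95, "hlr": 0.0}


def fraction_within_15ms(tier_share: dict) -> float:
    """Lower bound on the overall fraction of requests served within 15 ms."""
    if abs(sum(tier_share.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return sum(share * WITHIN_15MS[tier] for tier, share in tier_share.items())


def meets_aggregate_p95(tier_share: dict) -> bool:
    """True when the bound guarantees an aggregate P95 of at most 15 ms."""
    return fraction_within_15ms(tier_share) >= 0.95


# Hypothetical traffic mix with a 99 % cache hit rate.
example_mix = {"lru": 0.80, "redis": 0.15, "postgres": 0.04, "hlr": 0.01}
```

For the example mix the bound is roughly 0.98, so the SLO holds. If the same 99 % of hits were served entirely by Postgres, the bound would fall to roughly 0.94, which suggests the aggregate target depends on most hits landing in the LRU and Redis tiers rather than on the overall hit rate alone.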


11. Cross-Service Invariants

  1. Authoritative source. No other service may directly query an MNO HLR/HSS or maintain a parallel MNP table. All MSISDN attribution flows through this service.
  2. Confidence floor. Responses with confidence: unknown MUST NOT be used as the sole input to a routing decision; routing-engine falls back to its prefix table.
  3. Audit completeness. Every Public Lookup API call produces exactly one audit.lookup.v1 NATS event, and every chargeable call additionally produces exactly one billing.metering.recorded.v1 event. Reconciliation jobs produce one ni.mnp.reconciled.v1 per file processed.
  4. Backward compatibility. Schema changes to gRPC RPCs follow the platform's evolution policy (add-only fields; new RPCs for breaking changes).
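
The confidence-floor invariant can be sketched from the caller's side. In this illustration a response with confidence "unknown" (or no response at all) is ignored and the caller falls back to a longest-prefix match. The prefix table contents, the MNO labels, and the function name are placeholders; the real table belongs to routing-engine.

```python
from typing import Optional

# Hypothetical prefix table; real prefixes and MNO names live in routing-engine.
PREFIX_TABLE = {"+9370": "MNO_A", "+9379": "MNO_B"}


def choose_mno(e164: str, ni_response: Optional[dict]) -> Optional[str]:
    """Use the NI answer unless its confidence is unknown, else use prefixes."""
    if (
        ni_response is not None
        and ni_response.get("confidence") != "unknown"
        and ni_response.get("mno")
    ):
        return ni_response["mno"]

    # Fallback: longest matching prefix wins.
    for length in range(len(e164), 0, -1):
        mno = PREFIX_TABLE.get(e164[:length])
        if mno is not None:
            return mno
    return None
```

A low-confidence answer is still used here, which matches the fail-degraded stance: only "unknown" is barred from being the sole routing input.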

12. References

  • ADR-0004 §3 (new bounded contexts) — defines this service's scope
  • docs/07-epics-and-user-stories.md §6.2 — EP-NI-01..04
  • docs/13-security-compliance-tenancy.md — sovereign data residency rules
  • services/routing-engine/SERVICE_OVERVIEW.md — primary internal consumer
  • services/sms-firewall-service/SERVICE_OVERVIEW.md — secondary internal consumer
  • services/billing-service/SERVICE_OVERVIEW.md — metering contract
  • 3GPP TS 29.002 (MAP) — MAP SendRoutingInfoForSM operation reference