Number Intelligence Service (number-intelligence-service) — Service Overview
Status: populated
Version: 1.0
Owner: Messaging Core
Last updated: 2026-04-20
Companion: DOMAIN_MODEL · API_CONTRACTS · EVENT_SCHEMAS · DATA_MODEL
1. Purpose
The Number Intelligence Service is the platform's authoritative source of truth for MSISDN attribution on the Ghasi-SMS-Gateway national backbone. For any Afghan or foreign MSISDN presented to the platform (originating tenant request, MO inbound, CDR enrichment, fraud lookup, regulator query) it answers — within strict latency budgets and with sovereign data residency — three primary questions:
- Which MNO owns this number today? Resolved through a layered cascade: in-process LRU cache → Redis (ni:hlr:{e164}) → Postgres (ni.msisdn_attribution) → live HLR/HSS query via the per-MNO MAP/SS7 gateway or REST adapter.
- Has this number been ported (MNP) and to whom? Resolved against the platform's MNP registry (ni.mnp_registry), which is reconciled against MNO MNP files daily and on demand.
- Is the device tied to this MSISDN flagged in EIR/CEIR? Resolved against the EIR/CEIR cross-check store (ni.eir_status), populated by daily MNO and ATRA stolen-device feeds.
The service exposes three caller surfaces with very different SLOs:
| Caller surface | Protocol | Caller | Latency SLO | Cache class |
|---|---|---|---|---|
| Internal lookup (routing fast-path) | gRPC (mTLS+SPIFFE) | routing-engine, sms-firewall-service, channel-router-service, fraud-intel-service | P95 ≤ 15 ms | LRU + Redis hot |
| Public Lookup API (tenant-billable) | HTTPS REST through Kong | Tenant client (JWT + API key) | P95 ≤ 200 ms | Redis cached, billed per chargeable lookup |
| Bulk reconciliation jobs | NATS JetStream (ni.recon.*) | scheduler, cdr-mediation-service, regulator-portal-service | Bulk; P99 ≤ 4 h end-to-end | Postgres direct |
Number intelligence is read-heavy (peak ≥ 25 k QPS aggregate from routing-engine alone during OTP storms). Write traffic comes only from reconciliation jobs and operator-management pushes, never from tenant traffic.
2. Bounded Context
| Dimension | Value |
|---|---|
| Domain | Telecom Numbering & Subscriber Attribution |
| Owner squad | Messaging Core |
| Deployment unit | Kubernetes Deployment number-intelligence-service (control plane) + DaemonSet ni-hlr-gateway (data plane, per region) |
| Communication style | Inbound: gRPC (mTLS), HTTPS REST · Outbound: SS7/MAP via per-MNO HLR gateway, REST/SFTP to MNO MNP feeds, NATS JetStream |
| Storage | PostgreSQL schema ni (Patroni HA) · Redis (hot cache, per-region) · MinIO (raw MNP / EIR file landing) |
| Failure mode | Fail-degraded: stale-but-typed responses are preferred over errors. A staleness_seconds field is always returned. |
| Region pinning | Active-active in kbl and mzr (per ADR-0004 §2). Reconciliation jobs run in kbl only. |
This service is the canonical home for the bounded context "Numbering Attribution and Identity of an MSISDN/IMEI". It is not the home for:
- Sender-ID alphanumeric registration → sender-id-registry-service
- Short-code / MSISDN inventory leasing → numbering-service
- Per-MNO routing weights and TPS budgets → routing-engine + operator-management-service
3. Responsibilities
| # | Responsibility |
|---|---|
| R1 | Resolve (e164) → { mno, line_type, country, isPorted, originalMno, hlrAvailable, source, fetchedAt, ttl } via the four-tier cascade |
| R2 | Maintain ni.msisdn_attribution Postgres table (≈ 200 M rows at national scale) with monthly partitioning by last_seen |
| R3 | Maintain ni.mnp_registry and reconcile with MNO MNP feeds daily by 03:00 Asia/Kabul |
| R4 | Maintain ni.eir_status and reconcile against ATRA + per-MNO CEIR feeds daily by 04:00 Asia/Kabul |
| R5 | Operate the per-MNO HLR gateway (MAP SendRoutingInfoForSM over SIGTRAN; or REST adapter where the MNO exposes one) with per-MNO TPS governor backed by Redis token bucket |
| R6 | Expose ResolveMsisdn, ResolveBatch, LookupPorting, LookupEir, WarmCache gRPC RPCs to internal services |
| R7 | Expose GET /v1/lookup/{msisdn} and POST /v1/lookup/batch HTTPS endpoints for tenants, with billing metering events to billing-service |
| R8 | Publish ni.attribution.changed.v1, ni.mnp.reconciled.v1, ni.eir.flagged.v1 NATS JetStream events |
| R9 | Provide staleness_seconds and confidence on every response; never silently return arbitrarily-old data without disclosure |
| R10 | Enforce sovereign data residency: MSISDN, IMEI, and reconciliation files never leave Afghan regions in plaintext |
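
The per-MNO TPS governor in R5 can be illustrated with a small token bucket. This is a minimal in-memory sketch, not the service's implementation: the source states that the bucket is backed by Redis, so a real version would need to perform the refill-and-take step atomically there. The class names, the `sri_sm` operation label, and the choice of burst size equal to one second of contracted TPS are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class TokenBucket:
    rate_per_s: float  # contracted TPS for one (mno, op) pair
    burst: float       # bucket capacity (assumed: one second of traffic)
    tokens: float      # tokens currently available
    updated_at: float  # time of the last refill, in seconds

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill proportionally to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_s)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TpsGovernor:
    """Keeps one bucket per (mno, op) key, created lazily on first use."""

    def __init__(self, limits: Dict[Tuple[str, str], float]) -> None:
        self._limits = limits
        self._buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, mno: str, op: str, now: float) -> bool:
        key = (mno, op)
        bucket = self._buckets.get(key)
        if bucket is None:
            rate = self._limits[key]
            bucket = TokenBucket(rate, rate, rate, now)
            self._buckets[key] = bucket
        return bucket.try_acquire(now)


# Hypothetical limit: 2 HLR queries per second towards "mno-a".
governor = TpsGovernor({("mno-a", "sri_sm"): 2.0})
decisions = [governor.allow("mno-a", "sri_sm", now=0.0) for _ in range(3)]
```

Passing the clock in as `now` keeps the sketch deterministic; a request that is refused would fall back to whatever cached answer exists rather than exceed the contracted rate.
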
4. Non-Responsibilities
- Does not decide which carrier to dispatch on (routing-engine does, using NI as input).
- Does not maintain MO/MT firewall rules (sms-firewall-service does).
- Does not charge tenants or maintain billing balances (billing-service does; NI emits metered events only).
- Does not authenticate end users (auth-service does); it trusts SPIFFE workload identities for internal calls.
- Does not resolve sender alphanumeric identities (sender-id-registry-service does).
- Does not perform deep packet inspection on SMS payloads; content classification is owned by compliance-engine.
5. Upstream / Downstream Dependencies
| Direction | System | Protocol | Purpose |
|---|---|---|---|
| Inbound caller | routing-engine | gRPC (mTLS+SPIFFE) | Per-MT carrier resolution before SMPP dispatch |
| Inbound caller | sms-firewall-service | gRPC (mTLS+SPIFFE) | Origin attribution and MNP check on inbound MO |
| Inbound caller | channel-router-service | gRPC (mTLS+SPIFFE) | Capability check (line_type=MOBILE for SMS, etc.) |
| Inbound caller | fraud-intel-service | gRPC (mTLS+SPIFFE) | AIT and SIM-box graph enrichment |
| Inbound caller | Tenant via Kong | HTTPS REST | Public Lookup API (billable) |
| Outbound | Per-MNO HLR/HSS | SIGTRAN MAP SendRoutingInfoForSM (preferred) or HTTPS REST adapter | Live MSISDN attribution |
| Outbound | MNO MNP feed | SFTP (CSV/XML) or HTTPS pull | Daily MNP delta reconciliation |
| Outbound | ATRA / per-MNO CEIR feed | SFTP (CSV) | Daily EIR/stolen-device list |
| Outbound | PostgreSQL ni schema | TCP (pg driver, Patroni) | Persistence, partitioned by month |
| Outbound | Redis (per-region cluster) | TCP (RESP) | Hot cache (ni:hlr:*, ni:mnp:*, ni:eir:*, TPS token buckets) |
| Outbound | NATS JetStream | TCP | Domain events (ni.attribution.changed.v1, ni.mnp.reconciled.v1, ni.eir.flagged.v1) |
| Outbound | billing-service | NATS event billing.metering.recorded.v1 | One metering event per chargeable Public Lookup API call |
| Outbound | audit-service (logical, via NATS) | NATS event audit.lookup.v1 | Auditable record of every Public Lookup API call (tenant, msisdn-hash, result class) |
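
The audit row above carries a hashed MSISDN rather than the raw number. A minimal sketch of that hashing, assuming the `sha256(msisdn + tenantPepper)` construction named under Key Design Decisions; the function names, the record field names, and the sample values are hypothetical, and the real `audit.lookup.v1` schema is defined in EVENT_SCHEMAS, not here.

```python
import hashlib
from typing import Dict


def msisdn_hash(msisdn: str, tenant_pepper: bytes) -> str:
    """Hex SHA-256 of the MSISDN concatenated with a per-tenant pepper."""
    return hashlib.sha256(msisdn.encode("utf-8") + tenant_pepper).hexdigest()


def audit_record(tenant_id: str, msisdn: str, result_class: str,
                 tenant_pepper: bytes) -> Dict[str, str]:
    # Only the hash is placed in the record; the raw MSISDN is never copied in.
    return {
        "tenant": tenant_id,
        "msisdn_hash": msisdn_hash(msisdn, tenant_pepper),
        "result_class": result_class,
    }


# Hypothetical tenant, number, and pepper.
record = audit_record("tenant-1", "+93700000001", "HIT", b"pepper-a")
```

Because the pepper is per tenant, the same number produces different hashes for different tenants, so audit records cannot be joined across tenants on the hash alone.
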
6. Runtime Topology
| Component | Replicas (per region) | Notes |
|---|---|---|
| number-intelligence-service (gRPC + REST) | 6 in kbl, 4 in mzr (HPA min) | Stateless. Scales on grpc_inflight_requests + cpu. |
| ni-hlr-gateway DaemonSet | 1 pod per data-plane node | Holds SIGTRAN sockets and per-MNO REST connections; owns (MNO × bind) affinity. |
| Reconciliation Jobs | kbl only, cron | mnp-recon 02:30 daily, eir-recon 03:30 daily, attribution-decay hourly. |
| Postgres ni cluster | Patroni 1 primary + 2 sync standbys per region; logical replication kbl→dxb audit-only | RPO ≤ 5 s intra-region. |
| Redis ni cluster | 6-node Sentinel per region | Hot cache; no cross-region replication. |
7. High-Level Resolution Flow
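
A minimal sketch of the four-tier cascade described in section 1 (in-process LRU → Redis → Postgres → live HLR). The tiers are passed in as plain callables so the example runs on its own; the `Attribution` fields are a reduced, hypothetical subset of the R1 response, and the sketch back-fills only the LRU tier, whereas a real implementation would presumably also write through to Redis and Postgres.

```python
import time
from collections import OrderedDict
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Attribution:
    mno: str
    source: str        # which tier produced this answer
    fetched_at: float  # epoch seconds when the record was last confirmed


class LruCache:
    """Tiny in-process LRU standing in for tier 1."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, Attribution]" = OrderedDict()

    def get(self, key: str) -> Optional[Attribution]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: Attribution) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


Tier = Callable[[str], Optional[Attribution]]


def resolve(e164: str, lru: LruCache, redis_get: Tier, pg_get: Tier,
            hlr_query: Tier, now: Callable[[], float] = time.time,
            ) -> Tuple[Optional[Attribution], Optional[int]]:
    """Walk the tiers in order; return (attribution, staleness_seconds)."""
    hit = lru.get(e164)
    if hit is not None:
        hit = replace(hit, source="lru")
    else:
        tiers = (("redis", redis_get), ("postgres", pg_get), ("hlr", hlr_query))
        for name, tier in tiers:
            found = tier(e164)
            if found is not None:
                hit = replace(found, source=name)
                lru.put(e164, hit)  # back-fill the fastest tier
                break
    if hit is None:
        return None, None
    return hit, max(0, int(now() - hit.fetched_at))
```

The staleness value is derived from `fetched_at` at response time, so a cached answer reports its true age no matter which tier served it.
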
9. Key Design Decisions
| Decision | Rationale |
|---|---|
| Four-tier cache cascade (LRU → Redis → Postgres → live HLR) | OTP traffic produces extreme repeat-MSISDN frequency. Live HLR queries are expensive (SS7 charges) and rate-limited by MNOs. Tiered caching brings effective hit ratio ≥ 99.5 %. |
| TTL-by-class, not single TTL | LINE_TYPE is essentially permanent (TTL 30 d). MNP status changes for at most 0.1 % of the base per day (TTL 24 h). Live HLR VLR data is volatile (TTL 5 min). A single TTL would either waste MNO quota or risk staleness. |
| staleness_seconds always returned | Callers can choose their own policy: routing-engine accepts up to 24 h staleness for MNO; fraud-intel-service requires fresh data (forces a refresh when staleness exceeds 60 s). |
| Per-MNO TPS governor in Redis | MNOs cap our SS7 traffic. Token bucket per (mno, op) ensures we never exceed contracted TPS even under burst load. |
| Reconciliation is daily and authoritative | MNP and EIR data are released daily by MNOs/ATRA. Live HLR cannot tell us that a number was ported yesterday — only the MNP feed can. Daily recon at 03:00 Kabul is the contract. |
| Public Lookup API is billable per call, not per byte | Aligns with industry pricing (Twilio Lookup, Telnyx LRN). One metering event per chargeable lookup. |
| MSISDN never leaves Afghan regions in plaintext | Sovereign data residency. dxb cold-DR holds only AES-GCM-wrapped Postgres backups; the unwrap key never leaves kbl HSM. |
| Fail-degraded, not fail-closed | Unlike compliance-engine, NI is an enrichment service. If it returns no answer, downstream callers (routing-engine) have a fallback (default MNO from MSISDN prefix table). Returning a stale answer with confidence: low is always safer than returning none. |
| Hash MSISDN in audit logs and metrics | sha256(msisdn + tenantPepper) is logged; raw MSISDN stays in Postgres only. |
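
The TTL-by-class and caller-staleness decisions in the table above reduce to two small lookups. A minimal sketch using the durations stated in the table; the dictionary keys and function names are hypothetical labels, not identifiers from the service.

```python
from datetime import timedelta

# TTL per data class, taken from the "TTL-by-class" row.
TTL_BY_CLASS = {
    "line_type": timedelta(days=30),
    "mnp": timedelta(hours=24),
    "hlr_vlr": timedelta(minutes=5),
}

# Maximum staleness each caller tolerates, taken from the staleness_seconds row.
MAX_STALENESS_S = {
    "routing-engine": 24 * 3600,
    "fraud-intel-service": 60,
}


def is_expired(data_class: str, age_seconds: float) -> bool:
    """True when a cached value of this class has outlived its TTL."""
    return age_seconds > TTL_BY_CLASS[data_class].total_seconds()


def needs_refresh(caller: str, staleness_seconds: int) -> bool:
    """True when the caller's own policy rejects an answer this old."""
    return staleness_seconds > MAX_STALENESS_S[caller]
```

Keeping the two tables separate reflects the split in the design: the TTL is a property of the data, while the acceptable staleness is a property of the caller.
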
10. Latency and Throughput Budget
| Path | Cache class | Target P50 | Target P95 | Target P99 |
|---|---|---|---|---|
| LRU hit | in-process | 0.2 ms | 0.5 ms | 1 ms |
| Redis hit | hot | 1.5 ms | 4 ms | 8 ms |
| Postgres hit | warm | 6 ms | 15 ms | 30 ms |
| Live HLR (MAP) | cold | 250 ms | 600 ms | 1200 ms |
| Live HLR (REST adapter) | cold | 80 ms | 200 ms | 500 ms |
Aggregate ResolveMsisdn SLO: P95 ≤ 15 ms (assuming ≥ 99 % cache hit rate). Public Lookup API SLO: P95 ≤ 200 ms (allows up to 1 cold-tier miss per request).
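
Whether a given traffic mix can meet the 15 ms P95 can be checked with a simple upper bound on the share of requests slower than 15 ms. The sketch below reads per-tier tail fractions off the budget table and treats every live HLR call as over budget; the tier mixes are hypothetical inputs, not measured values.

```python
from typing import Dict

# Upper bound on the share of requests in each tier that exceed 15 ms.
TAIL_OVER_SLO = {
    "lru": 0.01,       # P99 is 1 ms, so at most 1 % can exceed 15 ms
    "redis": 0.01,     # P99 is 8 ms
    "postgres": 0.05,  # P95 is exactly 15 ms
    "hlr": 1.0,        # P50 is already above 15 ms; count every call
}


def tail_upper_bound(mix: Dict[str, float]) -> float:
    """Upper bound on the overall fraction of requests slower than 15 ms."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return sum(share * TAIL_OVER_SLO[tier] for tier, share in mix.items())


def meets_p95(mix: Dict[str, float]) -> bool:
    # P95 <= 15 ms holds if at most 5 % of requests are slower than 15 ms.
    return tail_upper_bound(mix) <= 0.05


# Hypothetical mix: most hits served in-process or from Redis.
typical = {"lru": 0.70, "redis": 0.25, "postgres": 0.04, "hlr": 0.01}
# Hypothetical worst case: 99 % cache hits, but all of them from Postgres.
postgres_only = {"lru": 0.0, "redis": 0.0, "postgres": 0.99, "hlr": 0.01}
```

Under these assumptions the first mix gives a bound of about 2.15 % and passes, while the second gives about 5.95 % and does not. A 99 % hit rate therefore supports the SLO only when most hits land in the LRU or Redis tiers.
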
11. Cross-Service Invariants
- Authoritative source. No other service may directly query an MNO HLR/HSS or maintain a parallel MNP table. All MSISDN attribution flows through this service.
- Confidence floor. Responses with confidence: unknown MUST NOT be used as the sole input to a routing decision; routing-engine falls back to its prefix table.
- Audit completeness. Every Public Lookup API call produces exactly one audit.lookup.v1 NATS event and exactly one billing.metering.recorded.v1 event. Reconciliation jobs produce one ni.mnp.reconciled.v1 per file processed.
- Backward compatibility. Schema changes to gRPC RPCs follow the platform's evolution policy (add-only fields; new RPCs for breaking changes).
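
The confidence-floor invariant can be sketched from the caller's side. This is an illustration of how a consumer such as routing-engine might apply it, not that service's code; the prefix table entries, the MNO labels, and the dictionary shape of the NI response are placeholders.

```python
from typing import Dict, Optional, Tuple

# Placeholder prefix table; the real table belongs to routing-engine.
PREFIX_TABLE: Dict[str, str] = {"+9370": "mno-a", "+9378": "mno-b"}


def prefix_default(e164: str) -> Optional[str]:
    """Longest-prefix match of the number against the prefix table."""
    for length in range(len(e164), 0, -1):
        mno = PREFIX_TABLE.get(e164[:length])
        if mno is not None:
            return mno
    return None


def choose_mno(e164: str, ni_response: Optional[dict]) -> Tuple[Optional[str], str]:
    """Use the NI answer unless it is missing or has confidence 'unknown'."""
    if ni_response is not None and ni_response.get("confidence") != "unknown":
        return ni_response["mno"], "ni"
    return prefix_default(e164), "prefix-table"


# A low-confidence answer is still used; an unknown one is not.
low = choose_mno("+93700000001", {"mno": "mno-b", "confidence": "low"})
unknown = choose_mno("+93700000001", {"mno": "mno-b", "confidence": "unknown"})
```

Returning the decision source alongside the MNO lets the caller record when a routing decision was made without NI input.
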
12. References
- ADR-0004 §3 (new bounded contexts): defines this service's scope
- docs/07-epics-and-user-stories.md §6.2: EP-NI-01..04
- docs/13-security-compliance-tenancy.md: sovereign data residency rules
- services/routing-engine/SERVICE_OVERVIEW.md: primary internal consumer
- services/sms-firewall-service/SERVICE_OVERVIEW.md: secondary internal consumer
- services/billing-service/SERVICE_OVERVIEW.md: metering contract
- 3GPP TS 29.002 (MAP): MAP SendRoutingInfoForSM operation reference