numbering-service — AI Integration
Version: 1.0
Status: Draft
Owner: Commerce Engineering + Trust & Safety
Last Updated: 2026-04-21
Companion: DOMAIN_MODEL · APPLICATION_LOGIC · SECURITY_MODEL · ../fraud-intel-service/SERVICE_OVERVIEW.md
1. Purpose
Unlike compliance-engine (which uses LLMs for content classification) or fraud-intel-service (which runs predictive ML on messaging patterns), numbering-service's core logic is deterministic and does not depend on AI for correctness. Inventory allocation, lease validation, reservation TTL, quarantine enforcement, and regulator export are all rule-based workflows with strict compare-and-swap (CAS) semantics.
AI integration in numbering-service is therefore exclusively observational / signal-producing: the service emits anomaly signals that fraud-intel-service uses to score tenant risk, and it consumes advisory signals back. No AI verdict ever changes a lease decision in the hot path. This is by design — a misbehaving model must not be able to block a legitimate tenant's sender-ID allocation, nor silently approve an abuse-pattern allocation.
2. Where AI Lives in the Platform Relative to Numbering
┌────────────────────┐   emits anomaly signal   ┌───────────────────────┐
│ numbering-service  │ ───────────────────────► │  fraud-intel-service  │
│  (deterministic)   │                          │    (predictive ML)    │
└────────────────────┘                          └───────────────────────┘
          ▲                                                 │
          │     advisory rate-shape (NOT authoritative)     │
          └─────────────────────────────────────────────────┘
- Numbering is the source of truth for allocation.
- Fraud-intel is the source of risk score.
- Numbering applies advisory rate-shaping (never hard rejection) based on fraud-intel's output.
3. Anomaly Signals Emitted
All signals are published to NATS subject numbering.anomaly.v1 (stream: NUMBERING_OPS, retention 30 d) and consumed by fraud-intel-service. No action is taken by numbering-service on these signals beyond publishing.
| Signal kind | Trigger | Payload |
|---|---|---|
| RESERVATION_BURST | Tenant reserves > 30 identifiers within 60 s | {tenantId, count, windowSecs, identifiers[]} |
| VANITY_HARVEST_PATTERN | Tenant reserves > 5 vanity short codes within 10 min | {tenantId, count, vanityCodes[]} |
| ALPHA_HOMOGLYPH_PATTERN | Reserved alpha-ID visually similar to an existing verified alpha (edit distance ≤ 2 on normalised form, excluding own tenant's) | {tenantId, newAlpha, similarTo[], editDistance} |
| QUARANTINE_OVERRIDE_SPIKE | > 3 admin quarantine-overrides within 24 h | {count, overrideAdmins[]} |
| CROSS_TENANT_CLAIM_ATTEMPT | Tenant repeatedly loses CAS on numbers they never owned | {tenantId, attempts, targetNumbers[]} |
| MASS_RELEASE_PATTERN | Tenant releases > 50 reservations within 10 min (possible inventory-scrape) | {tenantId, count} |
Signal payloads never include raw PII beyond the tenant's own identifiers.
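As an illustration of how a threshold trigger such as RESERVATION_BURST could be evaluated deterministically, the following is a minimal sliding-window sketch. The threshold (> 30) and window (60 s) come from the table above; the class name, method name, and in-memory storage are assumptions made for the example, not the service's actual implementation.

```typescript
// Hypothetical sliding-window counter for the RESERVATION_BURST trigger.
// Threshold and window are taken from the trigger table; everything else is illustrative.
const BURST_THRESHOLD = 30;
const BURST_WINDOW_MS = 60_000;

class BurstDetector {
  // Per-tenant reservation timestamps (ms) that still fall inside the window.
  private readonly events = new Map<string, number[]>();
  // Tenants currently above the threshold, so that one breach yields one signal.
  private readonly breached = new Set<string>();

  /** Records one reservation; returns true only when the tenant first crosses the threshold. */
  record(tenantId: string, atMs: number): boolean {
    const cutoff = atMs - BURST_WINDOW_MS;
    const recent = (this.events.get(tenantId) ?? []).filter((t) => t > cutoff);
    recent.push(atMs);
    this.events.set(tenantId, recent);

    if (recent.length <= BURST_THRESHOLD) {
      this.breached.delete(tenantId); // back under the threshold: re-arm the trigger
      return false;
    }
    if (this.breached.has(tenantId)) return false; // already signalled for this breach
    this.breached.add(tenantId);
    return true;
  }
}
```

Because the detector takes the event time as an argument rather than reading a clock, the same input sequence always produces the same emission decisions, which is what makes the trigger replayable in tests.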
Signal schema
interface NumberingAnomalySignal {
  schemaVersion: '1';
  eventId: string;
  kind: 'RESERVATION_BURST' | 'VANITY_HARVEST_PATTERN' | 'ALPHA_HOMOGLYPH_PATTERN'
    | 'QUARANTINE_OVERRIDE_SPIKE' | 'CROSS_TENANT_CLAIM_ATTEMPT' | 'MASS_RELEASE_PATTERN';
  tenantId: string | null; // null for platform-wide signals (e.g. override spike)
  severity: 'LOW' | 'MEDIUM' | 'HIGH';
  windowSeconds: number;
  count: number;
  details: Record<string, unknown>;
  traceId: string;
  at: string; // RFC 3339
}
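Since `details` is loosely typed, a runtime shape check before publishing can catch malformed signals early. The sketch below mirrors the fields of the interface above; the function name `isValidSignal` and the sample values (event ID, trace ID, severity, admin names) are hypothetical, and the timestamp check is deliberately loose rather than a strict RFC 3339 validation.

```typescript
// Hypothetical pre-publish shape check for an anomaly signal (illustrative only).
const SIGNAL_KINDS = [
  'RESERVATION_BURST', 'VANITY_HARVEST_PATTERN', 'ALPHA_HOMOGLYPH_PATTERN',
  'QUARANTINE_OVERRIDE_SPIKE', 'CROSS_TENANT_CLAIM_ATTEMPT', 'MASS_RELEASE_PATTERN',
] as const;
const SIGNAL_SEVERITIES = ['LOW', 'MEDIUM', 'HIGH'] as const;

function isValidSignal(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const s = value as Record<string, unknown>;
  return (
    s.schemaVersion === '1' &&
    typeof s.eventId === 'string' && s.eventId.length > 0 &&
    SIGNAL_KINDS.some((k) => k === s.kind) &&
    (s.tenantId === null || typeof s.tenantId === 'string') && // null = platform-wide
    SIGNAL_SEVERITIES.some((sev) => sev === s.severity) &&
    typeof s.windowSeconds === 'number' && s.windowSeconds > 0 &&
    typeof s.count === 'number' && Number.isInteger(s.count) && s.count >= 0 &&
    typeof s.details === 'object' && s.details !== null &&
    typeof s.traceId === 'string' &&
    // Loose check: Date.parse accepts more than strict RFC 3339.
    typeof s.at === 'string' && !Number.isNaN(Date.parse(s.at))
  );
}

// Example: a platform-wide override-spike signal (all values are made up).
const sampleSignal = {
  schemaVersion: '1',
  eventId: 'evt-0001',
  kind: 'QUARANTINE_OVERRIDE_SPIKE',
  tenantId: null,
  severity: 'HIGH', // severity mapping is not specified in this document
  windowSeconds: 86_400, // 24 h
  count: 4,
  details: { overrideAdmins: ['admin-a', 'admin-b'] },
  traceId: 'trace-0001',
  at: '2026-04-21T10:00:00Z',
};
```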
4. Signals Consumed (Advisory Only)
numbering-service subscribes to fraud.signal.v1 from fraud-intel-service with a pure-advisory contract. The only action numbering takes on receipt is to soft-shape reservation rate limits, not to reject allocations.
| Consumed signal | Numbering action |
|---|---|
| fraud.signal.v1 { kind: TENANT_HIGH_RISK } | Reduce tenant's maxActiveReservations effective quota by 50 % for 24 h. Tenant can still hold / assign. Recovered automatically on signal expiry. |
| fraud.signal.v1 { kind: TENANT_SUSPENDED } | No-op: numbering already subscribes to compliance.tenant.suspended.v1 for hard enforcement. Fraud-intel signal is advisory. |
Important: fraud-intel does not have authority to recall a lease. That authority sits with compliance-engine; only compliance-engine's compliance.tenant.suspended.v1 event triggers numbering's bulk-recall UC-15.
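The TENANT_HIGH_RISK shaping rule can be sketched as a pure function evaluated at read time, which gives the automatic recovery on expiry without a cleanup job. The 50 % factor and 24 h duration come from the table above; the function name, the `RiskAdvisory` shape, and rounding down are assumptions for illustration.

```typescript
// Hypothetical helper: effective reservation quota under an advisory TENANT_HIGH_RISK signal.
const HIGH_RISK_FACTOR = 0.5;                   // 50 % reduction (from the table above)
const HIGH_RISK_TTL_MS = 24 * 60 * 60 * 1000;   // 24 h (from the table above)

interface RiskAdvisory {
  tenantId: string;
  receivedAtMs: number; // when the advisory signal was received (assumed field)
}

function effectiveMaxActiveReservations(
  baseQuota: number,
  advisory: RiskAdvisory | undefined,
  nowMs: number,
): number {
  if (!advisory) return baseQuota;
  const expired = nowMs - advisory.receivedAtMs >= HIGH_RISK_TTL_MS;
  // Rounding down is an assumption; the document does not specify rounding.
  return expired ? baseQuota : Math.floor(baseQuota * HIGH_RISK_FACTOR);
}
```

Note that the function only lowers a quota; it never rejects or recalls anything, which matches the advisory-only contract.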
5. Homoglyph Detection (Alpha-ID Submission)
When a tenant attempts to Reserve or Assign an alpha-ID, numbering-service performs a synchronous lightweight check (not ML — it's deterministic regex + edit-distance):
- Normalise to uppercase ASCII (map confusables: 0/O, 1/l/I, 5/S, 2/Z, 6/G, 8/B).
- Compute Damerau-Levenshtein distance against all currently-LEASED alpha-IDs owned by other tenants.
- If distance ≤ 2, emit ALPHA_HOMOGLYPH_PATTERN signal. Do not block: the tenant is still permitted to reserve. sender-id-registry-service performs the authoritative KYC + verification; numbering signals are advisory inputs to fraud-intel.
Rationale: numbering cannot judge whether M0BI-BANK vs MOBI-BANK is a legitimate brand of an approved corporate entity or a phishing attempt — that's sender-id-registry's job via notarised KYC. Numbering's role is to surface the anomaly for investigation.
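The three steps above can be sketched as follows. This is a minimal illustration, not the service's code: the function names are hypothetical, the direction of the confusable mapping (digit to letter) is an assumption, and the distance function is the optimal-string-alignment variant of Damerau-Levenshtein, which counts an adjacent transposition as one edit.

```typescript
// Hypothetical confusable map; the pairs come from the list above, the direction is assumed.
const CONFUSABLES: Record<string, string> = {
  '0': 'O', '1': 'I', 'l': 'I', '5': 'S', '2': 'Z', '6': 'G', '8': 'B',
};

function normaliseAlpha(raw: string): string {
  // Map confusables first (lowercase 'l' must be handled before uppercasing).
  return Array.from(raw, (ch) => CONFUSABLES[ch] ?? ch).join('').toUpperCase();
}

// Optimal string alignment distance (restricted Damerau-Levenshtein).
function osaDistance(a: string, b: string): number {
  const d: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = 0; i <= a.length; i++) d[i][0] = i;
  for (let j = 0; j <= b.length; j++) d[0][j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost);
      // Adjacent transposition counts as a single edit.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// Returns other tenants' leased alpha-IDs within maxDistance of the new alpha-ID.
// A non-empty result would trigger the signal; it never blocks the reservation.
function findSimilarAlphas(
  newAlpha: string,
  leasedByOtherTenants: string[],
  maxDistance = 2,
): { alpha: string; distance: number }[] {
  const target = normaliseAlpha(newAlpha);
  return leasedByOtherTenants
    .map((alpha) => ({ alpha, distance: osaDistance(target, normaliseAlpha(alpha)) }))
    .filter((m) => m.distance <= maxDistance);
}
```

With the example from the rationale, `M0BI-BANK` normalises to `MOBI-BANK`, so the two IDs are at distance 0 after normalisation and the pair is surfaced for investigation.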
6. Regulator-Export Anomaly Flagging (Future)
Not in v1. Planned for v2 (2027 Q1):
- A small ML classifier (logistic regression on inventory features: operator share skew, short-code-to-MSISDN ratio, quarantine-override rate) flags monthly regulator exports that deviate significantly from historical means. Flagged exports are pushed to a "pre-submission review" queue in regulator-portal-service before ATRA submission.
- Same deterministic-core principle: the classifier flags for review; it never blocks submission.
7. Model Governance
No AI/ML models are hosted inside numbering-service. Models that consume or emit numbering signals are owned by fraud-intel-service per its AI governance (see ../fraud-intel-service/AI_INTEGRATION.md). Numbering-service's responsibility is limited to:
- Publishing high-fidelity anomaly signals (auditable, deterministic triggers).
- Consuming advisory signals with no hot-path impact.
This separation makes numbering-service's behaviour fully reproducible from its deterministic rules — a critical property for ATRA regulator audits, which require every allocation decision to be traceable to an explicit rule, not a black-box model.
8. Privacy Considerations
- Anomaly signals never include: message bodies, subscriber phone numbers (as distinct from inventory MSISDNs, which are platform-owned assets), KYC document contents, or any sender-id-registry verification artefacts.
- Signal payloads include tenant IDs and inventory values (MSISDN / short-code / alpha-ID) — these are INTERNAL per platform classification.
- The signal bus (NUMBERING_OPS stream) is restricted by NATS ACL to the fraud-intel-service consumer group only; no analytics or external consumers.
9. Testing
- Unit tests on signal-emission thresholds: RESERVATION_BURST > 30/60s, homoglyph edit-distance ≤ 2, etc.
- Deterministic replay tests: given a signal-generating scenario (100 reservations within 60 s), verify exactly one signal emitted per threshold breach, and no duplicates on retry.
- Multi-region: verify that the same tenant action in kbl and mzr does not produce double-counted signals (signals are region-tagged and aggregated at fraud-intel).
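One way to make the "no duplicates on retry" property testable is to derive a deterministic deduplication key per breach and skip publishing when the key has already been seen. The sketch below uses an in-memory set and a hypothetical `IdempotentEmitter` class; the key layout (kind, tenant, region, window start) is an assumption, and a real deployment would need durable deduplication rather than process memory.

```typescript
// Hypothetical idempotent emitter for replay tests (illustrative only).
type PublishFn = (dedupKey: string, payload: object) => void;

class IdempotentEmitter {
  private readonly seen = new Set<string>();
  private readonly publish: PublishFn;

  constructor(publish: PublishFn) {
    this.publish = publish;
  }

  /** Publishes at most once per (kind, tenant, region, window); returns false on a duplicate. */
  emit(
    kind: string,
    tenantId: string | null,
    region: string,
    windowStartMs: number,
    payload: object,
  ): boolean {
    // Region is part of the key, so each region's signal stays distinct and region-tagged.
    const key = [kind, tenantId ?? 'platform', region, windowStartMs].join(':');
    if (this.seen.has(key)) return false; // retry of an already-published breach
    this.seen.add(key);
    this.publish(key, payload);
    return true;
  }
}

// Usage: a retried emission is dropped; a different region produces its own signal.
const published: string[] = [];
const emitter = new IdempotentEmitter((key) => published.push(key));
emitter.emit('RESERVATION_BURST', 'tenant-a', 'kbl', 0, { count: 31 });
emitter.emit('RESERVATION_BURST', 'tenant-a', 'kbl', 0, { count: 31 }); // retry, skipped
emitter.emit('RESERVATION_BURST', 'tenant-a', 'mzr', 0, { count: 31 });
```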
10. Future Enhancements
| Enhancement | Timeline | Rationale |
|---|---|---|
| Regulator-export deviation classifier | 2027 Q1 | Pre-submission review of unusual inventory shapes |
| Per-tenant behavioural baseline | 2027 Q2 | Replace static thresholds with per-tenant z-score |
| Alpha-ID brand-lookalike classifier (ML) | 2027 Q3 | Augment deterministic homoglyph detection with a small transformer on brand corpora |
| Auto-suggest pool quota adjustments | 2028 | Recommend quota changes based on tenant usage trends |
All future enhancements preserve the deterministic-core, advisory-AI principle.
End of AI_INTEGRATION.md