Channel Router Service — AI Integration

Version: 1.0
Status: Draft
Owner: Messaging Core + Platform ML
Last Updated: 2026-04-21
Companion: APPLICATION_LOGIC · SECURITY_MODEL · DOMAIN_MODEL

ML in the channel-router is applied narrowly: channel-preference learning (predicting which channel is most likely to succeed for a given recipient), adaptive ladder ordering, and an optional session intent classifier for conversational sessions. No cloud LLM is on the hot path; no raw MSISDN, body, or PII is ever sent to a remote inference endpoint.


1. Scope and non-scope

In scope

  • Channel-preference scoring: learn per-recipient P(success | channel) and adaptively reorder the ladder.
  • STOP-keyword extension (language-detection): identify local-language STOP equivalents beyond the hard-coded set.
  • Session-intent classification: detect "question", "complaint", "STOP", "out-of-office" patterns on MO text to annotate tenant webhook payload.
  • Voice-OTP success prediction: a per-recipient voice_answer_rate feature built from time of day and past ANSWERED history.

Out of scope

  • Content-policy classification — owned by compliance-engine.
  • Fraud / SIM-box detection — owned by sms-firewall-service and fraud-intel-service.
  • Sender-ID verification — owned by sender-id-registry-service.

2. Model inventory

| Model | Type | Hosting | Inference budget | Cache TTL | Fallback |
| --- | --- | --- | --- | --- | --- |
| channel-preference-v1 | Gradient-boosted tree (LightGBM) on hashed features | On-prem model-server (Triton) | 5 ms P95 | 300 s per (tenantId, msisdnHash, useCase) | Static preference order from recipient_profiles.channel_preferences |
| stop-keyword-multilang-v1 | Small fine-tuned BERT (multilingual-mini), fine-tuned on Pashto/Dari/Farsi/Arabic STOP corpora | On-prem Triton | 15 ms P95 | 60 s per (body_hash) | Exact-match keyword list (US-CHAN-021 defaults) |
| session-intent-v1 | Small classification head on multilingual-mini | On-prem Triton | 15 ms P95 | 60 s per (body_hash) | Intent marked UNKNOWN |

All models run on-prem in the np-ml namespace. No model weights leave the data-sovereignty boundary (ADR-0004 §11).
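The per-model cache TTLs in the table can be illustrated with a small in-process TTL cache. This is a sketch only: the class name TtlCache, the injectable clock, and the in-memory dictionary are assumptions for illustration; the source does not say where or how the cache is implemented.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class TtlCache:
    """Tiny TTL cache: an entry expires ttl_seconds after it was stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, object]] = {}

    def get(self, key: Hashable) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller re-runs inference.
            del self._entries[key]
            return None
        return value

    def put(self, key: Hashable, value: object) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


# channel-preference-v1: 300 s, keyed per (tenantId, msisdnHash, useCase).
preference_cache = TtlCache(ttl_seconds=300)
# stop-keyword and session-intent models: 60 s, keyed per body_hash.
intent_cache = TtlCache(ttl_seconds=60)
```

Keying on hashed identifiers only keeps raw MSISDNs and bodies out of the cache keys, in line with the feature-hashing rule in section 3.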


3. Feature engineering

All features are hashed / aggregated — no raw MSISDN or body.

3.1 Channel-preference features

| Feature | Source | Type |
| --- | --- | --- |
| msisdn_bucket | First 5 chars of msisdn_hash | categorical (hashed) |
| has_wa_business_tristate | recipient_profiles.has_whatsapp_business | categorical |
| voice_answer_rate_7d | delivery_attempts rolling aggregate | numeric |
| sms_delivery_rate_7d | dlr-processor feed | numeric |
| last_successful_channel | recipient_profiles.last_successful_channel | categorical |
| time_of_day_bucket | request time → {morning, afternoon, evening, night} | categorical |
| use_case | otp, txn, … | categorical |
| mno | from number-intelligence-service (numint) | categorical |
| language | tenant-default or detected from template | categorical |
| segments | SMS segment count | numeric |
| tenant_tier | tenant's compliance tier | categorical |
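Two of the derived features above can be sketched in a few lines of Python. The hour boundaries for the time-of-day buckets are an assumption (the source names the four buckets but not their ranges), and the function names are hypothetical. The msisdn_hash is taken as an input because the source does not specify how it is produced.

```python
from datetime import datetime


def msisdn_bucket(msisdn_hash: str) -> str:
    """First 5 characters of the already-hashed MSISDN (never the raw number)."""
    return msisdn_hash[:5]


def time_of_day_bucket(request_time: datetime) -> str:
    """Map a request time to one of the four buckets named in the table.

    Boundaries below are illustrative assumptions, not taken from the source.
    """
    hour = request_time.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 22:
        return "evening"
    return "night"
```

For example, `time_of_day_bucket(datetime(2026, 4, 21, 9, 30))` falls in the assumed morning range.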

3.2 Session-intent features

  • Body SHA-256 (as cache key only; model sees body tokens only inside the sandboxed inference pod)
  • Body length, segment count
  • Detected script (fa, ps, ar, en, mixed)
  • Prior MT template class (OTP, marketing, alert) — from conversations.last_mt_message_id lookup

4. Online inference architecture

Fail-closed for ML: every inference has a 10 ms deadline (stop-keyword: 15 ms). When the deadline is exceeded, the router falls back to the static preference order and increments the metric chan_ml_budget_exceeded_total. The fallback never changes a BLOCK decision (compliance owns BLOCK), but it may degrade ladder-ordering quality.
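The deadline-and-fallback behaviour can be sketched with asyncio. This is an illustration under assumptions: the function name order_with_deadline, the infer callable, and the in-process Counter standing in for the chan_ml_budget_exceeded_total metric are all hypothetical; the real service may enforce the deadline differently.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable, List

# Stand-in for the chan_ml_budget_exceeded_total{model} counter.
budget_exceeded: Counter = Counter()


async def order_with_deadline(
    model: str,
    infer: Callable[[], Awaitable[List[str]]],
    static_order: List[str],
    deadline_s: float = 0.010,  # 10 ms; the stop-keyword model would use 0.015
) -> List[str]:
    """Return the ML ordering, or the static order if the deadline is missed."""
    try:
        return await asyncio.wait_for(infer(), timeout=deadline_s)
    except asyncio.TimeoutError:
        # Budget exhausted: count it and degrade to the static preference order.
        budget_exceeded[model] += 1
        return list(static_order)
```

The fallback only affects ordering. It takes the already-gated static ladder as input, so it cannot reintroduce a channel that compliance has blocked.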

Feature flag CHAN_ML_PREFERENCE_ORDERING_ENABLED (default true in prod; default false during shadow mode). When disabled, the ladder uses the static channel_preferences order from recipient_profiles.


5. Training

  • Training data: delivery_attempts + fallback_executions partitions (13 months hot).
  • Labels: binary success (delivered / delivered_read / ANSWERED with played OTP) vs failure.
  • Pipeline: Nightly job in np-ml (/ml-pipelines/channel-preference-v1.yml) using Kubeflow; model artefacts written to s3://ghasi-model-registry/channel/channel-preference-v1/{version}/.
  • Rollout: canary deploy 10% → 50% → 100% over 72 h; guardrails on notification_delivery_success_rate regression (must not drop > 0.5 pp).
  • Drift monitoring: PSI (Population Stability Index) on top-5 features; alert ChannelMlFeatureDrift when PSI > 0.2.
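For reference, PSI over pre-binned proportions is the sum over bins of (actual − expected) · ln(actual / expected). The sketch below assumes binning happens upstream and that both inputs are proportions summing to 1; the function name and the eps floor for empty bins are illustrative choices, not taken from the pipeline.

```python
import math
from typing import Sequence

PSI_ALERT_THRESHOLD = 0.2  # ChannelMlFeatureDrift fires above this value


def psi(expected: Sequence[float], actual: Sequence[float],
        eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins so the logarithm stays defined.
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


def drifted(expected: Sequence[float], actual: Sequence[float]) -> bool:
    return psi(expected, actual) > PSI_ALERT_THRESHOLD
```

Identical distributions give a PSI of 0; a shift from an even two-bin split to a 90/10 split gives roughly 0.88, well above the 0.2 alert threshold.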

6. Adaptive ladder ordering

The ML model returns a per-channel score p_success ∈ [0, 1]. The ordering algorithm:

  1. Start from the tenant policy's static ladder.
  2. Filter by consent + compliance gating (UC-01 steps 5–6).
  3. If CHAN_ML_PREFERENCE_ORDERING_ENABLED and discoveryState = STABLE:
    • Compute per-step expected-value score: p_success * weight(step) - cost(step) * costWeight.
    • Stable-sort the filtered ladder by descending score within a bounded reorder window of 2 positions (to respect tenant-declared order for compliance reasons).
  4. Truncate to cost-cap budget (UC-01 step 11).

Guardrail — the first step of a tenant policy with useCase = otp may never be reordered out of position 0 (OTP delivery has regulatory and UX implications that override ML preference).
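A minimal Python sketch of step 3 and the OTP guardrail follows. Several things here are assumptions rather than confirmed behaviour: the Step fields weight and cost, the cost_weight default, and the reading of "bounded reorder window of 2 positions" as "no step moves more than 2 positions from its index in the filtered ladder". Consent/compliance filtering and cost-cap truncation are assumed to happen outside these functions, and the guardrail is simplified to pinning the first step of the filtered ladder.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    channel: str
    p_success: float     # model output in [0, 1]
    weight: float = 1.0  # hypothetical weight(step)
    cost: float = 0.0    # hypothetical cost(step)


def score(step: Step, cost_weight: float) -> float:
    return step.p_success * step.weight - step.cost * cost_weight


def bounded_reorder(steps: List[Step], cost_weight: float = 1.0,
                    window: int = 2) -> List[Step]:
    """Greedy descending-score order; no step moves more than `window` places."""
    remaining = list(enumerate(steps))  # (original index, step)
    ordered: List[Step] = []
    for pos in range(len(steps)):
        # A step that would otherwise slip more than `window` places is forced.
        forced = [item for item in remaining if item[0] <= pos - window]
        if forced:
            pick = min(forced, key=lambda item: item[0])
        else:
            # Only steps at most `window` places ahead may be pulled forward.
            eligible = [item for item in remaining if item[0] <= pos + window]
            # Ties go to the lower original index, keeping the order stable.
            pick = max(eligible,
                       key=lambda item: (score(item[1], cost_weight), -item[0]))
        remaining.remove(pick)
        ordered.append(pick[1])
    return ordered


def order_ladder(filtered: List[Step], use_case: str, ml_enabled: bool,
                 discovery_state: str, cost_weight: float = 1.0) -> List[Step]:
    if not (ml_enabled and discovery_state == "STABLE"):
        return list(filtered)  # static tenant order
    if use_case == "otp" and filtered:
        # Guardrail: position 0 stays pinned for OTP policies.
        return [filtered[0]] + bounded_reorder(filtered[1:], cost_weight)
    return bounded_reorder(filtered, cost_weight)
```

With five steps scored 0.1, 0.5, 0.9, 0.95, 0.99 in tenant order, the window keeps the last step in place even though it scores highest, because moving it to the front would shift it four positions.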


7. Session-intent classification

Invoked on MO ingest in chan-mo-router after the tenant-webhook delivery attempt (so classification latency never impacts the MO delivery SLA):

  • Input: { bodyTokens, script, prior_mt_class, turn_count }
  • Output: { intent: STOP|QUESTION|COMPLAINT|CONFIRMATION|OUT_OF_OFFICE|UNKNOWN, confidence: [0..1] }
  • Result appended to the tenant webhook payload as an optional aiIntent field; tenants opt in per route.

STOP handling note. The ML STOP classifier is additive to the exact-match STOP keyword list: it may UPGRADE a message to STOP but NEVER downgrade a keyword-matched STOP. That is, the session is closed if either the exact-match list matches OR the ML classifier says STOP with confidence ≥ 0.85.
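The additive rule reduces to a logical OR in which only the ML branch is confidence-gated. A short sketch, with hypothetical function and parameter names:

```python
from typing import Optional

STOP_CONFIDENCE_THRESHOLD = 0.85


def is_stop(keyword_match: bool, ml_intent: Optional[str],
            ml_confidence: float) -> bool:
    """True when the session should be closed as a STOP.

    A keyword match always wins; the ML result can only add STOPs, never
    remove one. ml_intent is None when the classifier was skipped or timed out.
    """
    ml_says_stop = (ml_intent == "STOP"
                    and ml_confidence >= STOP_CONFIDENCE_THRESHOLD)
    return keyword_match or ml_says_stop
```

Because the keyword branch is evaluated independently, an ML timeout or an UNKNOWN intent leaves exact-match STOP handling unchanged.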


8. Privacy, PII, sovereignty

  • No external LLM calls on the hot path. A startup check reads CHAN_EXTERNAL_LLM_ENABLED; the pod refuses to start if set to true (same posture as sms-firewall-service).
  • Bodies leave the pod only into on-prem Triton over mTLS; Triton logs are PII-masked.
  • Feature hashing guarantees no raw MSISDN is written to the feature-store (s3://ghasi-feature-store/).
  • Model weights never cross the data-sovereignty boundary; training is on-prem only.
  • Audit: every inference call increments chan_ml_inference_total{model, cache_hit}; each stale-cache refresh is traced.
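The startup check in the first bullet could look roughly like the sketch below. The function name, the accepted spellings of "true", and the use of RuntimeError are assumptions; only the variable name CHAN_EXTERNAL_LLM_ENABLED and the refuse-to-start behaviour come from the text above.

```python
import os
from typing import Mapping


def assert_no_external_llm(env: Mapping[str, str] = os.environ) -> None:
    """Abort startup if external LLM calls have been switched on."""
    value = env.get("CHAN_EXTERNAL_LLM_ENABLED", "false").strip().lower()
    if value == "true":
        # Raising here is meant to stop the pod before it serves any traffic.
        raise RuntimeError(
            "CHAN_EXTERNAL_LLM_ENABLED=true is not permitted; refusing to start"
        )
```

Calling this once at the top of the service entry point, before any listeners are opened, makes a misconfiguration surface as a failed start instead of a quiet policy violation.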

9. Observability (ML-specific)

| Metric | Type | Purpose |
| --- | --- | --- |
| chan_ml_inference_duration_seconds | Histogram | Per-model latency |
| chan_ml_inference_total{model, cache_hit} | Counter | Call volume + cache effectiveness |
| chan_ml_budget_exceeded_total{model} | Counter | Fallbacks triggered |
| chan_ml_prediction_quality | Gauge | Rolling 24 h AUC vs offline eval |
| chan_ml_feature_drift_psi{feature} | Gauge | Drift per feature |
| chan_ml_fallback_rate | Gauge | Ratio of decisions using static fallback (target ≤ 5%) |

Alerts:

  • ChannelMlLatencyHigh: inference_duration P95 > 20 ms for 10 min.
  • ChannelMlFeatureDrift: PSI > 0.2 for top-5 features.
  • ChannelMlFallbackRateHigh: fallback_rate > 15% for 30 min.

10. Governance

  • Model cards maintained in docs/ml/model-cards/channel-preference-v1.md (data sources, bias analysis, performance tiers).
  • ML retrospective every quarter — Messaging Core + Platform ML review aggregate impact on delivery-success-rate.
  • Tenant opt-out: any tenant may set fallback_policies.ml_ordering_enabled = false to disable ML-assisted ordering for their traffic.

11. Future work (not in v1)

  • Multi-armed-bandit for active exploration on UNSEEN recipients.
  • Voice-call timing optimisation (best time-of-day predictor).
  • Template-language matching (learn recipient-preferred Pashto vs Dari from prior interactions).

All future work must preserve the on-prem, data-sovereign, fail-closed posture established in v1.