regulator-portal-service — AI Integration

Version: 1.0
Status: Draft
Owner: Regulator-facing + Legal
Last Updated: 2026-04-21
Companion: APPLICATION_LOGIC UC-05 IngestComplaint · SECURITY_MODEL · ../compliance-engine/AI_INTEGRATION.md

1. Scope

AI use in regulator-portal-service is deliberately narrow. There is exactly one use case at launch:

#	Use case	Status	Model	Provider
AI-01	Citizen-complaint triage classification	Opt-in, feature-flagged	On-prem classifier (fine-tuned BERT-class model)	Local cluster

Everything else is off limits: no AI is used on LI data, no cloud LLM is used on regulator-facing content, and no generative text is produced. The service is a regulatory system — determinism and auditability dominate.

2. AI-01 — Complaint Triage Classification

2.1 Purpose

When ATRA forwards a citizen complaint, the summary text is typically an unstructured narrative in Dari, Pashto, or English. A lightweight classifier pre-categorises the complaint into one of the ComplaintType values so that triage reviewers see the likely category up front.

The classifier never decides the final category — it only provides a hint with confidence. The triage reviewer (an admin on the admin-dashboard workbench) confirms or overrides.

2.2 Model and Hosting

Attribute	Value
Model class	Multilingual BERT-class (e.g. XLM-RoBERTa-base, ~270M params)
Fine-tuning	On a labelled corpus of ≥ 2,000 historical complaints + synthetic generation
Hosting	On-prem `complaint-triage-classifier` deployment (CPU-only, 2 replicas, 1 GB RAM each)
Inference latency	P95 ≤ 200 ms per call
Language support	English, Dari (fa-AF), Pashto (ps), Arabic (ar)
Interface	HTTP `POST /classify` returns `{ category, confidence, alternatives: [{category, confidence}], modelVersion }`

No cloud LLM is used for LI data. Complaint text is itself moderately sensitive (citizen narrative), so the on-prem classifier is the only permitted inference path. External LLM access is hard-disabled for this service: the start-up guard refuses to boot if AI_EXTERNAL_PROVIDER_ALLOWED is set.
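The start-up guard could look roughly like the sketch below. Only the variable name AI_EXTERNAL_PROVIDER_ALLOWED comes from this document; the function name, the error message, and the reading that any defined value (even "false" or an empty string) counts as "set" are assumptions made for illustration.

```typescript
// Hypothetical start-up guard. The environment is passed in as a plain map
// so the check can be unit-tested without touching the real process env.
type StartupEnv = Record<string, string | undefined>;

// Assumption: "is set" means "is defined at all", the strictest reading.
function assertNoExternalAiProvider(env: StartupEnv): void {
  if (env["AI_EXTERNAL_PROVIDER_ALLOWED"] !== undefined) {
    throw new Error(
      "Refusing to boot: AI_EXTERNAL_PROVIDER_ALLOWED must not be set for regulator-portal-service"
    );
  }
}
```

Calling such a check before any HTTP listener or consumer starts means a misconfigured deployment fails fast instead of running with an external provider enabled.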

2.3 Input / Output

Input to classifier:

{
  "summary": "Received 20 marketing messages from ACMEBANK in one day",
  "language": "en"
}

Output:

{
  "category": "UNSOLICITED_SMS",
  "confidence": 0.91,
  "alternatives": [
    { "category": "SENDER_ID_ABUSE", "confidence": 0.06 },
    { "category": "OTHER", "confidence": 0.03 }
  ],
  "modelVersion": "triage-xlmr-v1.2.0"
}

The top category and its confidence are stored in regulator.complaints.triage_ai_category and regulator.complaints.triage_ai_confidence.

2.4 Feature Flag

FEATURE_COMPLAINT_TRIAGE_AI (default false). When disabled, complaints arrive without a pre-classification and reviewers work from scratch.

Even when the flag is enabled globally, it can be disabled for individual regions to satisfy sovereignty constraints.
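A minimal sketch of how the flag might be resolved for one complaint. The flag name and its default of false come from this section; the "true" string convention, the disabledRegions list, and the region identifiers are hypothetical, since the document does not say how per-region overrides are configured.

```typescript
// Hypothetical flag resolution: the global flag must be on AND the
// complaint's region must not appear in the per-region disable list.
function isTriageAiEnabled(
  flags: Record<string, string | undefined>,
  region: string,
  disabledRegions: readonly string[]
): boolean {
  // Anything other than the literal string "true" keeps the default (false).
  const globallyOn = flags["FEATURE_COMPLAINT_TRIAGE_AI"] === "true";
  return globallyOn && disabledRegions.indexOf(region) === -1;
}
```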

3. PII Handling

Citizen MSISDN is NEVER sent to the classifier. The summary field is the only input.
The summary is sanitised before inference. Regex rules replace:
- Phone numbers (\+?\d[\d\s\-]{6,}\d → [PHONE])
- Monetary amounts (\b\d[\d,]*(\.\d+)?\s?(AFN|USD|EUR|afs)\b → [AMOUNT])
- 5+ digit sequences → [NUMERIC]
- Email addresses → [EMAIL]
No storage of inputs in the classifier. The classifier service logs only modelVersion, inference latency, and output confidence, never the input text.
Cap on exposure. The summary field is bounded to 4,000 characters by a DB constraint; the classifier truncates its input to 512 tokens (the model maximum) and discards the rest.
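The sanitisation rules above can be sketched as an ordered list of replacements. The phone and amount patterns are copied from this section, and the 5+ digit rule is written as \d{5,}. The email pattern is an assumption (the document does not give one), and so is applying the rules in the order they are listed.

```typescript
// Hypothetical sanitiser applying the four rules in the listed order.
interface SanitisationRule {
  name: string;
  pattern: RegExp;
  placeholder: string;
}

const SANITISATION_RULES: readonly SanitisationRule[] = [
  { name: "phone", pattern: /\+?\d[\d\s\-]{6,}\d/g, placeholder: "[PHONE]" },
  { name: "amount", pattern: /\b\d[\d,]*(\.\d+)?\s?(AFN|USD|EUR|afs)\b/g, placeholder: "[AMOUNT]" },
  { name: "numeric", pattern: /\d{5,}/g, placeholder: "[NUMERIC]" },
  // Assumed email pattern: deliberately loose, not an RFC-grade validator.
  { name: "email", pattern: /[^\s@]+@[^\s@]+\.[^\s@]+/g, placeholder: "[EMAIL]" },
];

function sanitiseSummary(summary: string): string {
  let out = summary;
  for (const rule of SANITISATION_RULES) {
    out = out.replace(rule.pattern, rule.placeholder);
  }
  return out;
}

// sanitiseSummary("Call me on +93 70 123 4567, I paid 1,500 AFN, ref 9876543, mail a.b@example.com")
// returns "Call me on [PHONE], I paid [AMOUNT], ref [NUMERIC], mail [EMAIL]"
```

Rule order matters: with this ordering, an amount written with eight or more consecutive digits is masked as [PHONE] rather than [AMOUNT], and digits inside an email address are masked as [NUMERIC] before the email rule runs. The digits are masked either way, but the placeholder label differs.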

4. Human-in-the-Loop (HITL) Flow

All classifications are advisory. The workflow:

Complaint ingested → classifier returns (category, confidence, alternatives).
Fields stored in DB; event regulator.complaint.received.v1 carries triageAiCategory + triageAiConfidence + modelVersion.
Triage reviewer sees the complaint in admin-dashboard with the suggestion pre-filled.
Reviewer confirms, overrides, or escalates.
Reviewer's final category is persisted to complaints.complaint_type (overrides AI if different). The original AI category remains in triage_ai_category for audit.
If reviewers frequently override the AI suggestion for a category (tracked as override_rate), Trust & Safety is notified and the overrides feed into model retraining.

Threshold guidance (advisory, not enforced):

Confidence ≥ 0.90: reviewer is expected to confirm in one click (low friction)
0.60 ≤ confidence < 0.90: reviewer checks the alternatives before deciding
Confidence < 0.60: reviewer treats the complaint as un-triaged
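The advisory thresholds can be expressed as a small banding function, for example to drive how the admin-dashboard presents the suggestion. The band names and the function name are hypothetical; the cut-offs 0.90 and 0.60 are the ones given above, with values between 0.89 and 0.90 assigned to the middle band.

```typescript
// Hypothetical UI hint derived from classifier confidence. Advisory only:
// the reviewer can always confirm, override, or escalate.
type ReviewBand = "CONFIRM_ONE_CLICK" | "REVIEW_ALTERNATIVES" | "TREAT_AS_UNTRIAGED";

function reviewBandFor(confidence: number): ReviewBand {
  if (confidence >= 0.9) return "CONFIRM_ONE_CLICK";
  if (confidence >= 0.6) return "REVIEW_ALTERNATIVES";
  return "TREAT_AS_UNTRIAGED";
}
```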

5. Moderation Policy

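The classifier emits only one of the declared ComplaintType values and never free text. Because nothing is generated, content-generation risk does not arise and injected instructions in the complaint text have no generative output to steer. The worst case is a wrong category suggestion, which the reviewer can override.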

A response parser rejects any output that does not match the expected schema; a parse failure is treated as a classifier failure and the complaint is flagged for manual triage.
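A sketch of such a strict parser follows. The field names follow the sample output in section 2.3 (modelVersion is required here because that sample includes it). The function name and the choice to return null on any failure are assumptions, and the allowed category list is passed in because this document names only three ComplaintType values (UNSOLICITED_SMS, SENDER_ID_ABUSE, OTHER), not the full set.

```typescript
interface TriageResult {
  category: string;
  confidence: number;
  alternatives: Array<{ category: string; confidence: number }>;
  modelVersion: string;
}

// NaN fails both comparisons, so it is rejected as well.
function isProbability(x: unknown): x is number {
  return typeof x === "number" && x >= 0 && x <= 1;
}

// Returns null on any schema violation; the caller treats null as a
// classifier failure and routes the complaint to manual triage.
function parseTriageResponse(raw: string, allowed: readonly string[]): TriageResult | null {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch (e) {
    return null;
  }
  if (typeof body !== "object" || body === null) return null;
  const obj = body as Record<string, unknown>;
  const isCategory = (x: unknown): x is string =>
    typeof x === "string" && allowed.indexOf(x) !== -1;

  const category = obj["category"];
  const confidence = obj["confidence"];
  const modelVersion = obj["modelVersion"];
  const rawAlternatives = obj["alternatives"];
  if (!isCategory(category) || !isProbability(confidence)) return null;
  if (typeof modelVersion !== "string" || !Array.isArray(rawAlternatives)) return null;

  const alternatives: TriageResult["alternatives"] = [];
  for (const item of rawAlternatives) {
    if (typeof item !== "object" || item === null) return null;
    const alt = item as Record<string, unknown>;
    const altCategory = alt["category"];
    const altConfidence = alt["confidence"];
    if (!isCategory(altCategory) || !isProbability(altConfidence)) return null;
    alternatives.push({ category: altCategory, confidence: altConfidence });
  }
  return { category, confidence, alternatives, modelVersion };
}
```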

6. Model Lifecycle

Task	Owner	Cadence
Training dataset curation	Trust & Safety + Legal	Monthly review
Re-training	ML Platform	Quarterly
Bias audit (category distribution by language, region, sender-type)	Trust & Safety	Quarterly
Precision/recall audit against held-out labelled set	Trust & Safety	Monthly
Model version upgrade	ML Platform	Quarterly; A/B with 10% shadow rollout before promotion

Bias/fairness safeguards:

Stratified evaluation set covers all ComplaintType values × language × region.
Reviewer override rate tracked per category; >25% override rate triggers retraining.
Explicit fairness probe set for potential sensitive categories (no category should correlate with a specific tenant or region beyond 3-sigma of prior).
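The per-category override check could be computed along these lines. The 25% threshold is the one stated above and is applied as strictly greater than; the record shape and function name are hypothetical, and a production version would probably also require a minimum number of reviewed complaints per category, which this document does not specify.

```typescript
// Hypothetical review record: what the classifier suggested versus what
// the reviewer finally persisted to complaints.complaint_type.
interface ReviewOutcome {
  aiCategory: string;
  finalCategory: string;
}

// Returns the AI categories whose override rate exceeds the threshold.
function categoriesNeedingRetraining(
  outcomes: readonly ReviewOutcome[],
  threshold: number = 0.25
): string[] {
  const totals: Record<string, { reviewed: number; overridden: number }> = {};
  for (const outcome of outcomes) {
    const entry = totals[outcome.aiCategory] || { reviewed: 0, overridden: 0 };
    entry.reviewed += 1;
    if (outcome.finalCategory !== outcome.aiCategory) entry.overridden += 1;
    totals[outcome.aiCategory] = entry;
  }
  return Object.keys(totals)
    .filter((cat) => totals[cat].overridden / totals[cat].reviewed > threshold)
    .sort();
}
```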

7. AIProvenance Touch Points

Each classification produces a provenance record. The record is held in memory and is not persisted per request; its key fields are summarised in the event and the DB row:

interface AIProvenance {
  modelVersion: string;          // triage-xlmr-v1.2.0
  inferredAt: string;            // RFC 3339
  inputSanitisationRules: string[];  // ['phone','amount','numeric','email']
  confidence: number;
  alternatives: Array<{ category: string; confidence: number }>;
  hostPod: string;
  latencyMs: number;
}

The regulator.complaint.received.v1 event carries triageAiCategory + triageAiConfidence + modelVersion; reviewers and auditors can trace a verdict back to a model version.

8. Failure Modes

Failure	Behaviour
Classifier service unreachable	Complaint ingested without AI fields; reviewer triages from scratch; alert `RegulatorTriageClassifierDown` fires
Classifier returns low confidence (< 0.60) on all categories	Stored with `triage_ai_category = 'OTHER'` (default), confidence value recorded
Classifier returns malformed response	Parse failure recorded; complaint ingested without AI fields
Classifier latency > 500 ms	Fire-and-forget — the ingest path does not block on AI; AI fields populated asynchronously within 60 s
Model drift detected	Trust & Safety pauses the feature flag until retrained
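The first three rows of the table can be sketched as a mapping from the classifier outcome to the two DB columns. The column names and the 'OTHER' default come from this document; the outcome type, the function name, and the use of null for "without AI fields" are assumptions.

```typescript
// Hypothetical outcome of one classifier call, as seen by the ingest path.
type ClassifierOutcome =
  | { kind: "ok"; category: string; confidence: number }
  | { kind: "unreachable" }
  | { kind: "malformed" };

interface TriageAiColumns {
  triage_ai_category: string | null;
  triage_ai_confidence: number | null;
}

const LOW_CONFIDENCE_CUTOFF = 0.6;

function toTriageAiColumns(outcome: ClassifierOutcome): TriageAiColumns {
  if (outcome.kind !== "ok") {
    // Unreachable or malformed: ingest proceeds without AI fields.
    return { triage_ai_category: null, triage_ai_confidence: null };
  }
  // If the top category is below the cut-off, every category is, so the
  // stored category falls back to OTHER while the confidence is kept.
  const category =
    outcome.confidence < LOW_CONFIDENCE_CUTOFF ? "OTHER" : outcome.category;
  return { triage_ai_category: category, triage_ai_confidence: outcome.confidence };
}
```

Alerting (RegulatorTriageClassifierDown), parse-failure recording, and the asynchronous population path are left out of this sketch.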

9. Regulator-Facing Disclosure

The monthly compliance summary report includes an "AI assistance" section disclosing:

That an on-prem triage classifier is used
Model version
Aggregate reviewer-override rate (transparency metric)

No AI model touches LI data. No AI model touches report generation (reports are deterministic aggregations of upstream data). This is stated explicitly in each report's methodology section — important for regulator trust.

10. Future Enhancements (not in scope at launch)

Enhancement	Status	Timeline
Multilingual summarisation of complaint clusters (spotting coordinated citizen complaints)	Backlog	2027
Auto-drafting of complaint resolution letters for reviewer approval	Backlog (HITL-mandatory)	2027+
Anomaly detection on complaint volume patterns	Backlog	2027

All future AI enhancements remain subject to the on-prem-only constraint and to HITL review. LI-adjacent AI is explicitly off the roadmap.
