AI Integration
:::info Source
Sourced from services/assignment-service/AI_INTEGRATION.md in the documentation repo.
:::
Companion: 03 ai-gateway-service · SERVICE_OVERVIEW (Slice S5)
1. Scope
Assignment-service never talks to an LLM directly. All AI calls are routed through ai-gateway-service, which provides routing, prompt-registry lookup, cost accounting, PII redaction, provider fallback, and traceability.
Two AI-augmented features:
| Feature | Slice | Trigger |
|---|---|---|
| Suggested Assignments | S5 | POST /api/v1/assignments/suggest |
| Explain Compliance Gap | S5 | Query-time augmentation of compliance report |
Both features are human-in-the-loop — no action is taken without an admin reviewing and accepting the proposal.
2. Prompt Registry
Prompts live in @ghasi/prompt-registry, keyed by promptId:
| promptId | Purpose | Version at M4 |
|---|---|---|
| assignment/suggest | Given tenant context → propose assignment(s) | 1.0.0 |
| assignment/explain_gap | Explain why an OU has N open windows | 1.0.0 |
| assignment/target_suggest | Given course → propose target groups | 0.2.0 (experimental) |
Every prompt ships:
- System prompt (immutable per version)
- Few-shot examples
- Output JSON schema (mirrored by a matching Zod schema on our side)
- Min / max token limits and temperature caps
- Permitted model tiers (typically standard or premium)
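To make the list above concrete, here is a minimal sketch of what a registry entry and a cap check could look like. The interface, its field names, and the `checkInvocation` helper are assumptions for illustration, not the actual @ghasi/prompt-registry API; the temperature cap is a placeholder value.

```typescript
// Hypothetical registry entry shape; all field names are illustrative assumptions.
interface PromptRegistryEntry {
  promptId: string;
  version: string; // the system prompt is immutable per version
  systemPrompt: string;
  fewShotExamples: Array<{ input: string; output: string }>;
  outputSchemaId: string; // reference to the output JSON schema
  minTokens: number;
  maxTokens: number;
  maxTemperature: number;
  permittedTiers: string[];
}

const suggestEntry: PromptRegistryEntry = {
  promptId: "assignment/suggest",
  version: "1.0.0",
  systemPrompt: "(system prompt text)",
  fewShotExamples: [],
  outputSchemaId: "SuggestOutput",
  minTokens: 1,
  maxTokens: 2000,
  maxTemperature: 0.3, // placeholder cap, not a documented value
  permittedTiers: ["standard", "premium"],
};

// Returns the list of cap violations for a requested invocation (empty means allowed).
function checkInvocation(
  entry: PromptRegistryEntry,
  tier: string,
  temperature: number,
  maxTokens: number
): string[] {
  const problems: string[] = [];
  if (entry.permittedTiers.indexOf(tier) === -1) problems.push("tier not permitted: " + tier);
  if (temperature > entry.maxTemperature) problems.push("temperature above cap");
  if (maxTokens < entry.minTokens || maxTokens > entry.maxTokens) {
    problems.push("maxTokens outside [min, max]");
  }
  return problems;
}
```

Keeping the caps on the entry itself means a caller cannot widen them per request; it can only ask for something inside the envelope the prompt version allows.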
3. Feature: Suggested Assignments
3.1 Input
interface SuggestContext {
  tenantId: TenantId;
  orgUnitId?: OrgUnitId;
  roleId?: RoleId;
  complianceFramework?: 'JCI'|'HIPAA'|'GDPR'|'ISO27001'|'OSHA'|string;
  lookbackDays?: number; // default 365
  priorCompletions?: CompletionSnapshot[]; // small synopsis
  availableCourses: Array<{
    courseId: CourseId;
    title: I18nString;
    tags: string[];
    duration: ISODuration;
  }>;
}
3.2 Call path
admin → POST /assignments/suggest
→ AssignmentService.build_context_from_tenant_projection()
→ AI Gateway.invoke(promptId='assignment/suggest', input=ctx, model='auto')
→ Gateway applies: prompt caching, PII redaction, cost meter, fallback chain
→ returns { proposal, rationale } + AIProvenance
→ Assignment Service validates proposal via Zod schema
→ stores proposal under pending_ai_suggestion table (TTL 7d)
→ returns to admin UI
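The call path above can be sketched as a single orchestration function with injected collaborators. This is a rough illustration under stated assumptions: `SuggestDeps`, `handleSuggest`, and the row shape passed to `storePending` are hypothetical names, and the real Gateway client and persistence layer are not shown.

```typescript
interface GatewayResult {
  output: unknown;
  provenance: { [key: string]: unknown };
}

// Injected collaborators; every name below is hypothetical.
interface SuggestDeps {
  buildContext(tenantId: string): Promise<{ [key: string]: unknown }>;
  invokeGateway(promptId: string, input: unknown): Promise<GatewayResult>;
  isValidOutput(output: unknown): boolean;
  storePending(row: {
    tenantId: string;
    output: unknown;
    provenance: { [key: string]: unknown };
    expiresAt: Date;
  }): Promise<string>;
}

type SuggestResponse =
  | { status: 200; suggestionId: string; output: unknown; provenance: { [key: string]: unknown } }
  | { status: 502; code: "ai/bad-proposal" };

const SUGGESTION_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

async function handleSuggest(
  tenantId: string,
  deps: SuggestDeps,
  now: Date
): Promise<SuggestResponse> {
  // 1. Build the context from the tenant projection.
  const ctx = await deps.buildContext(tenantId);
  // 2. Route the call through the Gateway; the service never calls an LLM itself.
  const result = await deps.invokeGateway("assignment/suggest", ctx);
  // 3. Reject anything that fails schema validation.
  if (!deps.isValidOutput(result.output)) {
    return { status: 502, code: "ai/bad-proposal" };
  }
  // 4. Persist as a pending suggestion with a 7-day expiry, then return it.
  const suggestionId = await deps.storePending({
    tenantId,
    output: result.output,
    provenance: result.provenance,
    expiresAt: new Date(now.getTime() + SUGGESTION_TTL_MS),
  });
  return { status: 200, suggestionId, output: result.output, provenance: result.provenance };
}
```

Passing `now` explicitly keeps the TTL computation deterministic in tests.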
3.3 Output contract (strict)
interface SuggestOutput {
  proposal: CreateAssignmentCommand; // MUST pass full Zod validation
  rationale: I18nString;
  confidence: number; // 0..1
  alternatives?: Array<{ proposal: CreateAssignmentCommand; rationale: I18nString }>;
}
If the AI returns a proposal that fails validation, we return HTTP 502 ai/bad-proposal and log a regression sample.
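As an illustration of the strict contract, the sketch below hand-rolls the structural checks and maps any failure to the 502 ai/bad-proposal result. The service itself uses a Zod schema for this; the dependency-free version here is a stand-in, `validateSuggestOutput` is a hypothetical name, and the proposal check is injected because CreateAssignmentCommand is not reproduced in this document. The limit of 3 alternatives comes from the bounded-outputs rule in section 7.

```typescript
type ValidationResult =
  | { ok: true; value: { [key: string]: unknown } }
  | { ok: false; status: 502; code: "ai/bad-proposal"; reason: string };

function isRecord(v: unknown): v is { [key: string]: unknown } {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateSuggestOutput(
  raw: unknown,
  isValidProposal: (p: unknown) => boolean // stand-in for the CreateAssignmentCommand schema
): ValidationResult {
  const fail = (reason: string): ValidationResult => ({
    ok: false,
    status: 502,
    code: "ai/bad-proposal",
    reason,
  });
  if (!isRecord(raw)) return fail("output is not an object");
  if (!isValidProposal(raw["proposal"])) return fail("proposal failed validation");
  const confidence = raw["confidence"];
  // The negated range check also rejects NaN.
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) {
    return fail("confidence must be a number in 0..1");
  }
  const alternatives = raw["alternatives"];
  if (alternatives !== undefined) {
    if (!Array.isArray(alternatives) || alternatives.length > 3) {
      return fail("alternatives must be an array of at most 3 items");
    }
    for (const alt of alternatives) {
      if (!isRecord(alt) || !isValidProposal(alt["proposal"])) {
        return fail("alternative proposal failed validation");
      }
    }
  }
  return { ok: true, value: raw };
}
```

Returning a tagged result rather than throwing lets the caller log the rejected payload as a regression sample before responding.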
3.4 HITL flow
No auto-activation. The admin:
- Sees the proposal pre-filled in the authoring UI.
- Adjusts any field.
- Calls the standard POST /api/v1/assignments (which we mark aiSuggested=true and attach aiProvenance using the stored suggestion id).
- Activates separately.
The pending_ai_suggestion → assignment.ai_provenance link records the lineage from the stored suggestion to the created assignment.
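A minimal sketch of how the create step might attach lineage, assuming the request carries an optional suggestion id. `buildDraft`, the `lookup` callback, and the `"draft"` status are hypothetical; the only behaviour taken from the text is that creation marks aiSuggested and attaches provenance, while activation remains a separate step.

```typescript
interface PendingSuggestion {
  id: string;
  provenance: { [key: string]: unknown };
  expiresAt: Date;
}

interface CreateRequest {
  fields: { [key: string]: unknown }; // admin-edited assignment fields
  suggestionId?: string;
}

interface AssignmentDraft {
  fields: { [key: string]: unknown };
  aiSuggested: boolean;
  aiProvenance?: { [key: string]: unknown };
  status: "draft"; // assumption: creation never activates
}

function buildDraft(
  req: CreateRequest,
  lookup: (id: string) => PendingSuggestion | undefined,
  now: Date
): AssignmentDraft {
  if (req.suggestionId === undefined) {
    // Plain manual authoring: no AI lineage to record.
    return { fields: req.fields, aiSuggested: false, status: "draft" };
  }
  const suggestion = lookup(req.suggestionId);
  if (suggestion === undefined || suggestion.expiresAt.getTime() <= now.getTime()) {
    throw new Error("suggestion not found or expired");
  }
  return {
    fields: req.fields,
    aiSuggested: true,
    aiProvenance: suggestion.provenance,
    status: "draft",
  };
}
```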
4. Feature: Explain Compliance Gap
Query-time call, not saga-driven.
admin requests /compliance-report?explainGap=true
→ we produce structured compliance summary
→ AI Gateway.invoke(promptId='assignment/explain_gap', input=summary)
→ returns I18n explanation + suggested remediation actions
→ returned to admin alongside report
Outputs are informational only — never auto-create assignments.
5. AIProvenance Capture
Every AI-touched entity carries AIProvenance:
{
  model: 'claude-sonnet-4-20250514',
  version: 'gw-routed',
  promptId: 'assignment/suggest',
  promptVersion: '1.0.0',
  traceId: '01HXYZ…',
  decisionId: 'dec_…',
  local: false,
  generatedAt: '2026-04-15T…',
  reviewedBy: 'usr_admin_42',
  reviewedAt: '2026-04-15T…',
  cost: { microUSD: 18200, tokens: { in: 1320, out: 642 } }
}
reviewedBy / reviewedAt are populated when the admin activates the suggestion.
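The record above could be typed roughly as follows. The field names are taken from the example; treating reviewedBy / reviewedAt as optional until activation, and the `markReviewed` helper itself, are assumptions.

```typescript
interface AIProvenance {
  model: string;
  version: string;
  promptId: string;
  promptVersion: string;
  traceId: string;
  decisionId: string;
  local: boolean;
  generatedAt: string; // ISO 8601 timestamp
  reviewedBy?: string; // absent until an admin activates
  reviewedAt?: string;
  cost: { microUSD: number; tokens: { in: number; out: number } };
}

// Returns a copy with the review fields set; the input is left untouched.
function markReviewed(p: AIProvenance, adminId: string, at: Date): AIProvenance {
  if (p.reviewedBy !== undefined) {
    throw new Error("provenance already carries a review");
  }
  return { ...p, reviewedBy: adminId, reviewedAt: at.toISOString() };
}
```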
6. Cost & Rate
- Budget: ≤ $1.00 per successful suggest call at p95 (tracked by Gateway).
- Rate limit: 100 suggest calls / day / tenant (tunable).
- explain_gap cost: ≤ $0.05 p95 (smaller context).
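The per-tenant daily limit can be illustrated with a small in-memory counter. This is a sketch only: `DailyTenantLimiter` is a hypothetical name, "day" is assumed to mean a UTC calendar day, and a real deployment would need a shared store and eviction of old keys.

```typescript
class DailyTenantLimiter {
  private readonly limitPerDay: number;
  private readonly counts = new Map<string, number>();

  constructor(limitPerDay: number = 100) {
    this.limitPerDay = limitPerDay;
  }

  // Returns true and counts the call if the tenant is still under its daily limit.
  tryAcquire(tenantId: string, now: Date): boolean {
    const day = now.toISOString().slice(0, 10); // UTC calendar day, e.g. "2026-04-15"
    const key = tenantId + ":" + day;
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limitPerDay) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

Keying on tenant plus day means the counter resets implicitly at midnight UTC without a scheduled job.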
7. Safety & Governance
- No PII in prompts. User IDs hashed to stable per-tenant tokens. Names/emails removed by Gateway redactor.
- Bounded outputs. Max 10 proposed target groups, max 3 alternative proposals, max 2000 output tokens.
- Tenant opt-out. If tenant policy has aiFeatures.suggestionsEnabled=false, the endpoint returns 403 ai/tenant-disabled.
- Model allow-list. Per tenant, controlled by AI Gateway.
- Prompt-injection defense. All tenant text concatenated into the prompt is wrapped in delimiter markers, and Gateway's injection classifier is run pre-call.
- No chained autonomy. AI cannot call activate. Period.
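As one illustration of the delimiter-marker defense, the sketch below wraps tenant text and neutralises anything inside it that could imitate a marker. The marker strings and `wrapTenantText` are invented for this example; the actual markers and the Gateway's classifier are not described here.

```typescript
// Hypothetical marker strings; the real delimiters are not specified in this document.
const OPEN_MARK = "<<<TENANT_TEXT>>>";
const CLOSE_MARK = "<<<END_TENANT_TEXT>>>";

function wrapTenantText(text: string): string {
  // Replace every run of three or more angle brackets so tenant text
  // cannot reproduce a marker and close the block early.
  const neutralised = text.replace(/<{3,}|>{3,}/g, "[removed]");
  return OPEN_MARK + "\n" + neutralised + "\n" + CLOSE_MARK;
}
```

Replacing with a fixed token, rather than deleting, avoids the case where removing one run joins two shorter runs into a new marker-like sequence. Delimiters alone do not stop injection; they complement the pre-call classifier.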
8. On-Device AI (future)
Reserved for M6+: lightweight on-device models used for admin productivity hints when offline (e.g., "similar courses in your tenant"). Not in S5.
9. Evaluation
- Golden-set prompt eval in CI: ≥ 30 labelled (context, expected proposal) pairs. Pass criterion: 90% of outputs must produce a proposal that validates AND scores ≥ 0.7 cosine similarity against golden on title, targets, rrule.
- Regression harness runs weekly in staging.
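The pass criterion can be sketched as below. How title, targets, and rrule are turned into vectors is not stated, so this example assumes they are serialised to one string per proposal and compared as bag-of-words term counts; an embedding-based comparison would slot into the same structure. All function names are hypothetical.

```typescript
// Lower-cased word counts for a piece of text.
function termCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const term of text.toLowerCase().split(/[^a-z0-9]+/)) {
    if (term.length > 0) counts.set(term, (counts.get(term) ?? 0) + 1);
  }
  return counts;
}

// Cosine similarity between the term-count vectors of two strings.
function cosineSimilarity(a: string, b: string): number {
  const ca = termCounts(a);
  const cb = termCounts(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  ca.forEach((count, term) => {
    normA += count * count;
    dot += count * (cb.get(term) ?? 0);
  });
  cb.forEach((count) => {
    normB += count * count;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface EvalCase {
  validates: boolean; // did the output pass schema validation?
  output: string;     // serialised title + targets + rrule of the AI proposal
  golden: string;     // same serialisation of the expected proposal
}

// Fraction of cases that both validate and meet the similarity threshold.
function passRate(cases: EvalCase[], minSimilarity: number = 0.7): number {
  if (cases.length === 0) return 0;
  const passed = cases.filter(
    (c) => c.validates && cosineSimilarity(c.output, c.golden) >= minSimilarity
  ).length;
  return passed / cases.length;
}

function suitePasses(cases: EvalCase[], requiredRate: number = 0.9): boolean {
  return passRate(cases) >= requiredRate;
}
```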
10. Observability
Each AI call produces:
- OTel trace with ai.gateway.call span
- Metrics: assignment.ai.suggest.count, …latency, …cost_micro_usd, …validation_failure
- Log: structured with traceId, promptId, promptVersion, model, status
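A structured log record with the fields listed above might be built like this. The `span` key, the set of status values, and `toLogLine` are assumptions; only the five field names come from the list.

```typescript
interface AiCallLog {
  traceId: string;
  promptId: string;
  promptVersion: string;
  model: string;
  status: "ok" | "validation_failure" | "gateway_error"; // illustrative values
}

// Serialises one AI call as a single JSON log line.
function toLogLine(entry: AiCallLog): string {
  return JSON.stringify({
    span: "ai.gateway.call",
    traceId: entry.traceId,
    promptId: entry.promptId,
    promptVersion: entry.promptVersion,
    model: entry.model,
    status: entry.status,
  });
}
```

Logging the same traceId that the OTel span carries is what lets a log line be joined back to its trace.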
11. Failure Modes
| Failure | Handling |
|---|---|
| Gateway down | Return 503; admin UI degrades to manual authoring |
| Model returns malformed JSON | Gateway already validates; on pass-through failure → 502 with support id |
| Cost budget exceeded | Gateway rejects with 402; we surface friendly error |
| Prompt-injection detected | Gateway returns 422; we log + alert |
| Tenant AI disabled | 403 at our layer before calling Gateway |
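The table can also be expressed as a lookup, which keeps handling decisions in one place. The type names, the `raisedBy` field, and the key spellings are hypothetical; the status codes and actions mirror the rows above.

```typescript
type FailureKind =
  | "gateway-down"
  | "malformed-json"
  | "budget-exceeded"
  | "injection-detected"
  | "tenant-disabled";

interface Handling {
  status: number; // HTTP status named in the table row
  raisedBy: "gateway" | "assignment-service";
  action: string;
}

const FAILURE_HANDLING: { [K in FailureKind]: Handling } = {
  "gateway-down": { status: 503, raisedBy: "assignment-service", action: "degrade admin UI to manual authoring" },
  "malformed-json": { status: 502, raisedBy: "assignment-service", action: "respond with support id" },
  "budget-exceeded": { status: 402, raisedBy: "gateway", action: "surface friendly error" },
  "injection-detected": { status: 422, raisedBy: "gateway", action: "log and alert" },
  "tenant-disabled": { status: 403, raisedBy: "assignment-service", action: "reject before calling Gateway" },
};

function handlingFor(kind: FailureKind): Handling {
  return FAILURE_HANDLING[kind];
}
```

Because the mapped type requires every `FailureKind` key, adding a new failure kind without a handling entry fails at compile time.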