
AI Integration

:::info Source
Sourced from services/assignment-service/AI_INTEGRATION.md in the documentation repo.
:::

Companion: 03 ai-gateway-service · SERVICE_OVERVIEW (Slice S5)


1. Scope

Assignment-service never talks to an LLM directly. All AI calls are routed through ai-gateway-service, which provides routing, prompt-registry lookup, cost accounting, PII redaction, provider fallback, and traceability.

Two AI-augmented features:

| Feature | Slice | Trigger |
| --- | --- | --- |
| Suggested Assignments | S5 | POST /api/v1/assignments/suggest |
| Explain Compliance Gap | S5 | Query-time augmentation of compliance report |

Both features are human-in-the-loop — no action is taken without an admin reviewing and accepting the proposal.

2. Prompt Registry

Prompts live in @ghasi/prompt-registry, keyed by promptId:

| promptId | Purpose | Version at M4 |
| --- | --- | --- |
| assignment/suggest | Given tenant context → propose assignment(s) | 1.0.0 |
| assignment/explain_gap | Explain why an OU has N open windows | 1.0.0 |
| assignment/target_suggest | Given course → propose target groups | 0.2.0 (experimental) |

Every prompt ships:

  • System prompt (immutable per version)
  • Few-shot examples
  • Output JSON schema (Zod on our side, matching)
  • Min/max token limits and temperature caps
  • Permitted model tiers (typically standard or premium)
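A registry entry carrying the parts listed above might be modelled roughly as follows. This is only a sketch: the `PromptRegistryEntry` shape, its field names, and the `clampToCaps` helper are hypothetical and are not taken from the actual @ghasi/prompt-registry package. The output JSON schema is omitted for brevity.

```typescript
// Hypothetical shape of a registry entry; field names are assumptions,
// not the real @ghasi/prompt-registry types.
type ModelTier = "standard" | "premium";

interface PromptRegistryEntry {
  promptId: string;
  version: string;
  systemPrompt: string; // immutable per version
  fewShotExamples: Array<{ input: string; output: string }>;
  minTokens: number;
  maxTokens: number;
  maxTemperature: number;
  permittedTiers: ModelTier[];
}

interface InvocationParams {
  maxOutputTokens: number;
  temperature: number;
  tier: ModelTier;
}

// Clamp requested parameters to the caps declared by the prompt version
// and reject model tiers the prompt does not permit.
function clampToCaps(entry: PromptRegistryEntry, req: InvocationParams): InvocationParams {
  if (entry.permittedTiers.indexOf(req.tier) === -1) {
    throw new Error(`tier ${req.tier} not permitted for ${entry.promptId}`);
  }
  return {
    tier: req.tier,
    maxOutputTokens: Math.min(Math.max(req.maxOutputTokens, entry.minTokens), entry.maxTokens),
    temperature: Math.min(Math.max(req.temperature, 0), entry.maxTemperature),
  };
}
```

Whether caps are enforced by clamping or by rejecting the request is a design choice the source does not specify; clamping is used here purely for illustration.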

3. Feature: Suggested Assignments

3.1 Input

interface SuggestContext {
  tenantId: TenantId;
  orgUnitId?: OrgUnitId;
  roleId?: RoleId;
  complianceFramework?: 'JCI' | 'HIPAA' | 'GDPR' | 'ISO27001' | 'OSHA' | string;
  lookbackDays?: number; // default 365
  priorCompletions?: CompletionSnapshot[]; // small synopsis
  availableCourses: Array<{
    courseId: CourseId;
    title: I18nString;
    tags: string[];
    duration: ISODuration;
  }>;
}

3.2 Call path

admin → POST /assignments/suggest
→ AssignmentService.build_context_from_tenant_projection()
→ AI Gateway.invoke(promptId='assignment/suggest', input=ctx, model='auto')
→ Gateway applies: prompt caching, PII redaction, cost meter, fallback chain
→ returns { proposal, rationale } + AIProvenance
→ Assignment Service validates proposal via Zod schema
→ stores proposal under pending_ai_suggestion table (TTL 7d)
→ returns to admin UI

3.3 Output contract (strict)

interface SuggestOutput {
  proposal: CreateAssignmentCommand; // MUST pass full Zod validation
  rationale: I18nString;
  confidence: number; // 0..1
  alternatives?: Array<{ proposal: CreateAssignmentCommand; rationale: I18nString }>;
}

If the AI returns an invalid proposal, we return HTTP 502 (ai/bad-proposal) and log a regression sample.
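The validate-or-502 step could look roughly like the sketch below. It uses a hand-rolled check as a stand-in for the Zod schema so that the example has no dependencies; `validateSuggestOutput`, `SuggestOutputLike`, and the injected `isValidProposal` callback are hypothetical names, and `I18nString` is assumed here to be a locale-to-text map.

```typescript
// Stand-in for the Zod schema check; a real implementation would delegate
// to the schema. All names here are illustrative assumptions.
type I18nString = Record<string, string>; // assumed shape

interface SuggestOutputLike {
  proposal: unknown;
  rationale: I18nString;
  confidence: number;
}

type ValidationResult =
  | { ok: true; value: SuggestOutputLike }
  | { ok: false; status: 502; code: "ai/bad-proposal"; reason: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateSuggestOutput(
  raw: unknown,
  isValidProposal: (p: unknown) => boolean,
): ValidationResult {
  const fail = (reason: string): ValidationResult => ({
    ok: false,
    status: 502,
    code: "ai/bad-proposal",
    reason,
  });
  if (!isRecord(raw)) return fail("output is not an object");
  const proposal = raw["proposal"];
  const rationale = raw["rationale"];
  const confidence = raw["confidence"];
  if (!isValidProposal(proposal)) return fail("proposal failed schema validation");
  if (!isRecord(rationale)) return fail("rationale missing");
  // The negated range check also rejects NaN.
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) {
    return fail("confidence outside 0..1");
  }
  return { ok: true, value: { proposal, rationale: rationale as I18nString, confidence } };
}
```

Returning a result object rather than throwing keeps the failure `reason` available for the regression sample log.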

3.4 HITL flow

No auto-activation. The admin:

  1. Sees the proposal pre-filled in the authoring UI.
  2. Adjusts any field.
  3. Calls the standard POST /api/v1/assignments; we mark the resulting assignment aiSuggested=true and attach aiProvenance using the stored suggestion id.
  4. Activates separately.

The pending_ai_suggestion → assignment.ai_provenance link confirms the lineage.

4. Feature: Explain Compliance Gap

Query-time call, not saga-driven.

admin requests /compliance-report?explainGap=true
→ we produce structured compliance summary
→ AI Gateway.invoke(promptId='assignment/explain_gap', input=summary)
→ returns I18n explanation + suggested remediation actions
→ returned to admin alongside report

Outputs are informational only — never auto-create assignments.

5. AIProvenance Capture

Every AI-touched entity carries AIProvenance:

{
  model: 'claude-sonnet-4-20250514',
  version: 'gw-routed',
  promptId: 'assignment/suggest',
  promptVersion: '1.0.0',
  traceId: '01HXYZ…',
  decisionId: 'dec_…',
  local: false,
  generatedAt: '2026-04-15T…',
  reviewedBy: 'usr_admin_42',
  reviewedAt: '2026-04-15T…',
  cost: { microUSD: 18200, tokens: { in: 1320, out: 642 } }
}

reviewedBy and reviewedAt are populated when the admin activates the suggestion.
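As a sketch of how the review fields might be filled in, the type below mirrors the example object, and `markReviewed` is a hypothetical helper. Treating `reviewedBy` and `reviewedAt` as optional until activation is an assumption based on the sentence above, not a confirmed schema.

```typescript
// Mirrors the example object; optionality of the review fields is assumed.
interface AIProvenance {
  model: string;
  version: string;
  promptId: string;
  promptVersion: string;
  traceId: string;
  decisionId: string;
  local: boolean;
  generatedAt: string;
  reviewedBy?: string;
  reviewedAt?: string;
  cost: { microUSD: number; tokens: { in: number; out: number } };
}

// Hypothetical helper: returns a copy with the review fields populated,
// leaving the original record untouched.
function markReviewed(p: AIProvenance, adminId: string, now: Date): AIProvenance {
  return { ...p, reviewedBy: adminId, reviewedAt: now.toISOString() };
}
```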

6. Cost & Rate

  • Budget: ≤ $1.00 per successful suggest call at p95 (tracked by Gateway).
  • Rate limit: 100 suggest calls / day / tenant (tunable).
  • explain_gap cost: ≤ $0.05 p95 (smaller context).
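To make the units concrete, the sketch below converts the micro-USD figures used in AIProvenance to dollars and shows one possible shape for a per-tenant daily counter. The source does not say where or how the rate limit is enforced; the in-memory `DailyTenantLimiter` class and its UTC day bucketing are assumptions for illustration only.

```typescript
const MICRO_USD_PER_USD = 1000000;

// Convert a cost in micro-USD (as stored in AIProvenance.cost) to dollars.
function microUsdToUsd(microUSD: number): number {
  return microUSD / MICRO_USD_PER_USD;
}

// Hypothetical in-memory limiter: counts calls per tenant per UTC day.
class DailyTenantLimiter {
  private counts = new Map<string, number>();

  constructor(private readonly limitPerDay: number = 100) {}

  // Returns true and records the call if the tenant is under its limit.
  tryAcquire(tenantId: string, now: Date): boolean {
    const key = `${tenantId}:${now.toISOString().slice(0, 10)}`; // YYYY-MM-DD bucket
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limitPerDay) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

A multi-instance deployment would need a shared store rather than a process-local map.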

7. Safety & Governance

  • No PII in prompts. User IDs hashed to stable per-tenant tokens. Names/emails removed by Gateway redactor.
  • Bounded outputs. Max 10 proposed target groups, max 3 alternative proposals, max 2000 output tokens.
  • Tenant opt-out. If tenant policy has aiFeatures.suggestionsEnabled=false, the endpoint returns 403 ai/tenant-disabled.
  • Model allow-list. Per tenant, controlled by AI Gateway.
  • Prompt-injection defense. All tenant text concatenated into the prompt is wrapped in delimiter markers, and the Gateway's injection classifier runs before the call.
  • No chained autonomy. AI cannot call activate. Period.
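Two of the rules above, delimiter wrapping and bounded outputs, can be sketched as follows. The delimiter strings, `wrapTenantText`, and `withinBounds` are hypothetical; the actual markers and where the bounds are enforced are not specified in this document.

```typescript
// Hypothetical delimiter markers; the real ones are not documented here.
const TENANT_OPEN = "<<<TENANT_TEXT>>>";
const TENANT_CLOSE = "<<<END_TENANT_TEXT>>>";

// Wrap tenant-supplied text so the model can tell it apart from instructions.
// Delimiter look-alikes are stripped repeatedly, because removing one
// occurrence can splice the surrounding characters into a new one.
function wrapTenantText(text: string): string {
  let cleaned = text;
  while (cleaned.indexOf(TENANT_OPEN) !== -1 || cleaned.indexOf(TENANT_CLOSE) !== -1) {
    cleaned = cleaned.split(TENANT_OPEN).join("").split(TENANT_CLOSE).join("");
  }
  return `${TENANT_OPEN}\n${cleaned}\n${TENANT_CLOSE}`;
}

// Output bounds as stated in the list above.
const OUTPUT_LIMITS = { maxTargetGroups: 10, maxAlternatives: 3, maxOutputTokens: 2000 };

function withinBounds(o: { targetGroups: number; alternatives: number; outputTokens: number }): boolean {
  return (
    o.targetGroups <= OUTPUT_LIMITS.maxTargetGroups &&
    o.alternatives <= OUTPUT_LIMITS.maxAlternatives &&
    o.outputTokens <= OUTPUT_LIMITS.maxOutputTokens
  );
}
```

Wrapping complements the Gateway's injection classifier and does not replace it.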

8. On-Device AI (future)

Reserved for M6+: lightweight on-device models used for admin productivity hints when offline (e.g., "similar courses in your tenant"). Not in S5.

9. Evaluation

  • Golden-set prompt eval in CI: ≥ 30 labelled (context, expected proposal) pairs. Pass criterion: at least 90% of outputs must produce a proposal that validates AND scores ≥ 0.7 cosine similarity against the golden proposal on title, targets, and rrule.
  • Regression harness runs weekly in staging.
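The pass criterion can be expressed as a small check like the one below. It is a sketch under assumptions: how title, targets, and rrule are turned into vectors is not described here, so `cosine` simply takes two numeric vectors, and each case is assumed to carry a single aggregated similarity score. `EvalCase` and `goldenSetPasses` are illustrative names.

```typescript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must have the same non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // treat a zero vector as dissimilar
  return dot / Math.sqrt(normA * normB);
}

interface EvalCase {
  validates: boolean; // proposal passed schema validation
  similarity: number; // assumed: one aggregated score per case
}

// True when the golden set is large enough and enough cases pass.
function goldenSetPasses(cases: EvalCase[], minCases = 30, minRate = 0.9, minSim = 0.7): boolean {
  if (cases.length < minCases) return false;
  const passed = cases.filter((c) => c.validates && c.similarity >= minSim).length;
  return passed / cases.length >= minRate;
}
```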

10. Observability

Each AI call produces:

  • OTel trace with ai.gateway.call span
  • Metrics: assignment.ai.suggest.count, …latency, …cost_micro_usd, …validation_failure
  • Log: structured with traceId, promptId, promptVersion, model, status

11. Failure Modes

| Failure | Handling |
| --- | --- |
| Gateway down | Return 503; admin UI degrades to manual authoring |
| Model returns malformed JSON | Gateway already validates; on pass-through failure → 502 with support id |
| Cost budget exceeded | Gateway rejects with 402; we surface friendly error |
| Prompt-injection detected | Gateway returns 422; we log + alert |
| Tenant AI disabled | 403 at our layer before calling Gateway |
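A rough sketch of how the Gateway-side rows might translate into handler logic is shown below. The `GatewayOutcome` and `ClientError` types and the action labels are hypothetical. The status returned to the client for the 402 and 422 cases is not stated above, so passing the Gateway's status through is an assumption, as is treating any other failure as a 502.

```typescript
// Hypothetical representation of what came back from the Gateway.
type GatewayOutcome =
  | { kind: "unreachable" }
  | { kind: "status"; status: number };

interface ClientError {
  status: number;
  action: string; // illustrative label for the follow-up behaviour
}

function mapGatewayFailure(outcome: GatewayOutcome): ClientError {
  if (outcome.kind === "unreachable") {
    return { status: 503, action: "degrade-to-manual-authoring" };
  }
  switch (outcome.status) {
    case 402: // cost budget exceeded (pass-through status is assumed)
      return { status: 402, action: "show-friendly-budget-error" };
    case 422: // prompt injection detected (pass-through status is assumed)
      return { status: 422, action: "log-and-alert" };
    default: // malformed output or any other failure (assumed default)
      return { status: 502, action: "return-support-id" };
  }
}
```

The tenant-disabled case is absent on purpose, since that check happens before the Gateway is called.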