iam-service — AI Integration

Catalog summary: docs/03-microservices/iam-service.md · 02 §13 AI Architecture · SECURITY_MODEL · APPLICATION_LOGIC

iam-service itself is deterministic: login, MFA, and signing involve no LLM or model in the hot path of credential verification. AI is used adjacent to identity, for risk scoring and anomaly detection. All model calls go through the platform ai-orchestrator-service; no model SDK is bundled in iam-service.

1. Use Cases

| # | Use case | Trigger | Decision lever | Model class |
|---|----------|---------|----------------|-------------|
| 1 | Adaptive MFA (login risk) | Every successful password verify | Server forces MFA challenge if risk score ≥ 0.6 | Tabular boosted-tree (Vertex AI tabular) |
| 2 | Credential-stuffing burst detection | Aggregated failed logins in 5-min window | Auto-lock IP/CIDR; raise WAF rule | Streaming anomaly detector |
| 3 | Impossible-travel detection | New session geo + speed | Force step-up; alert tenant_admin | Geo heuristic + classifier |
| 4 | Stolen-device pattern | Refresh from new IP + new UA + same device key | Trigger family revoke + re-auth | Tabular classifier |
| 5 | Magic-link abuse | High volume of magic-link requests for same email | Rate-limit + temporary block | Heuristic with classifier fallback |
| 6 | Account-recovery social engineering | Reset request soon after profile changes | Add HITL gate (tenant_admin must approve) | Pattern detector |
| 7 | Edge anomaly (offline desktop) | On reconnect | Flag suspicious offline activity for review | ONNX Runtime Node (edge) |

Use cases 1–6 run server-side via ai-orchestrator-service. Use case 7 runs locally on the Electron desktop using ONNX Runtime Node.
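For use case 3, the geo-heuristic half of the check can be approximated by the speed implied between two consecutive session locations. The sketch below is illustrative only: the `GeoPoint` shape, the function names, and the 900 km/h cutoff are assumptions, not values defined by the platform, and the classifier stage is not shown.

```typescript
// Hypothetical types and threshold; none of these names come from iam-service.
interface GeoPoint {
  lat: number;  // degrees
  lon: number;  // degrees
  atMs: number; // epoch milliseconds of the session event
}

const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two points via the haversine formula.
function haversineKm(a: GeoPoint, b: GeoPoint): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// True when the implied speed between the two sessions exceeds maxKmh.
// The 900 km/h default (roughly airliner speed) is an assumed cutoff.
function isImpossibleTravel(prev: GeoPoint, next: GeoPoint, maxKmh = 900): boolean {
  const km = haversineKm(prev, next);
  const hours = Math.abs(next.atMs - prev.atMs) / 3_600_000;
  if (hours === 0) return km > 0; // simultaneous sessions in different places
  return km / hours > maxKmh;
}
```

GeoIP resolution is coarse, so a production heuristic would likely add a minimum-distance floor before flagging; that refinement is left out here.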

2. Flow — Adaptive MFA

iam-service                       ai-orchestrator-service           Vertex AI
    │                                        │                          │
    │ buildLoginContext()                    │                          │
    │  { tenantId, userId, ipMasked, ua,     │                          │
    │    deviceId, recentFailures, geo,      │                          │
    │    knownGoodIPs[], lastLoginAt }       │                          │
    │                                        │                          │
    ├── POST /v1/risk/login.classify ───────►│                          │
    │                                        ├── enrich (HIBP, geoip)   │
    │                                        ├── invoke model ─────────►│
    │                                        │◄── score, reasons[]      │
    │◄── { score: 0.78,                      │                          │
    │      reasons:["new_device",            │                          │
    │               "atypical_geo"],         │                          │
    │      modelVersion:"login-risk-1.4",    │                          │
    │      provenance:{...},                 │                          │
    │      hitlSuggested:false }             │                          │
    │                                        │
    │ if score ≥ tenantThreshold → mfa_required

tenantThreshold is provisioned by tenant-service per tenant policy. Default 0.6.
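The final comparison in the flow can be sketched as below. Only the ≥ comparison and the 0.6 default come from the text; the `decideLogin` name, the `LoginDecision` type, and the choice to throw on a non-finite score are assumptions of this sketch.

```typescript
type LoginDecision = "mfa_required" | "allow";

// Default per the text above; tenant-service may provision a different value.
const DEFAULT_TENANT_THRESHOLD = 0.6;

function decideLogin(
  score: number,
  tenantThreshold: number = DEFAULT_TENANT_THRESHOLD,
): LoginDecision {
  if (!Number.isFinite(score)) {
    // Assumption: the caller treats a malformed score like any other AI failure.
    throw new RangeError("risk score must be a finite number");
  }
  // The boundary is inclusive: a score equal to the threshold forces MFA.
  return score >= tenantThreshold ? "mfa_required" : "allow";
}
```

With the default threshold, the 0.78 score from the diagram yields `mfa_required`; a tenant provisioned with a threshold of 0.8 would let the same login through.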

3. Provenance Metadata

Every AI-influenced decision is attached to the corresponding domain event under payload.aiProvenance:

interface AIProvenance {
  decision: 'mfa_required' | 'allow' | 'lock' | 'review';
  modelId: string;               // 'login-risk'
  modelVersion: string;          // semver, e.g., '1.4.0'
  modelHash: string;             // sha256
  promptId?: string;             // n/a for tabular
  inputDigest: string;           // sha256 of normalised features
  score: number;
  reasons: string[];
  classifiedAt: string;          // RFC 3339
  orchestratorRequestId: string;
  hitlGate: 'none' | 'required' | 'optional';
}

Stored in:

  • melmastoon.iam.user.login_succeeded.v1 payload (when MFA was forced)
  • melmastoon.iam.user.locked.v1 payload (when AI caused the lock)
  • iam.audit_events.metadata.aiProvenance
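A minimal sketch of attaching provenance under payload.aiProvenance without mutating the original event. The `DomainEvent` envelope, the `withProvenance` helper, and every sample value (hashes, request id, user id) are hypothetical placeholders; only the `AIProvenance` shape and the event name are taken from this document.

```typescript
interface AIProvenance {
  decision: "mfa_required" | "allow" | "lock" | "review";
  modelId: string;
  modelVersion: string;
  modelHash: string;
  promptId?: string;
  inputDigest: string;
  score: number;
  reasons: string[];
  classifiedAt: string;
  orchestratorRequestId: string;
  hitlGate: "none" | "required" | "optional";
}

// Hypothetical event envelope; the real envelope is not described here.
interface DomainEvent<P> {
  type: string;
  payload: P;
}

// Returns a copy of the event with provenance nested under payload.aiProvenance.
function withProvenance<P extends object>(
  event: DomainEvent<P>,
  aiProvenance: AIProvenance,
): DomainEvent<P & { aiProvenance: AIProvenance }> {
  return { ...event, payload: { ...event.payload, aiProvenance } };
}

// Placeholder values for illustration only.
const provenance: AIProvenance = {
  decision: "mfa_required",
  modelId: "login-risk",
  modelVersion: "1.4.0",
  modelHash: "<sha256-of-model-artefact>",
  inputDigest: "<sha256-of-normalised-features>",
  score: 0.78,
  reasons: ["new_device", "atypical_geo"],
  classifiedAt: "2024-01-01T00:00:00Z",
  orchestratorRequestId: "<orchestrator-request-id>",
  hitlGate: "none",
};

const loginEvent = {
  type: "melmastoon.iam.user.login_succeeded.v1",
  payload: { userId: "<opaque-user-id>" },
};

const enriched = withProvenance(loginEvent, provenance);
```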

4. HITL (Human-In-The-Loop) Gates

A subset of AI suggestions is never auto-applied:

| Suggestion | Auto? | HITL escalation |
|------------|-------|-----------------|
| Force MFA challenge | yes | n/a (challenge is reversible by user) |
| Lock account due to credential stuffing | yes | n/a (user can recover via reset) |
| Lock account due to insider/social-engineering pattern | no | Suggestion routed to tenant_admin via notification-service; manual confirm required. |
| Issue offline binding cert when risk > 0.4 | no | Tenant_admin must explicitly approve. |
| Auto-revoke API key | no | Suggestion only; tenant_admin must revoke. |

HITL queues live in workflow-orchestration-service. iam-service emits melmastoon.iam.suggestion.created.v1 (security retention) and waits for melmastoon.workflow.iam_suggestion.approved.v1 / …rejected.v1.
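The gating rules can be read as a static lookup from suggestion kind to "apply now" versus "emit a suggestion and wait". A sketch under that reading; the `SuggestionKind` identifiers and the `routeSuggestion` helper are invented for illustration, while the yes/no values mirror the table in this section.

```typescript
// Hypothetical identifiers, one per row of the HITL table.
type SuggestionKind =
  | "force_mfa_challenge"
  | "lock_credential_stuffing"
  | "lock_insider_pattern"
  | "issue_offline_binding_cert_risk_above_0_4"
  | "auto_revoke_api_key";

const AUTO_APPLY: Record<SuggestionKind, boolean> = {
  force_mfa_challenge: true,
  lock_credential_stuffing: true,
  lock_insider_pattern: false,
  issue_offline_binding_cert_risk_above_0_4: false,
  auto_revoke_api_key: false,
};

type Route =
  | { action: "apply"; hitlGate: "none" }
  | { action: "emit_suggestion"; hitlGate: "required"; eventType: string };

function routeSuggestion(kind: SuggestionKind): Route {
  if (AUTO_APPLY[kind]) {
    return { action: "apply", hitlGate: "none" };
  }
  // Gated suggestions are only announced; the approval or rejection arrives
  // later as a workflow event and is handled elsewhere.
  return {
    action: "emit_suggestion",
    hitlGate: "required",
    eventType: "melmastoon.iam.suggestion.created.v1",
  };
}
```

Typing the table as a `Record<SuggestionKind, boolean>` means a new suggestion kind fails to compile until someone decides explicitly whether it may be auto-applied.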

5. Edge AI (Electron desktop)

ONNX Runtime Node loads models/login-risk-edge-v1.onnx (≤ 5 MB). Used while offline to:

  • Flag improbable local activity (e.g., 5 staff logins from 3 PCs in 10 s).
  • Compute a local risk score for buffered audit events.

The local score is advisory only; the server's score on reconnect is authoritative. Edge model artefacts are signed and pinned by version; the desktop will refuse to load an unsigned model. See 02 §13.4.
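To make the "improbable local activity" example concrete, the sketch below hand-codes the quoted pattern (5 logins from 3 PCs within 10 s) as a sliding-window check. It is a stand-in for the kind of signal involved, not a description of how the ONNX model works; the `LocalLogin` shape and the function name are assumptions, and "5 staff logins" is read here as five login events.

```typescript
// Hypothetical shape of a buffered offline login event.
interface LocalLogin {
  staffId: string;
  pcId: string;
  atMs: number; // epoch milliseconds
}

// True if any window of windowMs contains at least minLogins login events
// spread over at least minPcs distinct PCs.
function hasImprobableBurst(
  events: LocalLogin[],
  windowMs = 10_000,
  minLogins = 5,
  minPcs = 3,
): boolean {
  const sorted = [...events].sort((a, b) => a.atMs - b.atMs);
  let start = 0;
  for (let end = 0; end < sorted.length; end++) {
    // Shrink the window from the left until it spans at most windowMs.
    while (sorted[end].atMs - sorted[start].atMs > windowMs) start++;
    const inWindow = sorted.slice(start, end + 1);
    const pcs = new Set(inWindow.map((e) => e.pcId));
    if (inWindow.length >= minLogins && pcs.size >= minPcs) return true;
  }
  return false;
}
```

Consistent with the advisory role of the local score, a `true` result would only flag the buffered events for review on reconnect.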

6. Data Sent to AI Orchestrator

iam-service MUST minimise PII. The risk-classification request contains:

| Field | PII? | Notes |
|-------|------|-------|
| tenantId | no | opaque |
| userId | pseudonymous | opaque |
| ipMasked | partial | /24 v4 or /48 v6; never raw IP |
| userAgent | low | hashed (SHA-256) |
| deviceId | pseudonymous | opaque |
| geoCountryCode | low | ISO-3166 |
| recentFailedAttempts | no | numeric |
| lastSuccessfulLoginAt | no | timestamp |

primary_email, password hashes, MFA secrets, and refresh tokens are never sent.
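A possible implementation of the ipMasked truncation (/24 for IPv4, /48 for IPv6). The function names and the CIDR-style output format are assumptions; the sketch does not handle IPv6 addresses with an embedded dotted IPv4 tail or a zone id.

```typescript
// Keep the first three octets and zero the last: a.b.c.d -> a.b.c.0/24
function maskIpv4(ip: string): string {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`not an IPv4 address: ${ip}`);
  return `${Number(parts[0])}.${Number(parts[1])}.${Number(parts[2])}.0/24`;
}

// Expand "::" to full form, then keep the first three 16-bit groups (48 bits).
function maskIpv6(ip: string): string {
  const halves = ip.split("::");
  if (halves.length > 2) throw new Error(`not an IPv6 address: ${ip}`);
  const head = halves[0] ? halves[0].split(":") : [];
  const tail = halves.length === 2 && halves[1] ? halves[1].split(":") : [];
  const missing = 8 - head.length - tail.length;
  const countOk = halves.length === 1 ? missing === 0 : missing >= 1;
  if (!countOk) throw new Error(`not an IPv6 address: ${ip}`);
  const groups = [...head, ...new Array<string>(missing).fill("0"), ...tail];
  if (!groups.every((g) => /^[0-9a-fA-F]{1,4}$/.test(g))) {
    throw new Error(`not an IPv6 address: ${ip}`);
  }
  const prefix = groups.slice(0, 3).map((g) => parseInt(g, 16).toString(16));
  return `${prefix.join(":")}::/48`;
}

// Dispatch on the presence of a colon, which only IPv6 literals contain.
function maskIp(ip: string): string {
  return ip.includes(":") ? maskIpv6(ip) : maskIpv4(ip);
}
```

Masking at the point where the login context is built, rather than inside the orchestrator, keeps the raw address from ever leaving iam-service.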

7. Model Lifecycle Contract

| Stage | Requirement |
|-------|-------------|
| Training | Done by Data Science via analytics-service BigQuery export of anonymised events. |
| Evaluation | F1 ≥ baseline; FPR ≤ 5%; documented with red-team adversarial set. |
| Promotion | Canary 5% of risk-classify traffic for 7 d; auto-rollback on FPR > 7%. |
| Deprecation | Two-version overlap; old modelVersion allowed for 30 d in provenance. |
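The evaluation and promotion gates reduce to two threshold checks on the false-positive rate, FPR = FP / (FP + TN). A sketch with hypothetical names; how canary outcomes get labelled as false positives or true negatives is not specified here and is assumed to be available as counts.

```typescript
// Hypothetical confusion-matrix counts for the negative (legitimate login) class.
interface NegativeClassCounts {
  falsePositives: number; // legitimate logins flagged as risky
  trueNegatives: number;  // legitimate logins correctly passed
}

function falsePositiveRate(c: NegativeClassCounts): number {
  const negatives = c.falsePositives + c.trueNegatives;
  return negatives === 0 ? 0 : c.falsePositives / negatives;
}

// Evaluation gate: F1 >= baseline and FPR <= 5%.
function passesEvaluation(f1: number, baselineF1: number, fpr: number): boolean {
  return f1 >= baselineF1 && fpr <= 0.05;
}

// Promotion gate: roll the canary back once FPR exceeds 7%.
function shouldRollbackCanary(c: NegativeClassCounts): boolean {
  return falsePositiveRate(c) > 0.07;
}
```

The gap between the 5% evaluation limit and the 7% rollback trigger leaves headroom for live-traffic drift without rolling back on noise alone.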

8. Failure Behaviour

| Failure | Behaviour |
|---------|-----------|
| ai-orchestrator-service timeout > 200 ms | Skip AI; fall back to rule-based MFA decision (`new_device |
| Orchestrator HTTP 5xx | Same fallback; circuit breaker after 5 consecutive failures (Memorystore counter). |
| Edge model corrupt | Disable local risk; keep auth flowing; emit melmastoon.iam.edge_model.degraded.v1 (operational). |

iam-service MUST never block authentication on AI failure. Identity availability dominates risk-precision.
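The timeout and circuit-breaker behaviour could be wired together roughly as follows. This is a sketch under stated assumptions: the breaker counter is held in memory rather than in Memorystore, the `classify` and `ruleBasedDecision` callbacks are hypothetical stand-ins supplied by the caller, and no reset or half-open policy is modelled because none is specified here.

```typescript
type Decision = "mfa_required" | "allow";

// In-memory stand-in for the Memorystore counter.
class ConsecutiveFailureBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold = 5) {
    this.threshold = threshold;
  }
  recordSuccess(): void {
    this.failures = 0;
  }
  recordFailure(): void {
    this.failures += 1;
  }
  get open(): boolean {
    return this.failures >= this.threshold;
  }
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Never throws on AI failure: any timeout or error yields the rule-based decision.
async function decideWithFallback(
  classify: () => Promise<number>,   // hypothetical orchestrator call returning a score
  ruleBasedDecision: () => Decision, // rule-based fallback supplied by the caller
  breaker: ConsecutiveFailureBreaker,
  tenantThreshold = 0.6,
  timeoutMs = 200,
): Promise<Decision> {
  if (breaker.open) return ruleBasedDecision();
  try {
    const score = await withTimeout(classify(), timeoutMs);
    breaker.recordSuccess();
    return score >= tenantThreshold ? "mfa_required" : "allow";
  } catch {
    breaker.recordFailure();
    return ruleBasedDecision();
  }
}
```

Because every failure path returns the rule-based decision instead of rethrowing, authentication cannot block on the orchestrator.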

9. Compliance & Auditability

  • All AI features are documented as automated decision-making under GDPR Article 22, with human review available via HITL.
  • The Tenant Admin UI exposes an "AI explainer" that shows reasons[] and lets the user contest a forced lock; contesting opens a HITL ticket.
  • Provenance retained for 7 years alongside the audit event (regulated retention class).

10. Cross-References