iam-service — Security Model
iam-service is the highest-blast-radius service on the platform: a compromise here lets an attacker mint valid JWTs for any tenant. This document defines the cryptography, access controls, audit posture, and incident playbooks.
1. Standards Conformance
| Standard | Conformance Level |
|---|---|
| OAuth 2.1 (draft) | Full |
| OIDC 1.0 (Core, Discovery, Dynamic Client Registration where supported) | Core, Discovery |
| WebAuthn Level 2 / FIDO2 | Full |
| SAML 2.0 (Web Browser SSO) | Full |
| OWASP ASVS 4.0 | L2 baseline; auth + sessions at L3 |
| NIST SP 800-63B | AAL2 default; AAL3 for platform_admin |
| ISO 27001 / SOC 2 Type II | Mapped controls; reviewed annually |
| GDPR Art. 25 / 32 | Data minimization; encryption in transit + at rest |
| EU AI Act (high-risk) | HITL on AI-suggested locks (see AI_INTEGRATION §9) |
2. Threat Model (excerpt)
| ID | Threat | STRIDE | Likelihood | Impact | Mitigation |
|---|---|---|---|---|---|
| T-01 | Credential stuffing | Spoofing | High | High | argon2id; per-IP+per-email lockout; WAF; adaptive MFA. |
| T-02 | Refresh-token theft via XSS / malware | Spoofing | Medium | Critical | Rotating refresh; reuse detection → family revoke; short access TTL (15 min). |
| T-03 | KMS misconfig → JWT forgery | Tampering | Low | Catastrophic | KMS IAM least-privilege; deploy-time config drift detector; quarterly DR drill. |
| T-04 | SAML XSW (XML signature wrapping) | Tampering | Medium | Critical | Certified library; signature → schema → attribute order; clock skew ≤ 5 min; InResponseTo. |
| T-05 | OIDC state/nonce replay | Spoofing | Medium | High | State = HMAC(sessionId, tenantId); nonce single-use in Redis. |
| T-06 | Account enumeration via reset | Information disclosure | Medium | Medium | Constant-time response; identical 202 envelope; timing tests in CI. |
| T-07 | Device fingerprint spoof → offline cert misissue | Spoofing | Low | High | Fingerprint = HMAC(tenantSecret, attrs); offline bind requires user re-auth + Ed25519 keypair from device. |
| T-08 | API key leak in logs / repos | Information disclosure | Medium | Critical | Hashed (argon2id); only 8-char prefix logged; secret-scan in CI. |
| T-09 | Lockout DoS | DoS | High | Medium | IP-scoped lockout when IP is new; magic-link self-recovery; admin-lift endpoint. |
| T-10 | JIT-provisioned spam users via SSO | Spoofing | Medium | Medium | Tenant policy gate on JIT; CAPTCHA on guest registration. |
| T-11 | Outbox replay → duplicate event | Tampering | Low | Medium | Consumer dedup by eventId; outbox idempotent. |
| T-12 | Breach-list lookup over plaintext | Information disclosure | High | High | k-anonymity (HIBP API: only the first 5 hex characters of the SHA-1 hash are sent). |
| T-13 | Session hijack on shared workstation | Spoofing | Medium | High | Idle lock 5 min; device binding; tenant policy can require re-auth on tenant-switch. |
| T-14 | Magic-link interception via email forwarding | Spoofing | Low | High | 10-min TTL; single-use; bound to issuing IP/UA optionally; warning template. |
Full STRIDE table maintained in security/threat-model.md; reviewed quarterly + on every story touching auth.
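The T-05 mitigation can be sketched as follows. This is a minimal illustration, not the service's implementation: it assumes the state value is a hex HMAC-SHA256 over sessionId and tenantId under a server-side key, and it uses an in-memory set as a stand-in for the Redis nonce store. The names make_state, verify_state, and NonceStore are hypothetical.

```python
import hashlib
import hmac


def make_state(key: bytes, session_id: str, tenant_id: str) -> str:
    """Derive the OIDC state value from the session and tenant (assumed layout)."""
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    msg = b"".join(
        len(part.encode()).to_bytes(2, "big") + part.encode()
        for part in (session_id, tenant_id)
    )
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_state(key: bytes, session_id: str, tenant_id: str, presented: str) -> bool:
    """Recompute the state and compare in constant time."""
    expected = make_state(key, session_id, tenant_id)
    return hmac.compare_digest(expected.encode(), presented.encode())


class NonceStore:
    """In-memory stand-in for the single-use nonce set kept in Redis."""

    def __init__(self) -> None:
        self._issued = set()

    def issue(self, nonce: str) -> None:
        self._issued.add(nonce)

    def consume(self, nonce: str) -> bool:
        # True only the first time an issued nonce is presented.
        if nonce in self._issued:
            self._issued.remove(nonce)
            return True
        return False
```

A replayed callback fails on the second consume call even when the state still verifies, which is why the table lists both checks.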
3. Authentication Surfaces
| Surface | Access Method | AAL | Notes |
|---|---|---|---|
| /api/v1/auth/login (password) | password | AAL1 base; AAL2 with MFA | Adaptive challenge promotes to AAL2 on risk. |
| /api/v1/auth/sso/oidc/* | OIDC | depends on IdP acr | Honors IdP amr; tenant policy may demand local MFA on top. |
| /api/v1/auth/sso/saml/* | SAML 2.0 | depends on IdP | Same. |
| /api/v1/auth/webauthn/* | WebAuthn (FIDO2) | AAL3 with attested authenticator | Required for platform_admin. |
| /api/v1/auth/magic-link/* | one-time email link | AAL1 | Guests only by default; tenant can opt-in for staff. |
| /api/v1/users/me/devices/*/bind-offline | bearer access JWT + acr=fresh-auth | AAL2 minimum | + tenant CA signature on issued cert. |
| Internal /internal/* | mTLS SPIFFE | service identity | Never user-callable. |
4. JWT Crypto
| Aspect | Choice |
|---|---|
| Algorithm | EdDSA (Ed25519) |
| Signing key location | Cloud KMS (HSM-backed); never extractable. |
| Key hierarchy | Root signing key (per region) → monthly kid aliases named by year and month (e.g. iam-2026-04). |
| Rotation cadence | Monthly scheduled; emergency on demand. |
| Rotation overlap | ≥ 2 days; both kids in JWKS during overlap. |
| Verification | Consumer pulls JWKS via CDN (5-min TTL); jittered cache. |
| Token TTL | Access: 900 s · Refresh (online): 30 d · Refresh (offline-bound): 7 d |
| jti | ULID; required claim |
| Clock skew tolerance | 60 s |
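The TTL, jti, and clock-skew rows above translate into a small set of time checks on the consumer side. The sketch below assumes the claims have already been signature-verified against the JWKS (not shown) and that iat and exp are integer Unix timestamps; check_token_times and the rejection strings are hypothetical names chosen for illustration.

```python
ACCESS_TTL_S = 900   # access token TTL from the table
CLOCK_SKEW_S = 60    # clock skew tolerance from the table


def check_token_times(claims: dict, now: int) -> str:
    """Return 'ok' or a short rejection reason for an access token's claims."""
    for name in ("iat", "exp", "jti"):
        if name not in claims:
            return f"missing_{name}"
    if now > claims["exp"] + CLOCK_SKEW_S:
        return "expired"
    if claims["iat"] > now + CLOCK_SKEW_S:
        return "issued_in_future"
    # A lifetime longer than the configured TTL suggests a misissued token.
    if claims["exp"] - claims["iat"] > ACCESS_TTL_S:
        return "ttl_too_long"
    return "ok"
```

With these constants a token is accepted for at most 960 s after issuance as seen by a consumer whose clock runs behind.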
4.1 Emergency rotation
Documented in runbooks/iam/jwt-emergency-rotation.md. Steps: mint new kid → publish JWKS → rotate signing alias → wait 2× cache TTL → revoke compromised kid → mass session revoke (reason='admin_revoke').
5. Credential Storage
| Concern | Choice |
|---|---|
| Algorithm | argon2id |
| Parameters | m=64MB, t=3, p=1 (reviewed annually against current OWASP guidance) |
| Salt | 16-byte random per credential |
| Pepper | None at hash level (defense-in-depth via DB envelope encryption) |
| Hash bytes envelope | Encrypted at rest via Cloud SQL CMEK + transparent disk encryption |
| Rehash on login | If hash_version < current → recompute and update in-place |
| History | Last 5 hashes retained (rotation history check) |
| Password policy | Length ≥ 12, ≥ 3 character classes, no email substring, no breach hit |
| Breach check | HIBP k-anonymity API on registration and reset; after a successful login, re-checked at most monthly via background job (pwn_audit) |
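The k-anonymity breach check can be illustrated as follows. Only the 5-character hash prefix leaves the service; the range response (lines of SUFFIX:COUNT) is matched locally. This is a sketch under the assumption that the HTTP call to the HIBP range endpoint happens elsewhere; hibp_split and breach_count are hypothetical helper names.

```python
import hashlib
from typing import Tuple


def hibp_split(password: str) -> Tuple[str, str]:
    """Split the uppercase SHA-1 hex digest into (5-char prefix, 35-char suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_body: str) -> int:
    """Look up the suffix in a range response body; 0 means no breach hit."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

The password policy's "no breach hit" rule then reduces to breach_count(...) == 0 for the candidate password.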
6. Refresh Token Storage
| Aspect | Choice |
|---|---|
| Token format | 256-bit opaque, base64url encoded, prefixed rft_ |
| At rest | SHA-256 hash stored in sessions.current_token_hash; previous N (default 5) in previous_token_hashes |
| Reuse detection | If the presented hash matches a previous hash, the entire family_id is revoked and event melmastoon.iam.session.revoked.v1{reason='rotation_reuse'} is emitted |
| Transport | Authorization: Bearer (HTTPS only); HttpOnly cookie permitted as alt for browser clients |
| Lifetime | 30 d online; 7 d when did is present and offline binding is active |
| Revocation | Immediate on logout / lock / tenant delete / device revoke / password change |
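Rotation with reuse detection, as described in the table, might look like the sketch below. It simplifies the model to one session per family and keeps state in memory; the Session class and rotate function are hypothetical, and event emission is reduced to setting revoked_reason.

```python
import base64
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PREVIOUS = 5  # default N from the table


def new_refresh_token() -> str:
    """256-bit opaque token, base64url without padding, prefixed rft_."""
    raw = secrets.token_bytes(32)
    return "rft_" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def token_hash(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


@dataclass
class Session:
    family_id: str
    current_token_hash: str
    previous_token_hashes: List[str] = field(default_factory=list)
    revoked_reason: Optional[str] = None


def rotate(session: Session, presented: str) -> Optional[str]:
    """Return a new refresh token, or None if the presented one is rejected."""
    if session.revoked_reason is not None:
        return None
    h = token_hash(presented)
    if h in session.previous_token_hashes:
        # An already-rotated token came back: treat as theft, revoke the family.
        session.revoked_reason = "rotation_reuse"
        return None
    if not hmac.compare_digest(h, session.current_token_hash):
        return None
    fresh = new_refresh_token()
    session.previous_token_hashes = ([h] + session.previous_token_hashes)[:MAX_PREVIOUS]
    session.current_token_hash = token_hash(fresh)
    return fresh
```

Note that an unknown token is simply rejected, while a previously valid one revokes the family, so a random guess cannot be used to force a logout.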
7. MFA Crypto
| Factor | Storage | Notes |
|---|---|---|
| TOTP | Secret envelope-encrypted via KMS DEK; secret_kid recorded for rotation | RFC 6238; SHA-1 (default) or SHA-256, 30 s, 6 or 8 digits. |
| WebAuthn | credential_id + public_key + sign_count stored; aaguid for attestation policy | Sign count monotonic — regression triggers clone-detected event. |
| Recovery codes | Each code stored as separate argon2id hash; single-use bit flipped | Bundle of 10; regenerate invalidates all. |
| SMS | Destination number stored; deprecated for staff/platform; permitted only for guest M0/M1. | Out-of-band (notification-service); rate-limited. |
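For reference, the TOTP computation (RFC 6238 over the RFC 4226 truncation) and the WebAuthn sign-count check fit in a few lines. This is a sketch, not the service's code: secret decryption via the KMS DEK is out of scope, and the function names are hypothetical.

```python
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30,
         digits: int = 6, algo: str = "sha1") -> str:
    """Compute a TOTP code for the time step containing unix_time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), algo).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226 section 5.3)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def sign_count_ok(stored: int, presented: int) -> bool:
    """False signals a possible cloned authenticator (counter did not advance)."""
    if stored == 0 and presented == 0:
        # Authenticators that do not implement a counter always report 0.
        return True
    return presented > stored
```

A verifier would typically also accept the adjacent time step to absorb clock drift; that window is a policy choice and is not shown here.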
8. Device Binding Crypto
| Aspect | Choice |
|---|---|
| Device keypair | Ed25519, generated on-device (Electron uses Node crypto.generateKeyPair('ed25519')); private key stored in OS keychain (DPAPI / Keychain / libsecret). |
| CSR | PKCS#10, includes Ed25519 public key, common name did:dev:<DeviceId>, SAN with tenantId + userId. |
| Cert format | X.509 v3 PEM. |
| Signing CA | Per-tenant intermediate CA in Cloud KMS, signed by Melmastoon platform root CA. |
| Cert TTL | ≤ 7 d (tenant-policy bounded). |
| Revocation | DB flag + sync delta (no external CRL/OCSP — tenant CA is private). |
| Renewal | T-24h pre-expiry trigger in Electron; rolls binding atomically. |
8.1 Tenant CA bootstrap
Melmastoon Platform Root CA (KMS, ED25519, root key, 30y)
│
▼
Tenant CA — ten_01HZ8X (KMS, ED25519, intermediate, 5y)
│
▼
Device Cert — dev_01HZ… (issued by tenant CA, ≤ 7d)
The tenant CA is created when melmastoon.tenant.created.v1 is consumed and is rotated annually with overlap.
9. API Key Crypto
| Aspect | Choice |
|---|---|
| Key format | mlk_<26 base32> (issued ULID) + _<random suffix>; never log raw |
| Storage | argon2id hash + 8-char prefix (for log search) |
| HMAC pepper | Tenant-specific HMAC pepper key_hash_kid for additional defense |
| Validation | Look up by prefix → argon2id verify; reject expired/revoked |
| Denylist | Redis iam:apikey:denylist:<prefix> 60-min TTL bridges replication lag after revoke |
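The validation order in the table (denylist, prefix lookup, hash verify, expiry and revocation) can be sketched as below. Several details are assumptions made for illustration: the 8-character prefix is taken from the characters right after mlk_, several records may share a prefix, and demo_hash/demo_verify use SHA-256 purely as a stand-in so the example runs with the standard library. The real store uses argon2id, and the in-memory dict and set stand in for the database and the Redis denylist.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

PREFIX_LEN = 8


@dataclass
class ApiKeyRecord:
    prefix: str
    key_hash: str
    tenant_id: str
    expires_at: int
    revoked: bool = False


def extract_prefix(raw: str) -> Optional[str]:
    """Assumed layout: 'mlk_' followed by the ULID; prefix = first 8 ULID chars."""
    if not raw.startswith("mlk_") or len(raw) < 4 + PREFIX_LEN:
        return None
    return raw[4:4 + PREFIX_LEN]


def demo_hash(raw: str) -> str:
    # Stand-in only; production storage is argon2id per the table.
    return hashlib.sha256(raw.encode()).hexdigest()


def demo_verify(stored_hash: str, raw: str) -> bool:
    return hmac.compare_digest(stored_hash, demo_hash(raw))


def validate_api_key(raw: str, now: int,
                     records: Dict[str, List[ApiKeyRecord]],
                     denylist: Set[str],
                     verify: Callable[[str, str], bool]) -> Optional[ApiKeyRecord]:
    """Return the matching record, or None if the key must be rejected."""
    prefix = extract_prefix(raw)
    if prefix is None or prefix in denylist:
        return None
    for rec in records.get(prefix, []):
        if verify(rec.key_hash, raw):
            if rec.revoked or now >= rec.expires_at:
                return None
            return rec
    return None
```

Checking the denylist before the lookup is what lets a revoke take effect immediately, even if a stale replica still returns the record as active.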
10. Magic-Link Crypto
| Aspect | Choice |
|---|---|
| Token | 256-bit random + ULID, format ml_<26 base32> |
| Storage | Server-side: password_reset_requests.token_hash (sha256) + Redis iam:magic:<sha> |
| TTL | 10 min |
| Single-use | Redis DEL on redeem; idempotency table catches replay |
| Binding | Optionally bound to issuing IP/UA (tenant policy) |
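The storage, TTL, and single-use rows combine into an issue/redeem pair like the one sketched here. It is a simplified illustration: a dict stands in for Redis, the token is ml_ plus 256 random bits rather than the ULID-based layout above, and optional IP/UA binding is omitted. MagicLinkStore is a hypothetical name.

```python
import hashlib
import secrets
from typing import Dict, Optional, Tuple

TTL_S = 600  # 10 min


class MagicLinkStore:
    """In-memory stand-in for the iam:magic:<sha> keys in Redis."""

    def __init__(self) -> None:
        self._live: Dict[str, Tuple[str, int]] = {}  # sha256 hex -> (user_id, expires_at)

    def issue(self, user_id: str, now: int) -> str:
        token = "ml_" + secrets.token_urlsafe(32)
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._live[digest] = (user_id, now + TTL_S)  # only the hash is stored
        return token

    def redeem(self, token: str, now: int) -> Optional[str]:
        digest = hashlib.sha256(token.encode()).hexdigest()
        # pop() mirrors Redis DEL: the entry is gone after the first attempt.
        entry = self._live.pop(digest, None)
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if now < expires_at else None
```

Because the entry is removed before the expiry check, an expired link cannot be retried either.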
11. RBAC / ABAC Matrix (within iam scope)
iam-service exposes a small authorization surface (most authz is downstream). The matrix below covers iam endpoints only.
| Role | Endpoint | Allowed |
|---|---|---|
| anonymous | /auth/register, /auth/login, /auth/refresh, /auth/sso/*, /auth/magic-link/*, /auth/password/reset/*, /.well-known/jwks.json | ✅ |
| user (any) | /users/me/*, /auth/logout, /auth/mfa/*, /auth/devices/* | ✅ self only |
| tenant_admin | /users/{id}/lock, /users/{id}/unlock | ✅ within tenant |
| tenant_admin | /users/{id}/devices/* | ✅ within tenant (read + revoke) |
| platform_admin | All admin routes | ✅ platform-wide |
| guest | /auth/devices/{id}/bind-offline, /api-keys | ❌ (platform-wide policy) |
| API key (scope=admin:*) | /users/{id}/lock | ✅ if tenant matches key's tenant |
ABAC checks: tid == requested.tenantId for tenant-scoped admin actions; userType in {staff, platform_admin} for any privileged surface.
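The two ABAC checks can be expressed as a single predicate. The sketch below assumes the principal is built from verified JWT claims; the Principal type, its field names, and can_admin_user are hypothetical and only illustrate how the role and attribute checks compose.

```python
from dataclasses import dataclass
from typing import FrozenSet

PRIVILEGED_USER_TYPES = {"staff", "platform_admin"}


@dataclass(frozen=True)
class Principal:
    user_id: str
    tid: str            # tenant id claim from the access token
    user_type: str
    roles: FrozenSet[str]


def can_admin_user(principal: Principal, target_tenant_id: str) -> bool:
    """Decide whether the caller may lock/unlock a user in the target tenant."""
    # Attribute check: privileged surfaces are closed to guests entirely.
    if principal.user_type not in PRIVILEGED_USER_TYPES:
        return False
    if "platform_admin" in principal.roles:
        return True  # platform-wide
    # Tenant-scoped admin: tid must equal the requested tenant.
    return "tenant_admin" in principal.roles and principal.tid == target_tenant_id
```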
12. Audit Logging
Every security-sensitive action writes one row to iam.audit_events. Append-only enforced by trigger.
| Action | When |
|---|---|
| iam.user.registered | Always |
| iam.user.login_succeeded, iam.user.login_failed | Always |
| iam.user.locked, …unlocked | Always |
| iam.session.refreshed, …revoked | Always |
| iam.password.reset_requested, …completed, iam.password.changed | Always |
| iam.mfa.enrolled, …removed | Always |
| iam.device.registered, …trusted, …bound_for_offline, …revoked | Always |
| iam.apikey.issued, …revoked, …unauthorized_use | Always |
| iam.external_identity.linked, …unlinked | Always |
| iam.admin.lock, …unlock, …force_logout | Always; carries actor userId |
| iam.gdpr.erasure_completed | Always |
| iam.kms.signing_key_rotated | Always; carries kid |
Audit search is performed by audit-service (consumes events). Direct querying of iam.audit_events is restricted to the platform compliance role.
13. Audit Retention
| Retention class | Window |
|---|---|
| regulated (e.g. user.registered, mfa_enrolled, device.bound_for_offline, apikey.issued) | 7 years |
| security (failures, locks, revocations) | 1 year primary + 6 years cold |
| operational (refreshes, success logins) | 90 days primary + analytical tier |
| Audit table partitions | Monthly; archived to BigQuery; tenant-erasure-aware purge job |
14. GDPR Participation
| Right | Behavior |
|---|---|
| Right of access (Art. 15) | iam-service exposes /users/me/snapshot and /admin/users/{id}/snapshot; aggregated by gdpr-service into the DSAR PDF. |
| Right to rectification (Art. 16) | Email change is staff-supported; emits …email_changed.v1 (out of M0 scope). |
| Right to erasure (Art. 17) | Saga participant; on melmastoon.tenant.guest.erasure_requested.v1 we anonymize User.primary_email to gdpr-erased-<userId>@example.invalid, delete credentials/sessions/devices/MFA/API-keys/external-identities; emit melmastoon.iam.user.erased.v1. Audit log retained as legal-hold (Recital 65 / Art. 17(3)(b)). |
| Right to portability (Art. 20) | iam-service portion is minimal (login records); merged into export by gdpr-service. |
| Right to object (Art. 21) | n/a for security processing (Art. 6(1)(f)). |
15. Network Security
| Layer | Control |
|---|---|
| Transport | TLS 1.3 only; HSTS max-age=63072000; includeSubDomains; preload. |
| mTLS | Internal mesh (SPIFFE id spiffe://melmastoon/prod/iam-service); enforced by service mesh. |
| WAF | Cloud Armor + custom credential-stuffing signatures. |
| Geo controls | Per-tenant allowlist/denylist (OFAC + tenant config). |
| Anti-bot | Turnstile/reCAPTCHA on register + magic-link request. |
| Rate limits (edge) | /auth/login 10/min/IP; /auth/password/reset/request 3/h/email; /auth/refresh 60/min/family. |
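The edge rate limits above are all "N requests per window per key" rules, which a sliding-window log captures directly. This sketch is an in-process illustration only; the actual limits are enforced at the edge, and the SlidingWindowLimiter class and the three instances are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within any `window_s`-second window."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Limits from the table, keyed by IP, email, and token family respectively.
LOGIN_LIMIT = SlidingWindowLimiter(limit=10, window_s=60)
RESET_LIMIT = SlidingWindowLimiter(limit=3, window_s=3600)
REFRESH_LIMIT = SlidingWindowLimiter(limit=60, window_s=60)
```

Rejected attempts are not recorded here, so a client that keeps hammering a closed window does not extend its own lockout; whether that is desirable is a policy decision.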
16. Security Headers
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
X-Content-Type-Options: nosniff
Referrer-Policy: no-referrer
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Resource-Policy: same-site
Permissions-Policy: camera=(), microphone=(), geolocation=()
Content-Security-Policy: default-src 'none'; frame-ancestors 'none'
Cache-Control: no-store
17. Secret Management
| Secret | Storage | Rotation |
|---|---|---|
| JWT signing key | Cloud KMS (HSM) | Monthly; emergency on demand |
| Tenant CA root | Cloud KMS | Annual |
| OIDC client secrets | Secret Manager | Quarterly or per IdP cadence |
| SAML signing key | Cloud KMS | Annual |
| HIBP API key | Secret Manager | Per provider cadence |
| SMTP creds | Secret Manager | Annual |
| Tenant fingerprint HMAC secret | Cloud KMS DEK | Annual |
| API-key HMAC pepper | Cloud KMS DEK | Annual |
| Magic-link signing | KMS-derived ephemeral; no long-lived secret | n/a |
No secrets in env files committed to the repo. CI uses workload identity federation; local dev uses .env.local (gitignored).
18. Incident Response Playbooks
| Scenario | Runbook |
|---|---|
| Suspected JWT forgery | runbooks/iam/jwt-forgery.md |
| KMS outage | runbooks/iam/kms-outage.md |
| Mass credential stuffing | runbooks/iam/credential-stuffing.md |
| Refresh-token theft suspected | runbooks/iam/token-theft.md |
| SAML metadata drift | runbooks/iam/saml-drift.md |
| GDPR erasure stuck | runbooks/iam/gdpr-stuck.md |
| Device-cert mass expiry | runbooks/iam/device-cert-expiry.md |
| Breach-list provider down | runbooks/iam/hibp-down.md |
Each runbook includes: detection, triage, mitigation, comms, recovery, postmortem template.
19. Penetration Testing
| Cadence | Scope |
|---|---|
| Annual external pen test | Full iam surface incl. SSO + WebAuthn |
| Quarterly internal red team | Credential stuffing, token theft, SSO substitution, side-channel timing |
| Per-release | OWASP ZAP automated scan; SAST (Semgrep); SCA (Trivy / Snyk) |
| Pre-prod | DAST against staging |
Findings tracked in security/findings/; CRITICAL must be closed pre-release.
20. Backward Compatibility (Crypto)
- argon2id parameters may be hardened over time; the rehash-on-login pattern (see MIGRATION_PLAN §6) keeps users on current params.
- Ed25519 → future curve migration: new device bindings use new curve; existing bindings remain valid until expiry; bundles encrypted with key derived from existing public key remain decryptable.
- JWT kid rotation never invalidates a session (only the access token's kid; refresh is opaque).