Security

:::info Source
Sourced from services/identity-service/SECURITY_MODEL.md in the documentation repo.
:::

Companion: 13 Security, Compliance & Tenancy · DOMAIN_MODEL · API_CONTRACTS

Identity-service is the platform's authentication authority. A security failure here cascades across all 18 other services. This document catalogs the threat model, enforcement mechanisms, and operational practices.

1. Standards & Compliance

| Standard | Conformance Level |
|---|---|
| OAuth 2.1 | Full |
| OIDC (OpenID Connect Core 1.0) | Full |
| WebAuthn Level 2 | Full (credential registration + authentication) |
| SAML 2.0 | Full (SP role; IdP-initiated and SP-initiated) |
| OWASP ASVS | Level 2 platform-wide; Level 3 for identity (auth, sessions, credentials) |
| NIST SP 800-63B | Meets AAL2; AAL3 optional per tenant via WebAuthn + platform authenticator |
| PCI DSS | N/A (no cardholder data); billing-service handles payments |
| SOC 2 | Type 2 target; identity-service is in scope |

2. Threat Model

| Threat | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Credential stuffing | Account takeover at scale | High | argon2id + lockout + adaptive MFA + breach-list checking |
| Password spraying | Low-and-slow account compromise | Medium | IP-level rate limits + lockout after common-password patterns |
| Session hijack (token theft) | Full account access | Medium | Short access JWT (15 min), rotating refresh, family revocation on reuse |
| Phishing-harvested credentials | Account takeover | High | MFA required; WebAuthn resistant to phishing |
| OAuth authorization code theft | Account takeover via SSO callback | Low | PKCE mandatory, state parameter with signed/encrypted binding |
| SAML assertion replay | Unauthorized session | Low | InResponseTo matching, timestamp window, single-use RequestID |
| JWT signing key compromise | Forged tokens platform-wide | Catastrophic | KMS-backed keys, rotation with kid, JWKS refresh, emergency rotation playbook |
| Refresh token database dump | Long-lived impersonation | High | Tokens stored hashed; hash uniqueness prevents direct use |
| MFA bypass via downgrade | Account takeover | Medium | MFA required flags persisted; step-up requirements enforced server-side |
| Device fingerprint collision | Cross-user device binding | Low | Fingerprint includes public key; uniqueness enforced per-user |
| Offline certificate forgery | Offline impersonation | Catastrophic | CA key HSM-stored; certificates short-lived (90d) |
| SSO provider compromise | Account takeover via JIT | Medium | Tenant-scoped provider allowlist; JIT requires email verification on sensitive operations |
| API key leak (repo, logs) | Scoped tenant access | High | Keys hashed at rest; prefix-only in logs; rotation APIs; key scanners on git |
| Account enumeration | Reconnaissance | Medium | Constant-time responses; generic messaging on register/reset |
| Timing attacks on password verify | Credential leakage | Low | argon2id is constant-time by design; wrapper ensures no branches |
| Race on concurrent login | Session confusion | Low | Optimistic concurrency; session family ID reconciles |
| GDPR erasure bypass | Compliance violation | Low | Saga-driven erasure with mandatory acknowledgement |
| Insider abuse (platform admin) | Full breach | Medium | Just-in-time elevation, four-eyes on sensitive ops, audit log with Merkle anchoring |

3. Authentication Flows

3.1 Password Login Flow

Client                    identity-service          Postgres        KMS
│                         │                         │               │
├── POST /login ─────────►│                         │               │
│                         ├── load user ───────────►│               │
│                         │◄────────────────────────┤               │
│                         ├── check status          │               │
│                         ├── check lockout         │               │
│                         ├── verify argon2id       │               │
│                         ├── adaptive MFA check    │               │
│                         │                         │               │
│                         │── (MFA required) ──┐    │               │
│◄── 401 mfa_required ────│                         │               │
│                         │                         │               │
│── MFA challenge ───────►│                         │               │
│                         ├── verify TOTP           │               │
│                         │                         │               │
│                         ├── create session ──────►│               │
│                         ├── sign JWT ────────────────────────────►│
│                         │◄────────────────────────────────────────┤
│                         ├── outbox: logged_in ───►│               │
│◄── 200 {tokens} ────────│                         │               │
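
The ordering of checks in the flow above can be sketched roughly as follows. This is a minimal illustration, not the service's code: the `User` fields, `MAX_FAILED_ATTEMPTS`, the `invalid_credentials` error string, and the injected `verify` callable (a stand-in for an argon2id verifier) are all assumptions. Collapsing every failure path into one generic answer is also an assumption, chosen to line up with the account-enumeration mitigation in the threat model.

```python
import hmac
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

GENERIC_FAILURE = "invalid_credentials"  # hypothetical: one message for every failure path
MAX_FAILED_ATTEMPTS = 5                  # hypothetical lockout threshold


@dataclass
class User:
    password_hash: str
    active: bool = True
    failed_attempts: int = 0
    mfa_required: bool = False  # persisted flag, so a client cannot downgrade it


def demo_verify(stored: str, given: str) -> bool:
    """Stand-in for an argon2id verifier; compares in constant time."""
    return hmac.compare_digest(stored, given)


def login(user: Optional[User], password: str,
          verify: Callable[[str, str], bool],
          risky: bool = False) -> Tuple[int, str]:
    # 1. check status (an unknown user gets the same answer as an inactive one)
    if user is None or not user.active:
        return 401, GENERIC_FAILURE
    # 2. check lockout before spending time on hashing
    if user.failed_attempts >= MAX_FAILED_ATTEMPTS:
        return 401, GENERIC_FAILURE
    # 3. verify the password
    if not verify(user.password_hash, password):
        user.failed_attempts += 1
        return 401, GENERIC_FAILURE
    user.failed_attempts = 0
    # 4. adaptive MFA: required by persisted flag or by the risk signal
    if user.mfa_required or risky:
        return 401, "mfa_required"
    return 200, "ok"
```

A production version would also run a dummy hash verification when the user does not exist, so that the unknown-user path takes about as long as a real verification.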

3.2 SSO Flow (OIDC)

Client                          identity-service                IdP         Redis
│                               │                               │           │
├── /sso/start ────────────────►│                               │           │
│                               ├── generate PKCE               │           │
│                               ├── store state ───────────────────────────►│
│                               ├── build auth URL              │           │
│◄── 302 redirect ──────────────│                               │           │
│                               │                               │           │
├── GET IdP auth endpoint ─────────────────────────────────────►│           │
│◄── user authenticates ────────────────────────────────────────┤           │
│◄── 302 callback with code ────────────────────────────────────┤           │
│                               │                               │           │
├── /sso/callback?code&state ──►│                               │           │
│                               ├── validate state ────────────────────────►│
│                               ├── exchange code for tokens ──►│           │
│                               │◄──────────────────────────────┤           │
│                               ├── validate ID token           │           │
│                               ├── find/create user            │           │
│                               ├── create session              │           │
│◄── 302 with tokens ───────────│                               │           │
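
The "generate PKCE" and "store state" / "validate state" steps can be illustrated with the S256 method from RFC 7636. The sketch below is an assumption-laden illustration: `StateStore` is an in-memory stand-in for Redis, and the 600-second TTL and all function names are hypothetical.

```python
import base64
import hashlib
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


def b64url(data: bytes) -> str:
    """base64url without padding, as PKCE requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_for(verifier: str) -> str:
    """S256 code challenge: BASE64URL(SHA256(ASCII(code_verifier)))."""
    return b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def new_pkce_pair() -> Tuple[str, str]:
    verifier = b64url(secrets.token_bytes(32))  # 43-character verifier
    return verifier, challenge_for(verifier)


class StateStore:
    """In-memory stand-in for the Redis state store (hypothetical API)."""

    def __init__(self, ttl_s: float = 600, clock: Callable[[], float] = time.time):
        self._ttl_s = ttl_s
        self._clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}

    def put(self, state: str, verifier: str) -> None:
        self._items[state] = (verifier, self._clock() + self._ttl_s)

    def consume(self, state: str) -> Optional[str]:
        # pop makes the state single-use; an expired or unknown state yields None
        item = self._items.pop(state, None)
        if item is None or item[1] < self._clock():
            return None
        return item[0]
```

At `/sso/start` the service would store `state -> verifier` and send only the challenge to the IdP. At `/sso/callback` it would call `consume(state)` and present the verifier in the code exchange.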

3.3 Refresh Token Rotation

Client                    identity-service          Postgres
│                         │                         │
├── /refresh(rt) ────────►│                         │
│                         ├── hash rt               │
│                         ├── lookup session ──────►│
│                         │◄────────────────────────┤
│                         │
│   (case: current hash matches)
│                         ├── generate new rt
│                         ├── add old hash to previous_token_hashes
│                         ├── update session.refresh_token_hash
│                         ├── sign new JWT
│◄── 200 {new tokens} ────┤
│                         │
│   (case: rt matches previous_token_hashes = REUSE)
│                         ├── revoke all sessions in family
│                         ├── emit session.revoked (rotation_reuse) for each
│                         ├── audit.warn
│◄── 401 rotation_reuse ──┤
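
Both cases of the rotation flow can be sketched with an in-memory store. This is only an illustration under stated assumptions: the `Session` and `SessionStore` classes, the use of SHA-256 for token hashing, and the `invalid_token` error string are hypothetical; only the field names `refresh_token_hash` and `previous_token_hashes` come from the diagram.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field
from typing import List, Set, Tuple


def hash_token(rt: str) -> str:
    # Assumption: SHA-256 of the opaque token; the source only says "hashed"
    return hashlib.sha256(rt.encode("ascii")).hexdigest()


@dataclass
class Session:
    family_id: str
    refresh_token_hash: str
    previous_token_hashes: Set[str] = field(default_factory=set)
    revoked: bool = False


class SessionStore:
    def __init__(self) -> None:
        self.sessions: List[Session] = []

    def issue(self, family_id: str) -> str:
        rt = secrets.token_urlsafe(32)
        self.sessions.append(Session(family_id, hash_token(rt)))
        return rt

    def refresh(self, rt: str) -> Tuple[int, str]:
        h = hash_token(rt)
        for s in self.sessions:
            if s.revoked:
                continue
            if hmac.compare_digest(s.refresh_token_hash, h):
                # Case 1: current hash matches, so rotate
                new_rt = secrets.token_urlsafe(32)
                s.previous_token_hashes.add(h)
                s.refresh_token_hash = hash_token(new_rt)
                return 200, new_rt
            if h in s.previous_token_hashes:
                # Case 2: an already-rotated token came back, so revoke the family
                for other in self.sessions:
                    if other.family_id == s.family_id:
                        other.revoked = True
                return 401, "rotation_reuse"
        return 401, "invalid_token"
```

Note that after reuse is detected, the legitimate holder of the newest token is also logged out. That is the intended trade-off: the service cannot tell which party is the attacker.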

4. JWT Signing & Key Management

4.1 Key Hierarchy

┌───────────────────────────────────────────────┐
│ KMS Root Key (HSM-backed)                     │  never exposed
└───────────────────────────────────────────────┘
                        │ wraps
                        ▼
┌───────────────────────────────────────────────┐
│ Identity CA Key (JWT signer + Device cert CA) │
│ - EdDSA Ed25519                               │
│ - generated in KMS                            │
│ - kid: idsvc-{year}-{rotation}                │
└───────────────────────────────────────────────┘
                        │ signs
              ┌─────────┴─────────┐
              │                   │
              ▼                   ▼
         ┌─────────┐       ┌─────────────┐
         │  JWTs   │       │ Device CA   │
         └─────────┘       │ Certificate │
                           └─────────────┘

4.2 Key Lifecycle

| Phase | Action | Cadence |
|---|---|---|
| Create | KMS-generated Ed25519 keypair; public exported to JWKS | Per rotation |
| Active | Signs new JWTs; also verifies existing ones | 90 days |
| Sunset | Still in JWKS for verification; no longer signs new tokens | 30 days |
| Expired | Removed from JWKS; no tokens signed with it accepted | |
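
As a rough illustration of the lifecycle table, the phase of a key can be derived from its activation time. The function names and the `created` label for a not-yet-activated key are assumptions. The sketch also assumes the 30-day sunset starts immediately when the 90-day active period ends.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

ACTIVE_FOR = timedelta(days=90)   # signs and verifies
SUNSET_FOR = timedelta(days=30)   # verifies only


def key_phase(activated_at: datetime, now: datetime) -> str:
    age = now - activated_at
    if age < timedelta(0):
        return "created"          # generated but not yet activated
    if age < ACTIVE_FOR:
        return "active"
    if age < ACTIVE_FOR + SUNSET_FOR:
        return "sunset"
    return "expired"


def published_kids(activations: Dict[str, datetime], now: datetime) -> List[str]:
    """Kids that belong in the JWKS: active and sunset keys only."""
    return [kid for kid, at in activations.items()
            if key_phase(at, now) in ("active", "sunset")]
```

With these numbers a key stays in the JWKS for 120 days in total, which comfortably exceeds the 15-minute access-token lifetime.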

4.3 JWKS

Published at /.well-known/jwks.json:

  • Includes all active and sunset keys.
  • Cached with Cache-Control: public, max-age=3600, stale-while-revalidate=86400.
  • Consumers (all other services) fetch on startup and every hour; tolerate up to 24h staleness on failure.
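
A consumer-side cache matching the hourly refresh and 24h tolerance might look like the sketch below. `JwksCache` and its injected `fetch` callable are hypothetical. The sketch also assumes staleness is measured from the last successful fetch, which the text does not specify.

```python
import time
from typing import Any, Callable, Optional


class JwksCache:
    REFRESH_AFTER_S = 3600    # matches max-age=3600
    MAX_STALE_S = 86400       # tolerate 24h of staleness when the fetch fails

    def __init__(self, fetch: Callable[[], Any],
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch
        self._clock = clock
        self._keys: Optional[Any] = None
        self._fetched_at = 0.0

    def get(self) -> Any:
        now = self._clock()
        due = self._keys is None or now - self._fetched_at >= self.REFRESH_AFTER_S
        if due:
            try:
                self._keys = self._fetch()
                self._fetched_at = now
            except Exception:
                # Serve the stale copy only while it is inside the tolerance window
                too_old = now - self._fetched_at > self.MAX_STALE_S
                if self._keys is None or too_old:
                    raise
        return self._keys
```

A real consumer would typically also refetch once when it sees an unknown kid, so that a freshly rotated key is picked up before the hourly refresh.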

4.4 Emergency Rotation

Triggered by: suspected key compromise, insider incident, KMS security event.

Procedure:

  1. Platform-admin triggers emergency rotation.
  2. New key generated in KMS; kid bumped with -emrg- suffix.
  3. New key activated immediately; old key marked sunset.
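  4. Grace window: 15 minutes instead of the usual 30-day sunset. This matches the access-token lifetime, so every token signed with the old key has expired by the end of the window.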
  5. All active sessions revoked; all users re-authenticate.
  6. Post-mortem within 72h; audit review.

5. Password Policy

| Rule | Value | Enforcement |
|---|---|---|
| Minimum length | 12 characters | Validated at change |
| Character complexity | 1 uppercase + 1 lowercase + 1 digit + 1 special | Regex validation |
| Breach-list check | HaveIBeenPwned via k-anonymity | Blocked if found |
| Similarity to email | Not equal, not substring, not reversed | Custom checker |
| History | Cannot reuse last 5 | rotation_history array |
| Common-password list | Blocked against top 10k passwords | In-memory bloom filter |
| Max length | 128 characters | Prevents DoS on hashing |
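
The locally checkable rules in the table can be sketched as a single validator, together with the hash split used by the HaveIBeenPwned range API (only the first five hex characters of the SHA-1 leave the service). The function names and violation codes are hypothetical. A plain set stands in for the bloom filter, and the email-similarity rule is one possible reading of "not equal, not substring, not reversed".

```python
import hashlib
import re
from typing import FrozenSet, List, Tuple

MIN_LEN, MAX_LEN = 12, 128


def policy_violations(password: str, email: str,
                      common: FrozenSet[str] = frozenset()) -> List[str]:
    found = []
    if len(password) < MIN_LEN:
        found.append("too_short")
    if len(password) > MAX_LEN:
        found.append("too_long")  # bounds the hashing cost
    checks = {
        "no_uppercase": r"[A-Z]",
        "no_lowercase": r"[a-z]",
        "no_digit": r"[0-9]",
        "no_special": r"[^A-Za-z0-9]",
    }
    for code, pattern in checks.items():
        if not re.search(pattern, password):
            found.append(code)
    pw, em = password.lower(), email.lower()
    # Assumed reading: the email, or its reverse, must not appear in the password
    if em and (em in pw or em[::-1] in pw):
        found.append("similar_to_email")
    if pw in common:              # stand-in for the top-10k bloom filter
        found.append("common_password")
    return found


def hibp_range_parts(password: str) -> Tuple[str, str]:
    """Split the SHA-1 into the 5-char prefix that is sent and the suffix kept local."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]
```

The caller would request the range for the prefix and reject the password if the suffix appears in the response. The network call is left out of this sketch.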

Hashing: argon2id with m=64MiB, t=3, p=1, salt=16 bytes, hash=32 bytes. Parameters chosen to target ~100ms on production-tier hardware. Review annually; raise cost when hardware allows.

No rotation mandate: following NIST SP 800-63B guidance, passwords are not force-rotated on a schedule, because forced periodic rotation degrades security. Rotation is user-initiated or triggered by compromise detection.

6. Session Management

6.1 Access Token

  • Format: JWT (EdDSA Ed25519)
  • Lifetime: 15 minutes
  • Storage: client memory (never localStorage on web)
  • Transport: Authorization: Bearer <token>
  • Audience: ghasi-platform
  • Revocation: implicit via short TTL; immediate revocation via session revocation + client logout

6.2 Refresh Token

  • Format: opaque random 256-bit (base64url)
  • Lifetime: 30 days (sliding)
  • Storage: HTTP-only, Secure, SameSite=Strict cookie OR mobile keychain
  • Rotation: single-use; rotated on every refresh
  • Revocation: explicit via logout OR session family revocation on reuse detection
  • Transport: refresh endpoint only, never in URL
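
A minimal sketch of minting such a token and the web cookie that carries it, using only the standard library. The cookie name `refresh_token` and the `/refresh` path are assumptions made for illustration.

```python
import secrets
from http.cookies import SimpleCookie


def new_refresh_token() -> str:
    # 32 random bytes = 256 bits, rendered as 43 base64url characters
    return secrets.token_urlsafe(32)


def refresh_cookie_header(token: str, max_age_days: int = 30,
                          path: str = "/refresh") -> str:
    """Build the value of a Set-Cookie header for the refresh token."""
    cookie = SimpleCookie()
    cookie["refresh_token"] = token
    morsel = cookie["refresh_token"]
    morsel["httponly"] = True                 # not readable from JavaScript
    morsel["secure"] = True                   # HTTPS only
    morsel["samesite"] = "Strict"
    morsel["max-age"] = max_age_days * 86400  # reset on every rotation (sliding)
    morsel["path"] = path                     # sent to the refresh endpoint only
    return morsel.OutputString()
```

Restricting the cookie path is one way to honor "refresh endpoint only": the browser will not attach the token to any other request.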

6.3 Session Limits

  • Max 10 active sessions per user (configurable per tenant).
  • On overflow, oldest session revoked.
  • Admin can force-revoke any session.
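
The overflow rule can be sketched as below. `ActiveSession` and `admit` are hypothetical names, and "oldest" is assumed to mean earliest creation time rather than least recently used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ActiveSession:
    id: str
    created_at: float


def admit(sessions: List[ActiveSession], new: ActiveSession,
          max_active: int = 10) -> Tuple[List[ActiveSession], List[ActiveSession]]:
    """Add a session and return (kept, revoked), evicting the oldest on overflow."""
    ordered = sorted(sessions + [new], key=lambda s: s.created_at)
    overflow = max(0, len(ordered) - max_active)
    return ordered[overflow:], ordered[:overflow]
```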

6.4 Session Revocation Triggers

| Trigger | Action |
|---|---|
| User logout | Revoke current session |
| Password change | Revoke all sessions |
| MFA change | Revoke all sessions |
| Device revocation | Revoke sessions bound to that device |
| Admin wipe | Revoke all sessions for target user |
| Rotation reuse detected | Revoke entire session family |
| Security incident | Revoke all sessions for affected users |

7. MFA & Step-Up Authentication

7.1 Factor Strength (ordered)

  1. WebAuthn platform authenticator (strongest; phishing-resistant)
  2. WebAuthn roaming authenticator (hardware key)
  3. TOTP
  4. Recovery codes (one-time use)
  5. SMS (deprecated for sensitive scopes)

7.2 Step-Up Requirements

| Operation | Required Factor Strength |
|---|---|
| Normal login | Adaptive (baseline pwd; MFA if risky) |
| Password change | Re-auth within 5 minutes |
| MFA factor add/remove | Re-auth within 5 minutes + existing factor |
| Device revocation | Re-auth within 5 minutes |
| API key creation | Re-auth within 5 minutes |
| Billing changes (tenant-service) | Re-auth within 5 minutes |
| Admin actions | Re-auth within 2 minutes + MFA |

Step-up challenge recorded in audit log with decisionId.
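
A server-side freshness check for the step-up table could be sketched as follows. The operation keys, the `StepUpRule` type, and the choice to require the second factor inside the same window as the re-auth are assumptions, not documented behavior.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class StepUpRule:
    max_age_s: int
    needs_second_factor: bool = False  # "existing factor" or "MFA" in the table


STEP_UP_POLICY: Dict[str, StepUpRule] = {
    "password_change": StepUpRule(300),
    "mfa_factor_change": StepUpRule(300, needs_second_factor=True),
    "device_revocation": StepUpRule(300),
    "api_key_creation": StepUpRule(300),
    "billing_change": StepUpRule(300),
    "admin_action": StepUpRule(120, needs_second_factor=True),
}


def step_up_satisfied(operation: str, now: float, last_reauth_at: float,
                      last_mfa_at: Optional[float] = None) -> bool:
    rule = STEP_UP_POLICY[operation]   # unknown operations raise KeyError
    if now - last_reauth_at > rule.max_age_s:
        return False
    if rule.needs_second_factor:
        return last_mfa_at is not None and now - last_mfa_at <= rule.max_age_s
    return True
```

Letting an unknown operation raise rather than default to "allowed" keeps the check fail-closed.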

7.3 Adaptive MFA

See AI_INTEGRATION for the risk classification layer.

8. Device Binding

8.1 Registration Flow

  1. Client generates Ed25519 keypair on device (secure enclave where available).
  2. Client sends public key + fingerprint to POST /users/me/devices.
  3. Server creates Device aggregate; trusted: false initially.
  4. User verifies possession via existing channel (email or in-app confirmation).
  5. Once possession is confirmed, the server marks the device trusted and sets trustedAt.

8.2 Offline Binding Certificate

Certificate is X.509 format:

  • Subject: CN=dev_01HN..., O=user-{userId}, OU=tenant-{tenantId}
  • Issuer: CN=Ghasi Identity CA, kid=idsvc-2026-01
  • Public Key: device public key
  • Not Before: issuance time
  • Not After: issuance + 90 days
  • Extensions: key usage = digital signature + key encipherment

Signed with CA key via KMS. Never exposed in plaintext beyond the device.

8.3 Revocation

  • Device can be revoked by user or admin.
  • Revocation is propagated via next sync pull.
  • Offline-cached bundles encrypted for a revoked device remain mountable until the bundle's own license envelope expires or the device certificate expires, whichever comes first.
  • Hard revocation (security incident) includes out-of-band push to sync gateway if device is online.

9. API Key Security

  • Raw key format: gk_{env}_{base62-random-40} where env ∈ {live, test}.
  • Shown exactly once at creation.
  • Stored as SHA-256 hash.
  • Prefix (first 8 chars) displayed in UI and logs for identification.
  • Scopes validated against creator's granted permissions; cannot escalate.
  • Tenant-scoped; cannot access cross-tenant resources.
  • Automatic rotation reminders after 90 days (non-blocking).
  • Git-scanner integration: on push, scan for leaked prefixes; revoke immediately if found.
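
Key issuance under these rules might be sketched as below. The function names are hypothetical. One assumption deserves a flag: the first 8 characters of the raw key are always `gk_live_` or `gk_test_`, so this sketch takes the display prefix from the first 8 characters of the random part instead. The source does not say which reading is intended.

```python
import hashlib
import secrets
import string
from typing import Iterable, Tuple

BASE62 = string.digits + string.ascii_letters


def new_api_key(env: str) -> Tuple[str, str, str]:
    """Return (raw_key, sha256_hex, display_prefix). The raw key is shown once."""
    if env not in ("live", "test"):
        raise ValueError("env must be 'live' or 'test'")
    body = "".join(secrets.choice(BASE62) for _ in range(40))
    raw = "gk_%s_%s" % (env, body)
    digest = hashlib.sha256(raw.encode("ascii")).hexdigest()  # only this is stored
    return raw, digest, body[:8]


def scopes_allowed(requested: Iterable[str], granted: Iterable[str]) -> bool:
    """A key may carry only scopes its creator already holds."""
    return set(requested) <= set(granted)
```

A fast unsalted hash is reasonable here, unlike for passwords, because the key body is 40 random base62 characters and cannot be guessed offline.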

10. Audit Logging

Every security-sensitive event is appended to audit_log:

| Event Category | Examples |
|---|---|
| Authentication | Login success/fail, MFA challenge, logout |
| Credential changes | Password change, MFA factor add/remove |
| Session changes | Session revoke, rotation reuse |
| Device changes | Register, trust, revoke, offline bind |
| API key changes | Issue, revoke, unauthorized use |
| Admin actions | Lock, unlock, force logout, erasure |
| Policy violations | Breach-list hit, weak password attempt |

Audit logs are:

  • Append-only (policy enforced).
  • Daily Merkle-anchored; root hash emitted as audit.merkle.anchored.v1.
  • PII-scrubbed in operational logs; full PII only in dedicated audit store with restricted access.
  • Retained 7 years per regulatory class.
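
To illustrate what a daily Merkle anchor commits to, the sketch below computes a root over a day's serialized entries. The tree shape is an assumption: leaf and node hashes use distinct prefixes, and an unpaired node is carried up unchanged. The actual construction behind audit.merkle.anchored.v1 is not described in this document.

```python
import hashlib
from typing import List


def merkle_root(entries: List[bytes]) -> str:
    """Hex root over the entries, in order. An empty day hashes the empty string."""
    if not entries:
        return hashlib.sha256(b"").hexdigest()
    # Distinct prefixes keep a leaf from being mistaken for an inner node
    level = [hashlib.sha256(b"\x00" + e).digest() for e in entries]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest())
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # carry the unpaired node up unchanged
        level = nxt
    return level[0].hex()
```

Changing, dropping, or reordering any entry changes the root, so publishing one hash per day is enough to make later tampering detectable.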

11. Network Security

  • TLS 1.3 only; HSTS preload; certificate pinning for mobile clients.
  • mTLS for inter-service calls inside the cluster (service mesh).
  • WAF rules: OWASP CRS + identity-specific (credential stuffing patterns, OIDC abuse).
  • Geo controls: per-tenant allowlists/blocklists for login origin.
  • Rate limits: per IP, per user, per endpoint — enforced via Redis token bucket.
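
The token-bucket algorithm named in the last bullet can be sketched in memory as follows. In the service the bucket state lives in Redis. The class names, key format, and parameters here are hypothetical.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    def __init__(self, capacity: float, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_s)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class BucketRegistry:
    """One bucket per key, for example 'ip:203.0.113.7' or 'user:42:/login'."""

    def __init__(self, capacity: float, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._args = (capacity, refill_per_s, clock)
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self._buckets.setdefault(key, TokenBucket(*self._args))
        return bucket.allow()
```

A shared Redis implementation needs the refill-and-take step to be atomic, typically via a server-side script, so that concurrent requests cannot overspend the bucket.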

12. Security Headers

All HTTP responses include:

Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Referrer-Policy: no-referrer
Content-Security-Policy: default-src 'none'; frame-ancestors 'none'
Permissions-Policy: camera=(), microphone=(), geolocation=()
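
One way to apply and audit this header set is sketched below. The header values are copied from the list above, while the helper names are hypothetical.

```python
from typing import Dict, List

SECURITY_HEADERS: Dict[str, str] = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains; preload",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "no-referrer",
    "Content-Security-Policy": "default-src 'none'; frame-ancestors 'none'",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}


def with_security_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the response headers with the security set enforced."""
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)  # the security values win over handler-set ones
    return merged


def missing_security_headers(headers: Dict[str, str]) -> List[str]:
    """Names of required headers that are absent (header names compare case-insensitively)."""
    present = {name.lower() for name in headers}
    return [name for name in SECURITY_HEADERS if name.lower() not in present]
```

`missing_security_headers` is meant for contract tests against a live response. The `max-age` of 63072000 seconds equals two years.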

13. Secret Management

| Secret | Storage | Rotation |
|---|---|---|
| JWT signing keys | KMS/HSM | 90 days |
| Device CA signing key | KMS/HSM | 2 years |
| Database password | Vault | 30 days (automated) |
| NATS credentials | Vault | 30 days |
| Argon2 pepper (optional) | Vault | 1 year |
| HMAC secrets (webhooks) | Vault | 90 days |
| SSO provider client secrets | Vault per-tenant | Tenant-driven |

No secrets in:

  • Source code
  • Environment variables of running containers (read from Vault/KMS at startup)
  • Logs
  • Stack traces

14. Incident Response

| Scenario | Playbook |
|---|---|
| Suspected JWT key compromise | Emergency rotation; revoke all sessions; notify |
| Mass credential stuffing detected | Elevate rate limits; force MFA platform-wide temporarily |
| Database breach | Invalidate all refresh tokens; force password reset; notify |
| KMS service outage | Fallback to cached keys for verification only; halt new token issuance |
| Individual account compromise | Lock account; revoke sessions; notify user; investigate |

Runbooks live in docs/runbooks/identity/ (see LOCAL_DEV_SETUP for links).