iam-service — Service Overview
Catalog summary:
docs/03-microservices/iam-service.md · Strategic refs: 02 Enterprise Architecture · 07 Security & Tenancy · ADR-0003 Electron Offline · Standards · NAMING · Standards · ERROR_CODES
1. Purpose
iam-service is the single source of truth for principal identity on the Melmastoon hotel SaaS platform. It owns:
- Who an authenticated principal is (User aggregate + credentials).
- How they proved it (sessions, MFA factors, federated identity, device binding).
- What signed token they hold (access JWT) and how it rotates (refresh family).
- Which programmatic identities exist (API keys).
It does not decide what a principal can do. Authorization (role assignments, property scoping, feature flags, billing entitlements) is enforced by the calling service — typically against claims minted by iam-service and enriched by tenant-service at token-mint time.
2. Bounded Context
| Field | Value |
|---|---|
| Bounded context | Identity & Access |
| Subdomain type | Generic (no business differentiation; commodity capability) |
| Strategic patterns | Open Host Service (JWKS) · Conformist (consumers conform to JWT claim contract) · Customer/Supplier with tenant-service |
| Bounded context map | iam-service ── (publishes JWT) ──▶ all services · iam-service ◀── (membership lifecycle) ── tenant-service |
| Ubiquitous language | Principal, Credential, Session, Refresh family, Device binding, Offline certificate, AMR, JIT provision, Step-up MFA |
3. Responsibilities (in scope)
| # | Responsibility | Detail |
|---|---|---|
| 1 | Account lifecycle | Register, verify email, lock, unlock, anonymize (GDPR). |
| 2 | Credential management | argon2id hashing, breach-list check, rotation history, password reset. |
| 3 | Session lifecycle | Issue access JWT + rotating refresh; revoke; family-revoke on reuse. |
| 4 | MFA enrollment + challenge | TOTP, WebAuthn, recovery codes; adaptive challenge. |
| 5 | Device registration | Per-user fingerprint + Ed25519 public key; trust toggle. |
| 6 | Offline binding | Issue tenant-CA-signed device certificate (≤ 7 d) for Electron desktop. |
| 7 | Federated identity | OIDC + SAML 2.0; JIT user provisioning; nonce/state validation. |
| 8 | Magic link | Email + single-use nonce login (guests, password recovery). |
| 9 | API key issuance | Hashed, scoped, tenant-bound; rotation; revoke. |
| 10 | JWKS publication | Public-key JSON Web Key Set; CDN-cached; rotation overlap. |
| 11 | Audit | Append-only audit log of every security-sensitive event. |
| 12 | GDPR participation | Erasure saga participant; ACK within 7 d. |
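
The single-use nonce behind responsibility 8 can be illustrated with a small in-memory sketch. This is not the service's implementation: the document states the nonce lives in Redis with a 10-minute TTL, while the `Map`, the class name `MagicLinkNonceStore`, and the injectable clock below are assumptions made so the example runs on its own.

```typescript
// Hypothetical in-memory stand-in for the Redis nonce store.
type Clock = () => number; // milliseconds since epoch

interface NonceEntry {
  email: string;
  expiresAt: number;
}

const MAGIC_LINK_TTL_MS = 10 * 60 * 1000; // 10-minute TTL, as stated in D-12

class MagicLinkNonceStore {
  private readonly entries = new Map<string, NonceEntry>();
  private readonly now: Clock;

  constructor(now: Clock = Date.now) {
    this.now = now;
  }

  // Record a freshly generated nonce for the given email address.
  issue(nonce: string, email: string): void {
    this.entries.set(nonce, { email, expiresAt: this.now() + MAGIC_LINK_TTL_MS });
  }

  // Consume a nonce: returns the email at most once, then the nonce is gone.
  // Deleting before the expiry check keeps a replayed or stale link rejected.
  consume(nonce: string): string | null {
    const entry = this.entries.get(nonce);
    if (entry === undefined) return null;
    this.entries.delete(nonce);
    return entry.expiresAt > this.now() ? entry.email : null;
  }
}
```

Against a real Redis deployment the get-and-delete step would need to be atomic so that two concurrent clicks cannot both succeed; the sketch sidesteps this because it is single-threaded.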
4. Non-Responsibilities (explicitly out of scope)
| # | Concern | Owner |
|---|---|---|
| 1 | User profile data (display name, avatar, locale, contact) | tenant-service |
| 2 | Role assignments / RBAC matrices | tenant-service |
| 3 | Property / room access scoping | tenant-service + property-service |
| 4 | Organizational hierarchy (chain → brand → property) | tenant-service |
| 5 | Authorization decisions (can_user_X_do_Y) | each calling service (uses claims) |
| 6 | Billing-related identity (Stripe customer, invoice contact) | billing-service |
| 7 | Notification preferences / channels | notification-service |
| 8 | Audit search / SIEM queries | audit-service (consumes our events) |
5. Dependencies
5.1 Upstream (we depend on)
| Dependency | Relationship | Failure handling |
|---|---|---|
| Cloud KMS (regional) | Synchronous — JWT signing | In-memory cached key (5 min TTL); short outage tolerated. |
| Cloud SQL (Postgres HA) | Synchronous — read/write | Read replica fallback for JWKS / session validate. |
| Memorystore (Redis HA) | Synchronous — session cache, rate limit, nonce | Postgres fallback (degraded latency). |
| Pub/Sub | Asynchronous — outbox publish | Outbox table buffers; retry with backoff. |
| OIDC IdPs (Google, Microsoft, custom) | Synchronous — SSO callback | Circuit breaker; fallback to local password. |
| SAML IdPs (enterprise) | Synchronous — SSO callback | Circuit breaker; cached metadata. |
| HIBP / breach-list provider | Async — k-anonymity API | Skip on outage; record audit entry. |
| tenant-service | Asynchronous — consume membership events | Outbox + replay; eventual consistency. |
| Secret Manager | Synchronous on boot | Cached in memory; rotated via SIGHUP. |
| SMTP (transactional) | Asynchronous via notification-service | Queue-and-retry; never block login. |
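
The circuit-breaker handling listed for the OIDC and SAML IdPs can be sketched as a small state machine. The class name, the threshold and cooldown parameters, and the three-method interface are all assumptions for illustration; the document does not specify how the breaker is built.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical breaker: opens after N consecutive failures, then allows
// trial calls once the cooldown has elapsed.
class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Ask before calling the IdP; false means "use the fallback path".
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: let trial calls through
    }
    return this.state !== "open";
  }

  // A successful call closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  // A failure during a trial, or too many failures in a row, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

A caller would check `allowRequest()` before the SSO callback exchange, report the outcome with `recordSuccess()` or `recordFailure()`, and take the fallback from the table (local password, cached metadata) whenever the breaker refuses the call.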
5.2 Downstream (depend on us)
| Consumer | What they consume | Coupling |
|---|---|---|
| Every service | JWT iss + JWKS for token verification | Conformist (CF). Breaking JWT shape = platform-wide outage. |
| tenant-service | melmastoon.iam.user.registered.v1, …locked.v1, …erased.v1 | Saga partner; provisions / disables membership. |
| notification-service | melmastoon.iam.password.reset_requested.v1, …magic_link.requested.v1 | Sends transactional email/SMS. |
| audit-service | All melmastoon.iam.* events (regulated retention) | Append-only ingest. |
| sync-service | melmastoon.iam.device.bound_for_offline.v1, …session.revoked.v1 | Drives offline bundle (un)provision. |
| bff-backoffice | Login / refresh / device endpoints | Direct REST. |
| bff-tenant-booking | Login / register / magic-link | Direct REST. |
| bff-consumer | Guest magic-link, social SSO | Direct REST. |
| Electron desktop | Login + device-bind + offline cert | REST + sync. |
| Tenant mobile app | Login + WebAuthn + push MFA | REST. |
| ai-orchestrator-service | Adaptive-MFA risk classification (we are consumer) | OHS via internal mTLS. |
6. Architecture Diagram (8 sections)

                        ┌──────────────────────────┐
                        │ 1. Edge / API Gateway    │
                        │ (Cloud Armor + WAF)      │
                        │ rate-limit, geo, anti-bot│
                        └────────────┬─────────────┘
                                     │
                  ┌──────────────────┴──────────────────┐
                  ▼                                     ▼
       ┌──────────────────────┐              ┌──────────────────────┐
       │ 2. Presentation      │              │ 8. JWKS Edge         │
       │ NestJS controllers   │              │/.well-known/jwks.json│
       │ /api/v1/auth/*       │              │ CDN, 5 min TTL       │
       └──────────┬───────────┘              └──────────┬───────────┘
                  │                                     │
                  ▼                                     │
       ┌──────────────────────┐                         │
       │ 3. Application       │                         │
       │ Use cases:           │                         │
       │ RegisterUser         │                         │
       │ LoginWithPassword    │                         │
       │ LoginWithSSO         │                         │
       │ RotateRefreshToken   │                         │
       │ EnrollMFA / Bind     │                         │
       └──────────┬───────────┘                         │
                  │                                     │
                  ▼                                     │
       ┌──────────────────────┐                         │
       │ 4. Domain (pure TS)  │                         │
       │ User, Credential,    │                         │
       │ Session, Device,     │                         │
       │ MFAFactor, APIKey,   │                         │
       │ ExternalIdentity     │                         │
       └──────────┬───────────┘                         │
                  │                                     │
                  ▼                                     │
       ┌──────────────────────┐                         │
       │ 5. Infrastructure    │                         │
       │ Adapters: PgRepo,    │                         │
       │ RedisCache, KMSSigner│                         │
       │ OIDC/SAML clients,   │                         │
       │ MailerPort,          │                         │
       │ AIClassifierPort     │                         │
       └──────────┬───────────┘                         │
                  │                                     │
       ┌──────────┼────────────┬──────────┬─────────────┤
       ▼          ▼            ▼          ▼             ▼
    ┌──────┐ ┌──────────┐ ┌─────────┐ ┌────────┐ ┌──────────┐
    │Cloud │ │  Redis   │ │   KMS   │ │ Pub/Sub│ │ External │
    │ SQL  │ │(sessions │ │ (sign)  │ │(outbox)│ │   IdPs   │
    │ +RLS │ │ + nonce) │ │         │ │        │ │OIDC/SAML │
    └──────┘ └──────────┘ └─────────┘ └────────┘ └──────────┘
       ▲                                  │
       │  6. Outbox Relay (worker)        │
       │  ──────────────────────────────
       │  reads outbox → publishes Pub/Sub
       │
       │  7. Inbox Consumer (worker)
       │  ──────────────────────────
       │  subscribes to melmastoon.tenant.*
       │  applies side effects (provision, revoke)
       ▼
    Audit log (append-only)

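
The outbox relay in section 6 of the diagram (read the outbox, publish to Pub/Sub, retry with backoff) might look roughly like the following single pass. The row shape, the `publish` callback, and the backoff values are hypothetical; the real relay would read rows from Postgres and call the Pub/Sub client.

```typescript
// Hypothetical outbox row; the column names are illustrative only.
interface OutboxRow {
  id: number;
  eventType: string; // e.g. "melmastoon.iam.user.registered.v1"
  payload: string;
  attempts: number;
  nextAttemptAt: number; // epoch milliseconds
  publishedAt: number | null;
}

// Stand-in for the broker call: returns true when the publish was acknowledged.
type Publish = (row: OutboxRow) => boolean;

// Exponential backoff: 1 s, 2 s, 4 s, ... capped at 60 s (assumed values).
function backoffMs(attempts: number): number {
  return Math.min(1000 * Math.pow(2, attempts - 1), 60000);
}

// One relay pass: publish every row that is due, reschedule the ones that fail.
// Returns the number of rows published in this pass.
function relayOnce(rows: OutboxRow[], publish: Publish, now: number): number {
  let published = 0;
  for (const row of rows) {
    if (row.publishedAt !== null || row.nextAttemptAt > now) continue;
    if (publish(row)) {
      row.publishedAt = now;
      published += 1;
    } else {
      row.attempts += 1;
      row.nextAttemptAt = now + backoffMs(row.attempts);
    }
  }
  return published;
}
```

Because a row is only marked published after the broker acknowledges it, a crash between publish and mark produces a duplicate rather than a loss, so consumers are assumed to deduplicate.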
7. Key Design Decisions
| # | Decision | Rationale | ADR |
|---|---|---|---|
| D-01 | Identity ≠ Profile. Only auth-side projection lives here. | Avoid the "god user table" anti-pattern; lets tenant-service evolve role/property model independently. | ADR-0001 |
| D-02 | EdDSA Ed25519 for JWT signing. | Smaller signatures, faster verify, no parameter pitfalls vs RSA-PSS. | This service. |
| D-03 | Rotating refresh tokens with reuse detection. | Detects token theft within one rotation cycle (≤ access-TTL). | RFC 6819 §5.2.2.3. |
| D-04 | argon2id for password & API key hashing. Params: m=64MB, t=3, p=1. | Memory-hard; OWASP 2026 baseline. | SECURITY_MODEL §5. |
| D-05 | Device-bound offline certificate (Electron). Ed25519 device key, signed by tenant CA, ≤ 7 days. | Lets backoffice run offline without server reachability while keeping revocation possible on next sync. | ADR-0003 |
| D-06 | JWT carries tid (single home tenant) + tids[] (cross-tenant operators). | Clean tenant scoping for staff and chain operators alike. | This service. |
| D-07 | All login failures emit melmastoon.iam.user.login_failed.v1. | Single event for SIEM regardless of failure reason; reason is a payload field. | This service. |
| D-08 | No JWT denylist; rely on short access TTL + immediate refresh-token revocation. | Keeps verification stateless and cheap. | Accepted risk in SERVICE_RISK_REGISTER. |
| D-09 | Pluggable identity backend behind ports (in-house, Keycloak, Firebase). | Allows enterprise tenants to bring own IdP without code changes. | This service. |
| D-10 | All write endpoints require Idempotency-Key. | Network retries are common at the edge; double-charge / double-register is unacceptable. | docs/05-api-design.md. |
| D-11 | Adaptive MFA via ai-orchestrator-service only. No direct model in this service. | Centralized AI gateway, provenance, budget control. | Standards. |
| D-12 | Magic-link nonce in Redis with 10-min TTL + single-use. | Replay-resistant; works for guests with no password. | This service. |
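
Decision D-03 (rotating refresh tokens with reuse detection) is easier to follow with a toy model. The sketch below is an assumption-laden illustration, not the service's code: it keeps tokens in a `Map`, uses predictable placeholder token strings, and names such as `RefreshTokenStore` and `issueFamily` are invented for the example.

```typescript
interface RefreshRecord {
  familyId: string;
  used: boolean;
  revoked: boolean;
}

class RefreshTokenStore {
  private readonly tokens = new Map<string, RefreshRecord>();
  private counter = 0;

  // Start a new refresh family, for example at login.
  issueFamily(familyId: string): string {
    return this.mint(familyId);
  }

  // Exchange a refresh token for its successor.
  // Presenting an already-spent token is treated as theft: the whole family
  // is revoked, so neither the attacker nor the legitimate client can continue.
  rotate(token: string): string | null {
    const rec = this.tokens.get(token);
    if (rec === undefined || rec.revoked) return null;
    if (rec.used) {
      this.revokeFamily(rec.familyId);
      return null;
    }
    rec.used = true;
    return this.mint(rec.familyId);
  }

  private revokeFamily(familyId: string): void {
    this.tokens.forEach((rec) => {
      if (rec.familyId === familyId) rec.revoked = true;
    });
  }

  private mint(familyId: string): string {
    this.counter += 1;
    // Placeholder value; a real token would be random and stored hashed.
    const token = familyId + "." + this.counter;
    this.tokens.set(token, { familyId, used: false, revoked: false });
    return token;
  }
}
```

After a family revoke the user has to log in again, which is the intended cost of detecting that a refresh token was copied.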
8. Hotel-Specific Concerns
| # | Concern | Resolution |
|---|---|---|
| 1 | Front-desk shift handover. Multiple staff may use the same physical workstation. | Each user authenticates with their own credentials; access JWT TTL is 15 min; lock-screen on idle ≥ 5 min triggers re-auth (handled by Electron shell). |
| 2 | Housekeeping mobile devices (cheap Android tablets). | WebAuthn fallback to TOTP; push-MFA via notification-service; device binding optional but encouraged. |
| 3 | Property in low-connectivity area. | Electron desktop offline cert (≤ 7 d) signed by tenant CA; refresh works against local cert; no server roundtrip for token rotation. |
| 4 | Chain operator logs into multiple properties. | JWT tids[] lists every authorized tenant; tid is the active home; bff-backoffice switches tid via /api/v1/auth/switch-tenant. |
| 5 | Guest direct booking. | Magic-link or social SSO; no password required; userType='guest'; cannot device-bind, cannot acquire API keys. |
| 6 | Platform admin (Melmastoon ops). | userType='platform_admin', tenant-less, MFA mandatory, hardware key required from M2. |
| 7 | PMS migration. Importing tenants bring legacy staff accounts. | CSV importer (Phase 2); maps legacy email → new User; forces password reset on first login. See MIGRATION_PLAN. |
| 8 | Right-to-be-forgotten for past hotel guests. | GDPR erasure saga purges identity rows; reservation history retained as anonymized projection by reservation-service. |
| 9 | Lock integration. Smart-lock services (lock-service) need short-lived API keys per property. | Issued by iam-service, scope lock:dispense, tenant + property bound, 24-h TTL with auto-rotate. See 09 Lock Integration. |
| 10 | Multi-language UI. Pashto / Dari / English / Arabic users. | iam-service is locale-agnostic; error codes (MELMASTOON.IAM.*) carry userMessageKey resolved by frontend i18n bundle. |
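
The tenant switch described in concern 4 amounts to a membership check on `tids[]` before a new token is minted. A minimal sketch follows, assuming a claims object with only the fields named in this document plus the standard `sub`; the function name and error text are hypothetical, and signing the resulting claims is left out.

```typescript
// Subset of access-token claims; only tid and tids come from this document.
interface AccessClaims {
  sub: string;    // principal identifier (standard JWT claim)
  tid: string;    // active home tenant
  tids: string[]; // every tenant the principal is authorized for
}

// Returns the claims for a new token with the active tenant switched.
// Throws when the target tenant is not in the principal's authorized list.
function switchTenant(claims: AccessClaims, targetTid: string): AccessClaims {
  if (claims.tids.indexOf(targetTid) === -1) {
    throw new Error("tenant not authorized for this principal: " + targetTid);
  }
  return { ...claims, tid: targetTid };
}
```

The input claims are left untouched; the caller would sign the returned object into a fresh access JWT rather than mutating the current one.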
9. Service Tier & Tier-Aligned Targets
| Dimension | Target |
|---|---|
| Service tier | T0 (platform-critical) |
| Availability SLO | 99.99% (≤ 4.38 min downtime / month) |
| JWKS availability | 99.999% |
| Login latency p99 | < 800 ms |
| Refresh latency p95 | < 100 ms |
| Recovery time objective (RTO) | 60 min cross-region |
| Recovery point objective (RPO) | 5 min |
| On-call rotation | 24×7 follow-the-sun |
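
The downtime figure in the availability row follows from simple arithmetic, reproduced here as a check. The only assumption is the month length: the 4.38-minute value corresponds to an average month of 365.25 / 12 = 30.4375 days (43,830 minutes), not a 30-day month.

```typescript
// Minutes in an average month: 30.4375 days * 24 h * 60 min = 43,830.
const MINUTES_PER_MONTH = (365.25 / 12) * 24 * 60;

// Allowed downtime per month, in minutes, for a given availability target.
function downtimeBudgetMinutes(sloPercent: number): number {
  return MINUTES_PER_MONTH * (1 - sloPercent / 100);
}

const serviceBudget = downtimeBudgetMinutes(99.99); // about 4.38 minutes
const jwksBudget = downtimeBudgetMinutes(99.999);   // about 0.44 minutes (roughly 26 seconds)
```

On that basis the JWKS target leaves about one tenth of the service budget, which is consistent with serving the key set from a CDN in front of the service.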
10. Cross-Bundle Index
See deep-bundle index in docs/03-microservices/iam-service.md.