iam-service — Service Overview

Catalog summary: docs/03-microservices/iam-service.md · Strategic refs: 02 Enterprise Architecture · 07 Security & Tenancy · ADR-0003 Electron Offline · Standards · NAMING · Standards · ERROR_CODES

1. Purpose

iam-service is the single source of truth for principal identity on the Melmastoon hotel SaaS platform. It owns:

  • Who an authenticated principal is (User aggregate + credentials).
  • How they proved it (sessions, MFA factors, federated identity, device binding).
  • What signed token they hold (access JWT) and how it rotates (refresh family).
  • Which programmatic identities exist (API keys).

It does not decide what a principal can do. Authorization (role assignments, property scoping, feature flags, billing entitlements) is enforced by the calling service — typically against claims minted by iam-service and enriched by tenant-service at token-mint time.
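
As a rough illustration of that split, a calling service might make its own decision from the token claims along the following lines. This is a minimal sketch, not the platform's actual contract: `tid` and `tids` are the claim names used in this document, while the `roles` claim, the `AccessRequest` shape, and the `authorize` function are hypothetical names introduced only for the example.

```typescript
// Hypothetical claim shape. `tid` / `tids` appear in this document;
// `roles` is an ASSUMED enrichment added by tenant-service at mint time.
interface AccessClaims {
  sub: string;
  tid: string;
  tids: string[];
  roles: string[];
}

// Hypothetical description of what the caller is trying to do.
interface AccessRequest {
  tenantId: string;
  requiredRole: string;
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// The calling service, not iam-service, makes the authorization decision.
function authorize(claims: AccessClaims, req: AccessRequest): Decision {
  // The resource must belong to the tenant the token is currently scoped to.
  if (claims.tid !== req.tenantId) {
    return { allowed: false, reason: "tenant_mismatch" };
  }
  // The principal must hold the role the operation requires.
  if (!claims.roles.includes(req.requiredRole)) {
    return { allowed: false, reason: "missing_role" };
  }
  return { allowed: true };
}
```

The sketch assumes the token signature and expiry were already verified against the JWKS before `authorize` is called.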

2. Bounded Context

| Field | Value |
|---|---|
| Bounded context | Identity & Access |
| Subdomain type | Generic (no business differentiation; commodity capability) |
| Strategic patterns | Open Host Service (JWKS) · Conformist (consumers conform to JWT claim contract) · Customer/Supplier with tenant-service |
| Bounded context map | iam-service ── (publishes JWT) ──▶ all services · iam-service ◀── (membership lifecycle) ── tenant-service |
| Ubiquitous language | Principal, Credential, Session, Refresh family, Device binding, Offline certificate, AMR, JIT provision, Step-up MFA |

3. Responsibilities (in scope)

| # | Responsibility | Detail |
|---|---|---|
| 1 | Account lifecycle | Register, verify email, lock, unlock, anonymize (GDPR). |
| 2 | Credential management | argon2id hashing, breach-list check, rotation history, password reset. |
| 3 | Session lifecycle | Issue access JWT + rotating refresh; revoke; family-revoke on reuse. |
| 4 | MFA enrollment + challenge | TOTP, WebAuthn, recovery codes; adaptive challenge. |
| 5 | Device registration | Per-user fingerprint + Ed25519 public key; trust toggle. |
| 6 | Offline binding | Issue tenant-CA-signed device certificate (≤ 7 d) for Electron desktop. |
| 7 | Federated identity | OIDC + SAML 2.0; JIT user provisioning; nonce/state validation. |
| 8 | Magic link | Email + single-use nonce login (guests, password recovery). |
| 9 | API key issuance | Hashed, scoped, tenant-bound; rotation; revoke. |
| 10 | JWKS publication | Public-key JSON Web Key Set; CDN-cached; rotation overlap. |
| 11 | Audit | Append-only audit log of every security-sensitive event. |
| 12 | GDPR participation | Erasure saga participant; ACK within 7 d. |
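
The "family-revoke on reuse" behaviour in responsibility 3 can be sketched as follows. This is an in-memory illustration under stated assumptions, not the service's implementation: the class name `RefreshFamilyStore`, its methods, and the deterministic token format are hypothetical, and a real store would persist records and mint unguessable random tokens.

```typescript
type RotateResult =
  | { ok: true; refreshToken: string }
  | { ok: false; reason: "unknown_token" | "family_revoked" | "reuse_detected" };

interface RefreshRecord {
  familyId: string;
  used: boolean;
}

// Hypothetical in-memory store; names are illustrative only.
class RefreshFamilyStore {
  private tokens = new Map<string, RefreshRecord>();
  private revokedFamilies = new Set<string>();
  private counter = 0;

  // Deterministic ids keep the sketch testable; real tokens must be random.
  private mint(familyId: string): string {
    this.counter += 1;
    const token = `rt-${familyId}-${this.counter}`;
    this.tokens.set(token, { familyId, used: false });
    return token;
  }

  // Called at login: starts a new family and returns its first refresh token.
  startFamily(familyId: string): string {
    return this.mint(familyId);
  }

  // Called at refresh: each token may be exchanged exactly once.
  rotate(presented: string): RotateResult {
    const record = this.tokens.get(presented);
    if (!record) return { ok: false, reason: "unknown_token" };
    if (this.revokedFamilies.has(record.familyId)) {
      return { ok: false, reason: "family_revoked" };
    }
    if (record.used) {
      // A second presentation of the same token suggests theft:
      // revoke every token descended from the same login.
      this.revokedFamilies.add(record.familyId);
      return { ok: false, reason: "reuse_detected" };
    }
    record.used = true;
    return { ok: true, refreshToken: this.mint(record.familyId) };
  }
}
```

Once a family is revoked, even the most recently issued token in it is rejected, so both the legitimate client and the attacker must log in again.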

4. Non-Responsibilities (explicitly out of scope)

| # | Concern | Owner |
|---|---|---|
| 1 | User profile data (display name, avatar, locale, contact) | tenant-service |
| 2 | Role assignments / RBAC matrices | tenant-service |
| 3 | Property / room access scoping | tenant-service + property-service |
| 4 | Organizational hierarchy (chain → brand → property) | tenant-service |
| 5 | Authorization decisions (can_user_X_do_Y) | each calling service (uses claims) |
| 6 | Billing-related identity (Stripe customer, invoice contact) | billing-service |
| 7 | Notification preferences / channels | notification-service |
| 8 | Audit search / SIEM queries | audit-service (consumes our events) |

5. Dependencies

5.1 Upstream (we depend on)

| Dependency | Relationship | Failure handling |
|---|---|---|
| Cloud KMS (regional) | Synchronous: JWT signing | In-memory cached key (5 min TTL); short outage tolerated. |
| Cloud SQL (Postgres HA) | Synchronous: read/write | Read replica fallback for JWKS / session validate. |
| Memorystore (Redis HA) | Synchronous: session cache, rate limit, nonce | Postgres fallback (degraded latency). |
| Pub/Sub | Asynchronous: outbox publish | Outbox table buffers; retry with backoff. |
| OIDC IdPs (Google, Microsoft, custom) | Synchronous: SSO callback | Circuit breaker; fallback to local password. |
| SAML IdPs (enterprise) | Synchronous: SSO callback | Circuit breaker; cached metadata. |
| HIBP / breach-list provider | Asynchronous: k-anonymity API | Skip on outage; record audit entry. |
| tenant-service | Asynchronous: consume membership events | Outbox + replay; eventual consistency. |
| Secret Manager | Synchronous on boot | Cached in memory; rotated via SIGHUP. |
| SMTP (transactional) | Asynchronous via notification-service | Queue-and-retry; never block login. |
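
The "circuit breaker; fallback" handling listed for the OIDC and SAML IdPs could look roughly like the sketch below. It is a synchronous, single-process illustration under stated assumptions: the `CircuitBreaker` class, its threshold and reset parameters, and the injected clock are all hypothetical, and the document does not say which breaker library or settings the service actually uses.

```typescript
type BreakerState = "closed" | "open" | "half_open";

// Hypothetical breaker; thresholds and names are illustrative only.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private failureThreshold: number;
  private resetAfterMs: number;
  private now: () => number;

  constructor(failureThreshold: number, resetAfterMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  // An open breaker moves to half-open once the reset window has elapsed.
  currentState(): BreakerState {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = "half_open";
    }
    return this.state;
  }

  // Runs `primary` unless the breaker is open; uses `fallback` otherwise.
  call<T>(primary: () => T, fallback: () => T): T {
    const stateBefore = this.currentState();
    if (stateBefore === "open") return fallback();
    try {
      const result = primary();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch {
      this.failures += 1;
      // A failed half-open probe, or too many failures, opens the breaker.
      if (stateBefore === "half_open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      return fallback();
    }
  }
}
```

In the SSO case, `primary` would stand for the IdP callback exchange and `fallback` for offering local password login, so an IdP outage degrades the login options instead of blocking login entirely.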

5.2 Downstream (depend on us)

| Consumer | What they consume | Coupling |
|---|---|---|
| Every service | JWT iss + JWKS for token verification | Conformist (CF). Breaking the JWT shape causes a platform-wide outage. |
| tenant-service | melmastoon.iam.user.registered.v1, …locked.v1, …erased.v1 | Saga partner; provisions / disables membership. |
| notification-service | melmastoon.iam.password.reset_requested.v1, …magic_link.requested.v1 | Sends transactional email/SMS. |
| audit-service | All melmastoon.iam.* events (regulated retention) | Append-only ingest. |
| sync-service | melmastoon.iam.device.bound_for_offline.v1, …session.revoked.v1 | Drives offline bundle (un)provision. |
| bff-backoffice | Login / refresh / device endpoints | Direct REST. |
| bff-tenant-booking | Login / register / magic-link | Direct REST. |
| bff-consumer | Guest magic-link, social SSO | Direct REST. |
| Electron desktop | Login + device-bind + offline cert | REST + sync. |
| Tenant mobile app | Login + WebAuthn + push MFA | REST. |
| ai-orchestrator-service | Adaptive-MFA risk classification (we are the consumer) | OHS via internal mTLS. |

6. Architecture Diagram (8 sections)

                        ┌───────────────────────────┐
                        │ 1. Edge / API Gateway     │
                        │ (Cloud Armor + WAF)       │
                        │ rate-limit, geo, anti-bot │
                        └─────────────┬─────────────┘
                                      │
                  ┌───────────────────┴──────────────────┐
                  ▼                                      ▼
      ┌────────────────────────┐             ┌────────────────────────┐
      │ 2. Presentation        │             │ 8. JWKS Edge           │
      │ NestJS controllers     │             │ /.well-known/jwks.json │
      │ /api/v1/auth/*         │             │ CDN, 5 min TTL         │
      └───────────┬────────────┘             └───────────┬────────────┘
                  │                                      │
                  ▼                                      │
      ┌────────────────────────┐                         │
      │ 3. Application         │                         │
      │ Use cases:             │                         │
      │   RegisterUser         │                         │
      │   LoginWithPassword    │                         │
      │   LoginWithSSO         │                         │
      │   RotateRefreshToken   │                         │
      │   EnrollMFA / Bind     │                         │
      └───────────┬────────────┘                         │
                  │                                      │
                  ▼                                      │
      ┌────────────────────────┐                         │
      │ 4. Domain (pure TS)    │                         │
      │ User, Credential,      │                         │
      │ Session, Device,       │                         │
      │ MFAFactor, APIKey,     │                         │
      │ ExternalIdentity       │                         │
      └───────────┬────────────┘                         │
                  │                                      │
                  ▼                                      │
      ┌────────────────────────┐                         │
      │ 5. Infrastructure      │                         │
      │ Adapters: PgRepo,      │                         │
      │ RedisCache, KMSSigner, │                         │
      │ OIDC/SAML clients,     │                         │
      │ MailerPort,            │                         │
      │ AIClassifierPort       │                         │
      └───────────┬────────────┘                         │
                  │                                      │
     ┌────────────┼────────────┬────────────┬────────────┤
     ▼            ▼            ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Cloud    │ │ Redis    │ │ KMS      │ │ Pub/Sub  │ │ External │
│ SQL      │ │ (sessions│ │ (sign)   │ │ (outbox) │ │ IdPs     │
│ +RLS     │ │ + nonce) │ │          │ │          │ │ OIDC/SAML│
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
     ▲                                      │
     │  6. Outbox Relay (worker)            │
     │  ──────────────────────────────
     │  reads outbox → publishes Pub/Sub

     │  7. Inbox Consumer (worker)
     │  ──────────────────────────
     │  subscribes to melmastoon.tenant.*
     │  applies side effects (provision, revoke)

Audit log (append-only)

7. Key Design Decisions

| # | Decision | Rationale | ADR |
|---|---|---|---|
| D-01 | Identity ≠ Profile. Only auth-side projection lives here. | Avoid the "god user table" anti-pattern; lets tenant-service evolve role/property model independently. | ADR-0001 |
| D-02 | EdDSA Ed25519 for JWT signing. | Smaller signatures, faster verify, no parameter pitfalls vs RSA-PSS. | This service. |
| D-03 | Rotating refresh tokens with reuse detection. | Detects token theft within one rotation cycle (≤ access-TTL). | RFC 6819 §5.2.2.3. |
| D-04 | argon2id for password & API key hashing. Params: m=64MB, t=3, p=1. | Memory-hard; OWASP 2026 baseline. | SECURITY_MODEL §5. |
| D-05 | Device-bound offline certificate (Electron). Ed25519 device key, signed by tenant CA, ≤ 7 days. | Lets backoffice run offline without server reachability while keeping revocation possible on next sync. | ADR-0003 |
| D-06 | JWT carries tid (single home tenant) + tids[] (cross-tenant operators). | Clean tenant scoping for staff and chain operators alike. | This service. |
| D-07 | All login failures emit melmastoon.iam.user.login_failed.v1. | Single event for SIEM regardless of failure reason; reason is a payload field. | This service. |
| D-08 | No JWT denylist; rely on short access TTL + immediate refresh-token revocation. | Keeps verification stateless and cheap. | Accepted risk in SERVICE_RISK_REGISTER. |
| D-09 | Pluggable identity backend behind ports (in-house, Keycloak, Firebase). | Allows enterprise tenants to bring own IdP without code changes. | This service. |
| D-10 | All write endpoints require Idempotency-Key. | Network retries are common at the edge; double-charge / double-register is unacceptable. | docs/05-api-design.md. |
| D-11 | Adaptive MFA via ai-orchestrator-service only. No direct model in this service. | Centralized AI gateway, provenance, budget control. | Standards. |
| D-12 | Magic-link nonce in Redis with 10-min TTL + single-use. | Replay-resistant; works for guests with no password. | This service. |
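
The single-use, time-limited nonce in D-12 can be illustrated with a small in-memory stand-in for the Redis store. This is only a sketch: `MagicLinkNonceStore`, its methods, and the injected clock are hypothetical names, and the `Map` here replaces what would be a Redis key with an expiry and an atomic read-and-delete in a real deployment.

```typescript
// 10-minute TTL, as stated in D-12.
const MAGIC_LINK_TTL_MS = 10 * 60 * 1000;

// Hypothetical in-memory stand-in for the Redis nonce store.
class MagicLinkNonceStore {
  private expiresAtByNonce = new Map<string, number>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Called when the magic link is emailed.
  issue(nonce: string): void {
    this.expiresAtByNonce.set(nonce, this.now() + MAGIC_LINK_TTL_MS);
  }

  // Called when the link is followed. The nonce is removed on its first
  // presentation whether or not it is still valid, so it cannot be replayed.
  consume(nonce: string): boolean {
    const expiresAt = this.expiresAtByNonce.get(nonce);
    if (expiresAt === undefined) return false;
    this.expiresAtByNonce.delete(nonce);
    return this.now() < expiresAt;
  }
}
```

Deleting before checking expiry is a deliberate choice in this sketch: an expired nonce is cleaned up on presentation, and a valid one can never succeed twice.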

8. Hotel-Specific Concerns

| # | Concern | Resolution |
|---|---|---|
| 1 | Front-desk shift handover. Multiple staff may use the same physical workstation. | Each user authenticates with their own credentials; access JWT TTL is 15 min; lock-screen on idle ≥ 5 min triggers re-auth (handled by Electron shell). |
| 2 | Housekeeping mobile devices (cheap Android tablets). | WebAuthn fallback to TOTP; push-MFA via notification-service; device binding optional but encouraged. |
| 3 | Property in low-connectivity area. | Electron desktop offline cert (≤ 7 d) signed by tenant CA; refresh works against local cert; no server roundtrip for token rotation. |
| 4 | Chain operator logs into multiple properties. | JWT tids[] lists every authorized tenant; tid is the active home; bff-backoffice switches tid via /api/v1/auth/switch-tenant. |
| 5 | Guest direct booking. | Magic-link or social SSO; no password required; userType='guest'; cannot device-bind, cannot acquire API keys. |
| 6 | Platform admin (Melmastoon ops). | userType='platform_admin', tenant-less, MFA mandatory, hardware key required from M2. |
| 7 | PMS migration. Importing tenants bring legacy staff accounts. | CSV importer (Phase 2); maps legacy email → new User; forces password reset on first login. See MIGRATION_PLAN. |
| 8 | Right-to-be-forgotten for past hotel guests. | GDPR erasure saga purges identity rows; reservation history retained as anonymized projection by reservation-service. |
| 9 | Lock integration. Smart-lock services (lock-service) need short-lived API keys per property. | Issued by iam-service, scope lock:dispense, tenant + property bound, 24-h TTL with auto-rotate. See 09 Lock Integration. |
| 10 | Multi-language UI. Pashto / Dari / English / Arabic users. | iam-service is locale-agnostic; error codes (MELMASTOON.IAM.*) carry userMessageKey resolved by frontend i18n bundle. |
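
For concern 4, the precondition behind a tenant switch can be sketched as a pure function over the claims. The claim names `tid` and `tids` come from this document; the `TenantClaims` and `SwitchResult` types and the `switchTenant` function are hypothetical, and the sketch says nothing about how /api/v1/auth/switch-tenant is actually implemented or how the new token is signed.

```typescript
// `tid` / `tids` are the claim names used in this document.
interface TenantClaims {
  sub: string;
  tid: string;
  tids: string[];
}

type SwitchResult =
  | { ok: true; claims: TenantClaims }
  | { ok: false; reason: "tenant_not_authorized" };

// Hypothetical helper: returns the claims for a re-minted token,
// or a refusal when the target tenant is not in `tids`.
function switchTenant(current: TenantClaims, targetTid: string): SwitchResult {
  if (!current.tids.includes(targetTid)) {
    return { ok: false, reason: "tenant_not_authorized" };
  }
  // Only the active tenant changes; the authorized set stays the same.
  return { ok: true, claims: { ...current, tid: targetTid } };
}
```

Because the function returns a new claims object instead of mutating the old one, the previous token's claims stay intact until it expires.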

9. Service Tier & Tier-Aligned Targets

| Dimension | Target |
|---|---|
| Service tier | T0 (platform-critical) |
| Availability SLO | 99.99% (≤ 4.38 min downtime / month) |
| JWKS availability | 99.999% |
| Login latency p99 | < 800 ms |
| Refresh latency p95 | < 100 ms |
| Recovery time objective (RTO) | 60 min cross-region |
| Recovery point objective (RPO) | 5 min |
| On-call rotation | 24×7 follow-the-sun |
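
The downtime figure attached to the availability SLO follows from simple arithmetic, shown here as a small helper. The function name and the choice of an average month (365.25 / 12 days) are assumptions made for this sketch; the 4.38 min figure in the table is consistent with that month length, whereas a 30-day month would give 4.32 min.

```typescript
// Minutes in an average month: (365.25 / 12) days * 24 h * 60 min = 43,830.
const MINUTES_PER_AVERAGE_MONTH = (365.25 / 12) * 24 * 60;

// Hypothetical helper: allowed downtime per month for a given SLO percentage.
function monthlyDowntimeBudgetMinutes(sloPercent: number): number {
  return MINUTES_PER_AVERAGE_MONTH * (1 - sloPercent / 100);
}

// 99.99%  -> about 4.38 minutes per month
// 99.999% -> about 0.44 minutes (roughly 26 seconds) per month
```

Applied to the JWKS target of 99.999%, the same calculation leaves roughly 26 seconds of unavailability per month.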

10. Cross-Bundle Index

See deep-bundle index in docs/03-microservices/iam-service.md.