DEPLOYMENT_TOPOLOGY — inventory-service

Sibling: LOCAL_DEV_SETUP · OBSERVABILITY · FAILURE_MODES · SECURITY_MODEL

Strategic anchors: 02 §12 GCP Reference Topology · 04 §11 Pub/Sub topology

inventory-service runs on Google Cloud Run (managed, regional) using the platform's standard NestJS base image. It ships as three Cloud Run services to keep their scaling envelopes and IAM scopes independent: the request-handling API, the hold-expiry sweeper, and the calendar-extender / partition-rotator job runner.


1. Runtime

| Property | Value |
| --- | --- |
| Language | TypeScript |
| Runtime | Node 20 LTS |
| Framework | NestJS 10 (composition root only; domain framework-free) |
| Container base | gcr.io/melmastoon-platform/node-20-distroless:<sha> |
| Boot script | node --enable-source-maps dist/main.js (telemetry initialized first) |
| Health endpoints | GET /internal/health (liveness), GET /internal/ready (readiness; checks DB pool + Pub/Sub publisher) |
| Graceful shutdown | SIGTERM → drain in-flight HTTP and inbox handlers; max 30 s |
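The graceful-shutdown row can be illustrated with a small drain helper. This is a minimal sketch, not the service's actual code: `DrainTracker` and `installSigtermHandler` are hypothetical names, and it assumes new work is already being refused once SIGTERM arrives.

```typescript
// Hypothetical helper: tracks in-flight work (HTTP requests, inbox handlers)
// so that shutdown can wait for it, up to a deadline.
export class DrainTracker {
  private readonly inFlight = new Set<Promise<void>>();

  /** Register a unit of work; it is forgotten once it settles. */
  track<T>(work: Promise<T>): Promise<T> {
    // Failures still count as "finished" for draining purposes.
    const settled: Promise<void> = work.then(
      () => undefined,
      () => undefined,
    );
    this.inFlight.add(settled);
    void settled.then(() => this.inFlight.delete(settled));
    return work;
  }

  get pending(): number {
    return this.inFlight.size;
  }

  /** Wait for the work registered so far, or give up after timeoutMs. */
  async drain(timeoutMs: number): Promise<"drained" | "timed_out"> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<"timed_out">((resolve) => {
      timer = setTimeout(() => resolve("timed_out"), timeoutMs);
    });
    const allDone = Promise.all([...this.inFlight]).then(() => "drained" as const);
    try {
      return await Promise.race([allDone, timeout]);
    } finally {
      if (timer !== undefined) clearTimeout(timer);
    }
  }
}

// Assumption: a non-zero exit code is used when the 30 s budget is exceeded.
export function installSigtermHandler(tracker: DrainTracker, maxDrainMs = 30_000): void {
  process.once("SIGTERM", () => {
    void tracker.drain(maxDrainMs).then((outcome) => {
      process.exit(outcome === "drained" ? 0 : 1);
    });
  });
}
```

In a NestJS composition root, each request or inbox handler would be wrapped in `tracker.track(...)`; the framework's own shutdown hooks may be a better fit than a raw signal handler.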

2. Cloud Run services

2.1 inventory-service (API + inbox handlers)

| Setting | Value |
| --- | --- |
| Region | me-central1 (primary), asia-south1 (active for region-pinned tenants) |
| Min replicas | 3 per region |
| Max replicas | 50 per region (search peaks during meta-search bursts) |
| Concurrency per instance | 80 |
| CPU | 2 vCPU (always-allocated) |
| Memory | 1 GiB |
| VPC connector | melmastoon-private-connector |
| Egress | private VPC (Cloud SQL, Memorystore, Pub/Sub via private endpoints) |
| Ingress | internal + load-balancer (Kong upstream) |
| Authentication | IAM (service-to-service); Pub/Sub push principal allowlisted; Cloud Scheduler principal allowlisted |
| Service account | inventory-svc@<project>.iam.gserviceaccount.com |

2.2 inventory-hold-expiry-sweeper (Cloud Run service, single replica)

| Setting | Value |
| --- | --- |
| Schedule | Cloud Scheduler */30 * * * * * (every 30 s) → HTTPS POST to /internal/jobs/expire-holds |
| Min/Max replicas | 1 / 1 (single writer) |
| CPU | 1 vCPU (CPU-on-request) |
| Memory | 512 MiB |
| Service account | inventory-sweeper@<project>.iam.gserviceaccount.com (RLS-bypass restricted to room_allocations; per-row SET app.tenant_id) |

Running a single replica is deliberate. Concurrent sweepers contending for the same hold rows would not corrupt anything, because the work is idempotent and uses FOR UPDATE SKIP LOCKED, but they would burn CPU racing each other.
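A sweep built on FOR UPDATE SKIP LOCKED might look roughly like the following. Only the table name room_allocations comes from this document; the column names (`id`, `status`, `expires_at`), the status values, and the `Queryable` interface are assumptions made for illustration. The per-row `SET app.tenant_id` step is omitted.

```typescript
// Hypothetical minimal query interface; a real pg or Drizzle client would be
// adapted to this shape.
export interface Queryable {
  query(sql: string, params: unknown[]): Promise<{ rows: Array<{ id: string }> }>;
}

// Claims up to $1 expired holds. Rows already locked by another transaction
// are skipped instead of waited on, so overlapping runs never block each other
// and never process the same row twice.
export const EXPIRE_HOLDS_SQL = `
  WITH due AS (
    SELECT id
      FROM room_allocations
     WHERE status = 'held' AND expires_at <= now()
     ORDER BY expires_at
     LIMIT $1
       FOR UPDATE SKIP LOCKED
  )
  UPDATE room_allocations AS ra
     SET status = 'expired'
    FROM due
   WHERE ra.id = due.id
  RETURNING ra.id
`;

/** Runs one bounded sweep and returns the ids of the holds it expired. */
export async function expireHolds(db: Queryable, batchSize = 500): Promise<string[]> {
  const result = await db.query(EXPIRE_HOLDS_SQL, [batchSize]);
  return result.rows.map((row) => row.id);
}
```

Because the `status = 'held'` predicate no longer matches a row once it has been expired, running the statement again is harmless, which is the idempotence the paragraph above relies on.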

2.3 inventory-calendar-jobs (extender, partition rotator, reconciler)

| Setting | Value |
| --- | --- |
| Schedule | three Cloud Scheduler entries: extend-calendar-horizon 02:00 UTC, rotate-partitions 02:30 UTC, reconcile-calendar-summary 03:00 UTC |
| Min/Max replicas | 0 / 2 |
| CPU | 2 vCPU |
| Memory | 2 GiB (large in-memory upsert batches) |
| Service account | inventory-jobs@<project>.iam.gserviceaccount.com (BYPASSRLS for room_type_inventory_daily, availability_calendars) |
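The three daily schedules translate directly into unix-cron expressions. The sketch below captures them as data; the `JobSchedule` shape and the `startMinuteUtc` helper are hypothetical, while the job names, times, and paths are the ones listed in this document.

```typescript
// Hypothetical description of one scheduled job; not a Cloud Scheduler API type.
export interface JobSchedule {
  name: string;
  cron: string; // unix-cron: minute hour day-of-month month day-of-week
  timeZone: "UTC";
  path: string;
}

export const CALENDAR_JOB_SCHEDULES: readonly JobSchedule[] = [
  {
    name: "extend-calendar-horizon",
    cron: "0 2 * * *", // 02:00 UTC daily
    timeZone: "UTC",
    path: "/internal/jobs/extend-calendar-horizon",
  },
  {
    name: "rotate-partitions",
    cron: "30 2 * * *", // 02:30 UTC daily
    timeZone: "UTC",
    path: "/internal/jobs/rotate-partitions",
  },
  {
    name: "reconcile-calendar-summary",
    cron: "0 3 * * *", // 03:00 UTC daily
    timeZone: "UTC",
    path: "/internal/jobs/reconcile-calendar-summary",
  },
];

/** Minutes after midnight UTC for a simple daily "m h * * *" expression. */
export function startMinuteUtc(cron: string): number {
  const [minute, hour] = cron.trim().split(/\s+/);
  return Number(hour) * 60 + Number(minute);
}
```

The jobs are staggered 30 minutes apart, so each one normally has the connection pool to itself unless its predecessor overruns.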

3. Infrastructure dependencies

| Dependency | Provisioning |
| --- | --- |
| Cloud SQL Postgres 15 (HA primary + read replica) | Shared instance with other PMS-core services; schema inventory; per-service IAM database users; btree_gist extension required |
| Memorystore (Redis 7) | Shared with PMS-core for hot caches; namespaced keys inventory:*; used for availability-search 30-second cache and cache-stampede guards |
| GCP Pub/Sub | One topic per produced subject; pull subscriptions for inbox; DLQs per subscription; ordering enabled per <tenantId>:<aggregateId> |
| Cloud KMS | Cloud SQL CMEK for inventory schema; no field-level encryption keys (no PII) |
| Secret Manager | Sync-service device-binding signing keys are referenced (read-only) for snapshot-pull validation |
| Cloud Scheduler | Four cron entries (sweeper 30 s, extender daily, partition rotator daily, reconcile daily) |
| Cloud Storage | Long-term partition export bucket gs://melmastoon-cold-inventory/ |
| VPC Service Controls | Member of melmastoon-prod-perimeter |
| BigQuery | Analytical sink for inventory.* events (90 d operational retention, indefinite analytical retention) |
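The 30-second availability cache with a stampede guard can be sketched as a single-flight cache. This is an in-process stand-in for illustration only; the guard described above lives in Memorystore, and the class name, key layout beyond the `inventory:` prefix, and helper names are assumptions.

```typescript
type CacheEntry<V> = { value: V; expiresAt: number };

// Single-flight TTL cache: concurrent misses for the same key share one load,
// so a burst of identical searches triggers one backend query.
export class SingleFlightCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly pending = new Map<string, Promise<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs = 30_000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string, load: () => Promise<V>): Promise<V> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return Promise.resolve(hit.value);

    const inFlight = this.pending.get(key);
    if (inFlight) return inFlight; // join the load that is already running

    const loading = load()
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => this.pending.delete(key));
    this.pending.set(key, loading);
    return loading;
  }
}

// Hypothetical key helpers following the inventory:* namespace and the
// <tenantId>:<aggregateId> ordering key from the table above.
export const availabilityCacheKey = (tenantId: string, queryHash: string): string =>
  `inventory:availability:${tenantId}:${queryHash}`;

export const orderingKey = (tenantId: string, aggregateId: string): string =>
  `${tenantId}:${aggregateId}`;
```

Across instances the same idea would need a shared lock or marker in Redis; this sketch only deduplicates loads within one process.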

4. Network topology

Internet ──► Kong (Cloud Run) ──► inventory-service (Cloud Run, internal+LB)
                                    │
                                    ├── Cloud SQL (private endpoint)
                                    ├── Memorystore (VPC connector)
                                    ├── Pub/Sub (private service connect)
                                    └── KMS, Secret Manager (private endpoints)

Pub/Sub push ──► inventory-service `/internal/events/*` (IAM-gated)
Cloud Scheduler ──► sweeper `/internal/jobs/expire-holds` (IAM-gated)
Cloud Scheduler ──► jobs `/internal/jobs/{extend-calendar-horizon|rotate-partitions|reconcile-calendar-summary}`

There is no direct public ingress to inventory-service. The only public surface is via bff-tenant-booking-service, bff-consumer-service, and bff-backoffice-service, fronted by Kong and Cloudflare.


5. Deploy & release

| Stage | Mechanism |
| --- | --- |
| Build | GitHub Actions → Cloud Build → distroless image; SBOM + Cosign signature attached |
| Image registry | Artifact Registry gcr.io/melmastoon-platform/inventory-service:<git-sha> |
| Migrations | node-pg-migrate up runs as a Cloud Build step before Cloud Run revision rollout; backwards-compatible only |
| Canary | 5% traffic split for 30 minutes; abort and roll back on alert ladder (RESV-INV-001..014) firing for 10 min; RESV-INV-001 (false overbooking) aborts immediately |
| Rollback | gcloud run services update-traffic --to-revisions=<prev>=100; image stays in registry |
| Promotion | Manual gate from staging to prod; release notes link to PRs and Jepsen run id |
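The canary abort rule reduces to a small decision function. The sketch below is one reading of that rule, with hypothetical type and function names; what happens once the 30-minute window passes cleanly is not specified here, so the function only reports that the window is complete.

```typescript
export type CanaryDecision = "continue" | "abort" | "window_complete";

// Hypothetical view of a currently firing alert.
export interface FiringAlert {
  id: string; // e.g. "RESV-INV-007"
  firingForMs: number;
}

const MINUTE_MS = 60_000;
const CANARY_WINDOW_MS = 30 * MINUTE_MS;
const SUSTAINED_FIRING_MS = 10 * MINUTE_MS;
// Matches RESV-INV-001 through RESV-INV-014.
const LADDER = /^RESV-INV-0(0[1-9]|1[0-4])$/;

export function decideCanary(elapsedMs: number, alerts: readonly FiringAlert[]): CanaryDecision {
  for (const alert of alerts) {
    if (!LADDER.test(alert.id)) continue; // alerts outside the ladder are ignored here
    if (alert.id === "RESV-INV-001") return "abort"; // false overbooking: abort immediately
    if (alert.firingForMs >= SUSTAINED_FIRING_MS) return "abort";
  }
  return elapsedMs >= CANARY_WINDOW_MS ? "window_complete" : "continue";
}
```

An "abort" result would be followed by the rollback command from the table; that wiring is left out.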

6. Resource sizing rationale

  • Min 3 replicas (API): allocation is hot-path; a cold start during the booking saga would push p99 above the 200 ms SLO. Three replicas survive a single-AZ blip and absorb meta-search bursts.
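  • Concurrency 80: the Drizzle pool is sized at 60 connections per instance with overflow blocking, so requests beyond the pool wait for a free connection. At the 3-replica minimum that is 80 × 3 = 240 in-flight requests sharing 60 × 3 = 180 connections to the shared Cloud SQL primary.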
  • Single sweeper: the 30-s sweep batch is bounded (~100–500 expired holds typical) and idempotent; concurrency would only add lock-contention overhead.
  • Calendar jobs separate: large insert batches (extend horizon for hundreds of properties) should not contend with hot-path API instances for connection pool slots.
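The sizing arithmetic can be checked with a small helper. It uses the figures quoted in this document (concurrency 80, pool of 60, 3 to 50 replicas per region); the function and field names are hypothetical.

```typescript
export interface ApiSizing {
  replicas: number;
  concurrencyPerInstance: number;
  poolSizePerInstance: number;
}

export interface LoadEnvelope {
  maxInFlightRequests: number;
  maxDbConnections: number;
}

/** Upper bounds on request concurrency and database connections for one region. */
export function loadEnvelope(sizing: ApiSizing): LoadEnvelope {
  return {
    maxInFlightRequests: sizing.replicas * sizing.concurrencyPerInstance,
    maxDbConnections: sizing.replicas * sizing.poolSizePerInstance,
  };
}

// Figures from this document, per region.
export const AT_MIN_REPLICAS = loadEnvelope({
  replicas: 3,
  concurrencyPerInstance: 80,
  poolSizePerInstance: 60,
}); // 240 requests, 180 connections

export const AT_MAX_REPLICAS = loadEnvelope({
  replicas: 50,
  concurrencyPerInstance: 80,
  poolSizePerInstance: 60,
}); // 4000 requests, 3000 connections
```

The upper bound is the figure to compare against the connection limit of the shared Cloud SQL primary, since other PMS-core services draw on the same instance.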

7. Region & residency

  • me-central1 (Doha): primary; serves Afghan, Iranian (where lawful), GCC, Tajik tenants by default.
  • asia-south1 (Mumbai): secondary; serves South Asia tenants.
  • Tenant pinning is read from tenant.region; cross-region writes are blocked at the connection middleware. Cross-region reads are allowed only for audit-service and analytics-service.
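The pinning rule can be expressed as a pure check that a connection middleware might call before handing out a connection. This is a sketch under assumptions: the request shape and function name are hypothetical, and only the two regions and the two exempt services come from this document.

```typescript
export type Region = "me-central1" | "asia-south1";
export type Operation = "read" | "write";

// Hypothetical description of a data-access attempt.
export interface AccessRequest {
  tenantRegion: Region; // value of tenant.region
  servingRegion: Region; // region of the instance handling the request
  operation: Operation;
  caller: string; // calling service identity
}

const CROSS_REGION_READERS: ReadonlySet<string> = new Set([
  "audit-service",
  "analytics-service",
]);

export function isAccessAllowed(request: AccessRequest): boolean {
  // Same-region traffic is always allowed.
  if (request.tenantRegion === request.servingRegion) return true;
  // Cross-region writes are never allowed.
  if (request.operation === "write") return false;
  // Cross-region reads are limited to the exempt services.
  return CROSS_REGION_READERS.has(request.caller);
}
```

For example, a write for an asia-south1 tenant arriving at a me-central1 instance is rejected regardless of caller, while the same request as a read from audit-service passes.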

8. Cross-references