DEPLOYMENT_TOPOLOGY — bff-tenant-booking-service

Sibling: DATA_MODEL · SECURITY_MODEL · LOCAL_DEV_SETUP

Cross-cutting: 02 Enterprise Architecture · §4 GCP Reference Architecture

1. Runtime

| Property | Value |
| --- | --- |
| Compute | Google Cloud Run (managed) |
| Region (primary) | asia-south1 (Mumbai) |
| Region (DR-warm) | europe-west4 (Eemshaven) |
| Container | Distroless Node 20, multi-stage build, non-root node user, read-only root FS |
| Min instances | 3 (per region) |
| Max instances | 30 (per region; raised to 60 in flashSale mode) |
| Concurrency per instance | 60 |
| CPU | 2 vCPU, always-allocated |
| Memory | 1 GiB |
| Startup latency budget | < 800 ms |
| Request timeout | 30 s (matches longest upstream chain) |
| VPC connector | bff-connector-asia-south1 (private egress) |
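
The sizing rows imply a ceiling on in-flight requests per region: instance count times per-instance concurrency. The following is a minimal sketch of that arithmetic using only the values from the table; the type, constant, and function names are illustrative and not part of the service.

```typescript
// Illustrative names; the numbers come from the Runtime table.
interface Sizing {
  minInstances: number;
  maxInstances: number;
  flashSaleMaxInstances: number;
  concurrencyPerInstance: number;
}

const sizing: Sizing = {
  minInstances: 3,
  maxInstances: 30,
  flashSaleMaxInstances: 60,
  concurrencyPerInstance: 60,
};

// Upper bound on concurrent in-flight requests for a given instance count.
function maxInFlight(instances: number, s: Sizing = sizing): number {
  return instances * s.concurrencyPerInstance;
}

const warmFloor = maxInFlight(sizing.minInstances);             // 180
const normalCeiling = maxInFlight(sizing.maxInstances);         // 1800
const flashCeiling = maxInFlight(sizing.flashSaleMaxInstances); // 3600
```

Per region, the warm floor therefore covers 180 concurrent requests before any scale-out, and flashSale mode doubles the ceiling from 1,800 to 3,600.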

2. Ingress

Tenant subdomain (kabul-grand-hotel.melmastoon.ghasi.io) OR custom domain (booking.<tenant>.com)
        │
        ▼
Cloud DNS (CNAME → GCLB anycast)
        │
        ▼
Global HTTPS Load Balancer
├── Cloud Armor (WAF + bot rules)
├── Cloud CDN (cache for documented public GETs)
└── SNI cert (managed by Certificate Manager; per-tenant for custom domains)
        │
        ▼
Serverless NEG → Cloud Run (bff-tenant-booking-service)

Custom domains: tenants create a DNS CNAME pointing to booking.tenant.melmastoon.ghasi.io; we then provision a managed certificate via Certificate Manager (DNS-validated).

3. Egress (upstream connections)

All upstreams are reached over internal Cloud Run-to-Cloud Run calls through the VPC connector, authenticated with Google ID tokens minted for bff-tenant-sa.

| Upstream | Hostname | Auth | Timeout | Retries |
| --- | --- | --- | --- | --- |
| tenant-service | tenant.melmastoon.internal | Google ID token | 400 ms | 0 |
| theme-config-service | theme.melmastoon.internal | Google ID token | 500 ms | 1 |
| property-service | property.melmastoon.internal | Google ID token | 800 ms | 1 |
| inventory-service | inventory.melmastoon.internal | Google ID token | 700 ms | 1 |
| pricing-service | pricing.melmastoon.internal | Google ID token | 800 ms | 0 (quote); 1 (cheapest) |
| reservation-service | reservation.melmastoon.internal | Google ID token | 1500 ms | 1 (with idem-key) |
| payment-gateway-service | payment.melmastoon.internal | Google ID token | 2000 ms | 0 |
| billing-service | billing.melmastoon.internal | Google ID token | 600 ms | 1 |
| lock-integration-service | lock.melmastoon.internal | Google ID token | 600 ms | 0 (soft) |
| ai-orchestrator-service | ai.melmastoon.internal | Google ID token | 1200 ms | 0 |
| bff-consumer-service | bff-consumer.melmastoon.internal | Google ID token + shared HMAC key | 800 ms | 0 |
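
The timeout and retry columns can be read as a per-upstream latency budget. The sketch below encodes them as a typed map and computes the worst case, assuming every attempt runs to its full timeout, calls are sequential, and there is no backoff delay. The type and function names are hypothetical; pricing-service is entered with its larger retry count (the "cheapest" case).

```typescript
// Timeouts and retry counts copied from the upstream table; names are illustrative.
interface UpstreamPolicy {
  timeoutMs: number;
  retries: number;
}

const upstreams: Record<string, UpstreamPolicy> = {
  "tenant-service": { timeoutMs: 400, retries: 0 },
  "theme-config-service": { timeoutMs: 500, retries: 1 },
  "property-service": { timeoutMs: 800, retries: 1 },
  "inventory-service": { timeoutMs: 700, retries: 1 },
  "pricing-service": { timeoutMs: 800, retries: 1 }, // worst case: "cheapest" retries once
  "reservation-service": { timeoutMs: 1500, retries: 1 },
  "payment-gateway-service": { timeoutMs: 2000, retries: 0 },
  "billing-service": { timeoutMs: 600, retries: 1 },
  "lock-integration-service": { timeoutMs: 600, retries: 0 },
  "ai-orchestrator-service": { timeoutMs: 1200, retries: 0 },
  "bff-consumer-service": { timeoutMs: 800, retries: 0 },
};

// Worst-case wall time for one upstream: every attempt runs to its timeout.
function worstCaseMs(p: UpstreamPolicy): number {
  return p.timeoutMs * (1 + p.retries);
}

// Worst case for a sequential chain of calls (ignores backoff and BFF-side work).
function chainWorstCaseMs(names: string[]): number {
  let total = 0;
  for (const name of names) {
    const policy = upstreams[name];
    if (!policy) throw new Error(`unknown upstream: ${name}`);
    total += worstCaseMs(policy);
  }
  return total;
}

const allSequentialMs = chainWorstCaseMs(Object.keys(upstreams)); // 14800
```

Under these assumptions, even calling every upstream sequentially at its worst case totals 14,800 ms, which sits inside the 30 s request timeout from the Runtime section.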

4. Stateful dependencies

| Dependency | Type | Region | HA |
| --- | --- | --- | --- |
| Memorystore (Redis 7), cache tier | bff-tenant-cache-asia-south1, 5 GiB, standard | asia-south1 | Standby + auto-failover |
| Memorystore (Redis 7), session tier (no eviction) | bff-tenant-session-asia-south1, 3 GiB, standard | asia-south1 | Standby + auto-failover |
| Cloud SQL (Postgres 16) | bff-tenant-db-asia-south1, db-custom-2-8192 | asia-south1 | Regional HA + cross-region read replica in europe-west4 |
| Pub/Sub topics | melmastoon.bff.tenant.* | global | n/a |
| Secret Manager | handoff-hmac (current+previous), pepper, recaptcha | global | replicated automatically |

Two separate Memorystore instances keep cache-eviction pressure in the cache tier from displacing live booking-draft state held in the session tier.

5. CI/CD pipeline

GitHub PR → GitHub Actions
├── Lint + typecheck + unit + integration + contract tests
├── Build container (Cloud Build)
├── Trivy scan (block high/critical CVE)
├── Cosign sign with Fulcio identity
├── Push to Artifact Registry
├── Deploy to dev Cloud Run (no-traffic + smoke test → 100%)
├── Manual approval → stage
└── Manual approval → prod (canary 5% → 25% → 100% over 30 min, with metric guardrails)

Binary Authorization on the prod Cloud Run service requires a valid Cosign signature.
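
The prod canary progression (5% → 25% → 100%, gated by metric guardrails) can be sketched as a small decision function. The traffic steps come from the pipeline above; the guardrail metrics and their thresholds are placeholder assumptions, since the actual guardrails are not specified here.

```typescript
// Traffic steps from the pipeline; guardrail thresholds are placeholder assumptions.
const CANARY_STEPS: number[] = [5, 25, 100];

interface Guardrails {
  errorRate: number;    // fraction of 5xx responses on the canary revision
  p95LatencyMs: number; // canary p95 latency
}

const LIMITS: Guardrails = { errorRate: 0.01, p95LatencyMs: 1500 }; // hypothetical

type Decision =
  | { action: "promote"; percent: number }
  | { action: "rollback" }
  | { action: "done" };

// Decide what to do after observing the canary at its current traffic share.
function nextCanaryStep(currentPercent: number, observed: Guardrails): Decision {
  const breached =
    observed.errorRate > LIMITS.errorRate ||
    observed.p95LatencyMs > LIMITS.p95LatencyMs;
  if (breached) return { action: "rollback" };

  const next = CANARY_STEPS.find((step) => step > currentPercent);
  return next === undefined
    ? { action: "done" }
    : { action: "promote", percent: next };
}
```

For example, a healthy canary at 5% would be promoted to 25%, while any guardrail breach returns a rollback decision regardless of the current step.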

6. Traffic management

  • Default routing: tenant subdomain or custom domain → GCLB → nearest healthy region.
  • Canary control: Cloud Deploy + Cloud Run revisions; rollback completes within 5 minutes if SLO burn is detected.
  • Flash-sale mode: a tenant flag raises prod max-instances to 60; capacity is pre-warmed via Memorystore cache priming.
  • Custom-domain provisioning: Cloud Certificate Manager (DNS-validated); BFF reads tenant→domain map from tenant-service.

7. Configuration

| Source | What |
| --- | --- |
| Cloud Run env vars | Non-secret toggles |
| Secret Manager (file mounts) | All secrets per SECURITY_MODEL §9 |
| Cloud Run Service YAML | Sizing, concurrency, scaling, VPC connector |
| bff-tenant-flags (Memorystore) | Feature flags + sample rates (refresh 30 s) |
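
One way to honor the 30 s refresh for bff-tenant-flags is to serve reads from an in-process snapshot and reload it in the background once it goes stale. The sketch below shows only the snapshot bookkeeping; the class and method names are hypothetical, and the actual Memorystore read is left to the caller.

```typescript
// Hypothetical flag snapshot holder; only the 30 s refresh interval comes from the table.
type FlagValue = boolean | number;

const REFRESH_MS = 30 * 1000;

class FlagSnapshot {
  private flags: Record<string, FlagValue> = {};
  private loadedAt = Number.NEGATIVE_INFINITY;

  // True when the snapshot is older than the refresh interval and should be reloaded.
  isStale(nowMs: number): boolean {
    return nowMs - this.loadedAt >= REFRESH_MS;
  }

  // Called by the refresher after reading bff-tenant-flags from Memorystore.
  replace(flags: Record<string, FlagValue>, nowMs: number): void {
    this.flags = { ...flags };
    this.loadedAt = nowMs;
  }

  // Reads never block on Memorystore; unknown flags fall back to the caller's default.
  get(name: string, fallback: FlagValue): FlagValue {
    const value = this.flags[name];
    return value === undefined ? fallback : value;
  }
}
```

Keeping the clock as an explicit argument makes the staleness rule easy to test, and request handlers keep working off the last good snapshot if a refresh fails.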

8. Networking

  • VPC: melmastoon-prod-vpc.
  • Subnet (Cloud Run connector): bff-tenant-connector-asia-south1 (10.20.5.0/28).
  • Private Service Access for Cloud SQL.
  • Memorystore via VPC connector private IP.
  • Egress NAT: used only for payment-gateway-service provider redirects. These leave from the gateway service, not from this BFF; the entry is kept here for completeness.

9. Cost posture

| Item | Estimated monthly cost @ 150 RPS steady |
| --- | --- |
| Cloud Run | ~$280 |
| Memorystore (cache 5 GiB + session 3 GiB) | ~$280 |
| Cloud SQL (db-custom-2-8192 + HA) | ~$240 |
| Pub/Sub | ~$50 |
| Cloud CDN | ~$30 |
| Cloud Armor | ~$30 |
| Cert Manager (custom domains × 50) | ~$50 |
| Logging + Trace + Monitoring | ~$80 |
| Total | ~$1,040 / month / region |
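
As a quick check of the table, the line items can be summed and converted to a rough cost per million requests. The per-million figure is a derived estimate that assumes a 30-day month at a constant 150 RPS; it does not appear in the table itself.

```typescript
// Line items copied from the cost table (USD per month, per region).
const lineItems: { item: string; usd: number }[] = [
  { item: "Cloud Run", usd: 280 },
  { item: "Memorystore", usd: 280 },
  { item: "Cloud SQL", usd: 240 },
  { item: "Pub/Sub", usd: 50 },
  { item: "Cloud CDN", usd: 30 },
  { item: "Cloud Armor", usd: 30 },
  { item: "Cert Manager", usd: 50 },
  { item: "Logging + Trace + Monitoring", usd: 80 },
];

const totalUsd = lineItems.reduce((sum, row) => sum + row.usd, 0); // 1040

// Assumption: 30-day month at a constant 150 requests per second.
const requestsPerMonth = 150 * 60 * 60 * 24 * 30; // 388,800,000
const usdPerMillionRequests = totalUsd / (requestsPerMonth / 1_000_000); // about 2.67
```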

10. Disaster recovery

  • RPO: 5 min (Cloud SQL PITR; Memorystore is ephemeral cache + best-effort session).
  • RTO: 30 min (DNS + Cloud Run redeploy in DR).
  • Quarterly DR drill: cut traffic to europe-west4; verify booking flow against replicated tenant catalog.
  • Booking drafts in flight at the moment of failover: client receives MELMASTOON.BFF.TENANT.DRAFT_NOT_FOUND and is redirected to re-search; idempotency rows in Cloud SQL prevent double-charges if the user retries confirm after failover.
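
The double-charge protection in the last bullet relies on idempotency rows keyed by the confirm request. The sketch below uses an in-memory map as a stand-in for those Cloud SQL rows; the class, method, and field names are hypothetical, and a real implementation would also need a unique constraint on the key to handle concurrent retries.

```typescript
// In-memory stand-in for the idempotency rows in Cloud SQL; names are hypothetical.
interface ConfirmResult {
  reservationId: string;
  chargeId: string;
}

class IdempotencyStore {
  private rows = new Map<string, ConfirmResult>();

  // Runs `charge` at most once per key; replays return the stored result.
  confirmOnce(key: string, charge: () => ConfirmResult): ConfirmResult {
    const existing = this.rows.get(key);
    if (existing) return existing;
    const result = charge();
    this.rows.set(key, result);
    return result;
  }
}
```

A user who retries confirm after failover with the same key gets the original reservation and charge back, and the payment path is not invoked a second time.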

11. Custom-domain operations

  • Tenants self-serve domain claim in tenant-service via DNS CNAME challenge.
  • Cert Manager auto-provisions via DNS-01.
  • BFF reads tenant.config.customDomains[] and rejects requests for unclaimed domains with 404.
  • DNS CAA records on tenant zones are recommended, authorizing Google Trust Services as the issuer (pki.goog).
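
The host check described above (serve tenant subdomains and claimed custom domains, reject everything else with 404) can be sketched as a pure lookup. The TenantConfig shape and function name are assumptions for illustration; only customDomains[] and the 404 behaviour come from this section, and the platform suffix is taken from the example hostname in the Ingress section.

```typescript
// Hypothetical shapes; only customDomains[] and the 404 behaviour come from the list above.
interface TenantConfig {
  tenantId: string;
  subdomain: string;       // e.g. "kabul-grand-hotel"
  customDomains: string[]; // domains the tenant has claimed
}

const PLATFORM_SUFFIX = ".melmastoon.ghasi.io";

type Resolution = { status: 200; tenantId: string } | { status: 404 };

function resolveTenant(hostHeader: string, tenants: TenantConfig[]): Resolution {
  // Normalize: lowercase, drop an optional port and a trailing dot.
  const host = hostHeader.toLowerCase().replace(/:\d+$/, "").replace(/\.$/, "");

  const match = tenants.find(
    (t) =>
      host === t.subdomain + PLATFORM_SUFFIX ||
      t.customDomains.some((d) => d.toLowerCase() === host)
  );
  return match ? { status: 200, tenantId: match.tenantId } : { status: 404 };
}
```

Matching on exact hostnames, rather than suffixes, keeps an unclaimed domain that merely points at the load balancer from being served as some tenant's site.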

12. Health endpoints

  • /health/live: returns 200 if process running.
  • /health/ready: returns 200 if Memorystore and Postgres are reachable and at least 80% of upstream circuit breakers are closed.
  • Cloud Run liveness + readiness probes pointed at these.
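
The readiness rule can be expressed as a small predicate. This sketch assumes half-open circuits count as not closed and that an empty circuit list passes the ratio check; both are assumptions, as are the type and function names.

```typescript
// Illustrative readiness predicate; the 80% threshold comes from the list above.
type CircuitState = "closed" | "open" | "half-open";

interface ReadinessInput {
  memorystoreOk: boolean;
  postgresOk: boolean;
  circuits: CircuitState[]; // one entry per upstream
}

const MIN_CLOSED_RATIO = 0.8;

function isReady(input: ReadinessInput): boolean {
  const closed = input.circuits.filter((s) => s === "closed").length;
  // Assumption: with no circuits registered, the ratio check is treated as satisfied.
  const ratioOk =
    input.circuits.length === 0 ||
    closed / input.circuits.length >= MIN_CLOSED_RATIO;
  return input.memorystoreOk && input.postgresOk && ratioOk;
}
```

With the 11 upstreams listed in the Egress section, this tolerates at most two non-closed circuits (9/11 ≈ 82%); a third drops the ratio to about 73% and fails readiness.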