file-storage-service

Companion: README · 02 Enterprise Architecture §3 · 05 API Design · 06 Data Models · 07 Security & Tenancy · Bundle

1. Identity

| Field | Value |
| --- | --- |
| Service name | file-storage-service |
| Workspace package | @ghasi/service-file-storage |
| Bounded context | Storage (Generic) |
| Domain class | Generic / supporting |
| Owning team | Platform Infrastructure |
| On-call tier | Tier 2 (data-loss class) |
| Phase | Phase 0 (foundation; required for property photos, invoice PDFs, ID scans, theme assets) |

2. Purpose

file-storage-service is the single platform API for file upload, download, and lifecycle management. It encapsulates Google Cloud Storage (the byte store) behind a tenant-scoped, signed-URL workflow with virus scanning, image optimization, retention enforcement, GDPR erasure, audited access, and per-tenant prefix isolation. Every other service that handles binary content (property-service for photos, notification-service for invoice PDF attachments, billing-service for invoice and receipt scans, reservation-service for guest ID scans, lock-integration-service for vendor reports, tenant-service for logos, theme-config-service for theme assets) goes through this service rather than talking to GCS directly.

It does not generate documents (invoices come from billing-service, notifications from notification-service), it does not decide what to do with files (consumers own that), and it does not own theme tokens or property metadata. It owns bytes, metadata about bytes, and the policies that govern bytes.

3. Aggregates Owned

| Aggregate | Lifecycle | ID prefix |
| --- | --- | --- |
| FileObject | initiated → uploading → uploaded → scanning → ready → quarantined / archived → purged | med_ |
| Bucket (logical, maps 1:1 to a GCS bucket + tenant prefix) | active → draining → archived | bkt_ |
| UploadSession | open → completed → expired / aborted | ups_ |
| Variant (image / video transcode) | pending → ready → failed | var_ |
| ScanResult | pending → passed / failed | scn_ |
| RetentionPolicy | draft → active → superseded | ret_ |
| AccessGrant (signed URL issuance audit row) | issued → expired / revoked | grt_ |

med_ is the canonical platform prefix for any media reference held by other services (declared in NAMING.md).
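
The FileObject lifecycle listed above can be read as a small state machine. The sketch below is one possible encoding, not the service's implementation. The branch from scanning to ready or quarantined, the restore edge from archived back to ready (suggested by the restore endpoint), and the manual-override edge out of quarantined are assumptions drawn from this page. `FileStatus`, `ALLOWED`, and `transition` are hypothetical names.

```python
from enum import Enum


class FileStatus(str, Enum):
    INITIATED = "initiated"
    UPLOADING = "uploading"
    UPLOADED = "uploaded"
    SCANNING = "scanning"
    READY = "ready"
    QUARANTINED = "quarantined"
    ARCHIVED = "archived"
    PURGED = "purged"


# Assumed transition map; the page only gives the linear lifecycle string.
ALLOWED = {
    FileStatus.INITIATED: {FileStatus.UPLOADING},
    FileStatus.UPLOADING: {FileStatus.UPLOADED},
    FileStatus.UPLOADED: {FileStatus.SCANNING},
    FileStatus.SCANNING: {FileStatus.READY, FileStatus.QUARANTINED},
    FileStatus.READY: {FileStatus.ARCHIVED},                          # soft delete
    FileStatus.QUARANTINED: {FileStatus.READY, FileStatus.PURGED},    # manual override (assumed)
    FileStatus.ARCHIVED: {FileStatus.READY, FileStatus.PURGED},       # restore or purge
    FileStatus.PURGED: set(),                                         # terminal
}


def transition(current: FileStatus, target: FileStatus) -> FileStatus:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the map as data makes it easy to review against the lifecycle row and to reject, for example, any attempt to revive a purged file.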

4. Primary APIs

REST under /api/v1, plus mTLS-only internal callbacks under /internal/v1 (full surface in API_CONTRACTS):

  • POST /api/v1/files/uploads — initiate upload, returns signed PUT URL
  • POST /api/v1/files/uploads/{ups}/confirm — confirm bytes uploaded; triggers scan + optimization
  • POST /api/v1/files/uploads/{ups}/abort — abandon partial upload (cleanup)
  • GET /api/v1/files/{med} — metadata
  • POST /api/v1/files/{med}/download-url — issue scoped, time-bound signed download URL
  • GET /api/v1/files/{med}/variants — list image/video variants
  • DELETE /api/v1/files/{med} — soft delete (status → archived)
  • POST /api/v1/files/{med}/restore — restore within retention window
  • GET /api/v1/files/{med}/access-log — audited access history
  • POST /api/v1/files/erasure — GDPR erasure for a guest or tenant scope
  • GET /api/v1/files/quotas — per-tenant usage and quota
  • POST /internal/v1/files/scan-callback — internal callback from scan workers (mTLS)
  • POST /internal/v1/files/optimize-callback — internal callback from Cloud Run optimizer (mTLS)
  • POST /internal/v1/files/cdn/invalidate — internal CDN invalidation (mTLS)
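
The initiate, upload, and confirm endpoints above form a three-step client flow. The following is a minimal sketch of that flow under stated assumptions: the request and response field names (`contentType`, `sizeBytes`, `uploadSessionId`, `signedUrl`, `sha256`, `fileId`) are hypothetical and are not taken from API_CONTRACTS, and `transport` is a hypothetical injected callable standing in for an HTTP client. Only the endpoint paths and the requirement to send a SHA-256 at confirm come from this page.

```python
import hashlib
from typing import Any, Callable, Dict

# Hypothetical transport: (method, url, body) -> parsed response body.
Transport = Callable[[str, str, Any], Dict[str, Any]]


def upload_file(transport: Transport, data: bytes, content_type: str) -> str:
    """Initiate, PUT the bytes to the signed URL, confirm; abort on failure."""
    session = transport(
        "POST",
        "/api/v1/files/uploads",
        {"contentType": content_type, "sizeBytes": len(data)},  # assumed fields
    )
    ups = session["uploadSessionId"]  # assumed response field
    try:
        # The bytes go straight to the signed PUT URL, not through the API.
        transport("PUT", session["signedUrl"], data)  # assumed response field
        confirmed = transport(
            "POST",
            f"/api/v1/files/uploads/{ups}/confirm",
            {"sha256": hashlib.sha256(data).hexdigest()},
        )
    except Exception:
        # Best-effort cleanup of the partial upload, then surface the error.
        transport("POST", f"/api/v1/files/uploads/{ups}/abort", None)
        raise
    return confirmed["fileId"]  # assumed response field
```

Injecting the transport keeps the sketch testable without a network and leaves authentication and retries to the caller.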

5. Events

Published (melmastoon.file.*.v1):

  • melmastoon.file.upload.initiated.v1 · melmastoon.file.upload.completed.v1 · melmastoon.file.upload.failed.v1
  • melmastoon.file.scan.requested.v1 · melmastoon.file.scan.passed.v1 · melmastoon.file.scan.failed.v1 (file is quarantined)
  • melmastoon.file.optimization.completed.v1
  • melmastoon.file.deleted.v1
  • melmastoon.file.access.denied.v1 (cross-tenant or revoked URL attempt)
  • melmastoon.file.retention.expired.v1
  • melmastoon.file.erasure.completed.v1
  • melmastoon.file.bucket.quota_warning.v1

Consumed:

  • melmastoon.tenant.guest.erasure_requested.v1 — purge all files tagged with the guest's gst_ ID.
  • melmastoon.tenant.deleted.v1 — cascade purge of every file under the tenant's prefix (after legal retention window).
  • melmastoon.property.photo.removed.v1 — soft-delete the underlying FileObject for that photo.
  • melmastoon.billing.invoice.issued.v1 — apply the long-term tax_compliance retention policy to the generated PDF (informational; the producer also passes the policy hint at upload time).
  • melmastoon.reservation.checked_out.v1 — ID scans transition into the redaction-eligible bucket per policy.
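
The event names in this section follow a visible pattern: a `melmastoon` root, a domain segment, one or more name segments, and a `v<N>` version suffix. The sketch below parses that shape. It is inferred from the names listed here, not from a published naming specification, and `EventType` and `parse_event_type` are hypothetical names.

```python
import re
from typing import NamedTuple, Tuple


class EventType(NamedTuple):
    domain: str              # e.g. "file", "tenant", "billing"
    name: Tuple[str, ...]    # e.g. ("scan", "passed") or ("deleted",)
    version: int


def parse_event_type(raw: str) -> EventType:
    """Split 'melmastoon.<domain>.<name...>.v<N>' into its parts."""
    parts = raw.split(".")
    if len(parts) < 4 or parts[0] != "melmastoon" or not all(parts):
        raise ValueError(f"not a platform event type: {raw!r}")
    match = re.fullmatch(r"v(\d+)", parts[-1])
    if match is None:
        raise ValueError(f"missing version suffix: {raw!r}")
    return EventType(parts[1], tuple(parts[2:-1]), int(match.group(1)))
```

For example, `parse_event_type("melmastoon.file.scan.passed.v1")` yields domain `file`, name `("scan", "passed")`, and version `1`, which a consumer could use to route by domain and reject unknown versions.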

6. Storage

  • Primary blob store: Cloud Storage. One bucket per environment per data class (melmastoon-media-prod, melmastoon-private-prod, melmastoon-archive-prod). Tenant isolation by mandatory key prefix tenants/{tenantId}/<scope>/.... Signed URLs issued only with that exact prefix.
  • Metadata + index + access control: Cloud SQL Postgres (file_storage schema). Tables: file_objects, upload_sessions, variants, scan_results, retention_policies, access_grants, quotas, outbox, inbox, audit_events. Every multi-tenant table carries tenant_id + an RLS policy <table>_tenant_isolation.
  • CDN: Google Cloud CDN in front of melmastoon-media-prod for public-read content (property photos, theme assets, tenant logos). Cache key includes tenant prefix; invalidation triggered on file.deleted.v1 and file.optimization.completed.v1.
  • Hot lookups: Memorystore (Redis 7) for signed URL caching, dedupe-by-hash lookups, and per-tenant quota counters.
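
The mandatory `tenants/{tenantId}/<scope>/...` key prefix can be illustrated with a small key builder. This is a sketch under assumptions: the allowed character set for segments and the function names are hypothetical, and the real key layout below the scope segment is not specified on this page.

```python
import re

# Assumed safe character set for a single key segment.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9._-]+")


def tenant_prefix(tenant_id: str) -> str:
    # The trailing slash matters: without it, tenant "t-1" would
    # appear to own keys that belong to tenant "t-12".
    return f"tenants/{tenant_id}/"


def build_object_key(tenant_id: str, scope: str, *rest: str) -> str:
    """Join validated segments into 'tenants/<tenant>/<scope>/<rest...>'."""
    segments = [tenant_id, scope, *rest]
    for segment in segments:
        if segment in (".", "..") or _SAFE_SEGMENT.fullmatch(segment) is None:
            raise ValueError(f"unsafe key segment: {segment!r}")
    return "tenants/" + "/".join(segments)


def is_within_tenant(key: str, tenant_id: str) -> bool:
    """True if the object key sits under the tenant's mandatory prefix."""
    return key.startswith(tenant_prefix(tenant_id))
```

Rejecting path separators and dot segments up front means a caller-supplied file name cannot escape the tenant prefix before the key is ever signed.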

7. Multi-Tenancy

  • Every row carries tenant_id. RLS policies named <table>_tenant_isolation filter against current_setting('app.tenant_id')::uuid, and each table has FORCE ROW LEVEL SECURITY set so the policies also apply to the table owner.
  • Every GCS object key starts with tenants/{tenantId}/. Upload signed URLs lock the object name pattern; a leaked URL cannot target a different tenant's prefix because the URL is signed against the exact resource path.
  • Download signed URLs include a tenant-scoped resource path and a 5-minute TTL by default (max 1 hour). Revoked URLs are blacklisted in Redis until expiry.
  • Cross-tenant access attempts emit melmastoon.file.access.denied.v1 and return 404 MELMASTOON.GENERAL.RESOURCE_NOT_FOUND, never 403, so the response does not reveal whether the object exists.
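
The download URL rules above (5-minute default TTL, 1-hour maximum, revocation held until the URL's own expiry) can be sketched as follows. An in-memory dict stands in for Redis, and clamping an over-long request to the maximum rather than rejecting it is an assumption; the page does not say which the service does. All names are hypothetical.

```python
import time
from typing import Callable, Dict, Optional

DEFAULT_TTL_SECONDS = 5 * 60   # 5-minute default
MAX_TTL_SECONDS = 60 * 60      # 1-hour ceiling


def effective_ttl_seconds(requested: Optional[int] = None) -> int:
    """Apply the default and clamp to the maximum (clamping is assumed)."""
    if requested is None:
        return DEFAULT_TTL_SECONDS
    if requested <= 0:
        raise ValueError("TTL must be positive")
    return min(requested, MAX_TTL_SECONDS)


class RevocationList:
    """In-memory stand-in for the Redis list of revoked URL signatures."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._expires_at: Dict[str, float] = {}
        self._clock = clock

    def revoke(self, signature: str, url_expires_at: float) -> None:
        # Keep the entry only as long as the URL itself could still work.
        self._expires_at[signature] = url_expires_at

    def is_revoked(self, signature: str) -> bool:
        expires_at = self._expires_at.get(signature)
        if expires_at is None:
            return False
        if expires_at <= self._clock():
            del self._expires_at[signature]  # URL expired on its own
            return False
        return True
```

Tying each entry's lifetime to the URL's expiry keeps the list bounded: with a 1-hour maximum TTL, no entry needs to live longer than an hour.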

8. Hotel-Specific Behaviors

  • Property photos are the highest-volume class (10–50 per property × N properties). Server-side optimization to WebP + AVIF in three sizes (thumb 320 px, hero 1280 px, full 1920 px) is mandatory before the photo flips to ready — bandwidth in target markets is constrained.
  • Guest ID scans uploaded at check-in (reservation-service) are highly sensitive PII. They land in the private bucket with at-rest CMEK + a pii_id_scan retention policy that enforces redaction after the configured jurisdiction window (default 30 days for AF, 90 days for FR, configurable per tenant).
  • Tenant logos and theme assets are CDN-cached with an 8 h TTL and explicitly invalidated on update.
  • Generated invoices carry a long-term tax_compliance retention policy (default 7 years; per-jurisdiction override).
  • Vendor lock reports (lock-integration-service) carry a 12-month rolling retention policy.
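
As a worked example of the pii_id_scan windows above, the sketch below computes a redaction due date from the stated defaults (30 days for AF, 90 days for FR) with an optional per-tenant override. The page does not say which event starts the window, so `window_start` is left to the caller as an assumption, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Defaults stated on this page; other jurisdictions are not listed.
ID_SCAN_REDACTION_DAYS = {"AF": 30, "FR": 90}


def id_scan_redaction_due(
    window_start: date,
    jurisdiction: str,
    tenant_override_days: Optional[int] = None,
) -> date:
    """Date on which an ID scan becomes due for redaction."""
    if tenant_override_days is not None:
        days = tenant_override_days
    elif jurisdiction in ID_SCAN_REDACTION_DAYS:
        days = ID_SCAN_REDACTION_DAYS[jurisdiction]
    else:
        # No platform-wide default is given, so refuse to guess.
        raise ValueError(f"no redaction window for jurisdiction {jurisdiction!r}")
    if days <= 0:
        raise ValueError("redaction window must be positive")
    return window_start + timedelta(days=days)
```

With a window starting on 2025-01-01, the AF default gives 2025-01-31 and the FR default gives 2025-04-01.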

9. Edge Cases (top-of-mind)

  • Download before scan completes → block with 409 MELMASTOON.FILE.SCAN_PENDING. The producer service receives file.scan.passed.v1 to flip its own status (e.g., property Photo.uploaded → ready).
  • Quota exceeded at initiate → 402 MELMASTOON.FILE.QUOTA_EXCEEDED. Separately, bucket.quota_warning.v1 is emitted when usage reaches 80 % and 95 % of the quota.
  • Resumable upload abandoned (no confirm within session TTL of 1 h) → cleanup job deletes the partial GCS object and emits upload.failed.v1 with reason: 'session_expired'.
  • CDN cache lag after delete → the invalidation request is issued synchronously, and responses carry Cache-Control: max-age=300 so that client-side copies go stale within 5 minutes, which limits the blast radius. Sensitive content is never served via CDN.
  • Signed URL leak → audit row AccessGrant records every issuance with caller, tenant, IP, UA, and resource path; revocation is implemented via Redis blacklist of the URL signature until natural expiry.
  • Cross-tenant signed URL theft → a stolen URL cannot be retargeted at another tenant's objects: the signature is bound to the tenant-prefixed resource path and to the bucket host, so a request with an altered path or host fails signature verification at GCS. The resulting access log is exported to BigQuery for SIEM correlation.
  • Large file > 50 MB → must use resumable / chunked upload flow (uploadType=resumable); the API surfaces a chunk size of 8 MiB.
  • Duplicate detection → SHA-256 of the bytes is required at confirm; if a file with the same hash already exists for the tenant + scope, the new FileObject row is created as an alias pointing to the same GCS object (refcount).
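
The duplicate-detection rule above (same hash within a tenant and scope becomes an alias with a refcount) can be sketched with a small in-memory index. This is an illustration only: the real service would keep this state in Postgres or Redis, and `DedupeIndex`, `confirm`, and `release` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class StoredBlob:
    gcs_key: str
    refcount: int = 0


class DedupeIndex:
    """Maps (tenant, scope, sha256) to the single GCS object holding the bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[Tuple[str, str, str], StoredBlob] = {}

    def confirm(self, tenant_id: str, scope: str, sha256_hex: str,
                new_gcs_key: str) -> Tuple[str, bool]:
        """Return (gcs_key to reference, is_alias) for a confirmed upload."""
        key = (tenant_id, scope, sha256_hex)
        blob = self._blobs.get(key)
        is_alias = blob is not None
        if blob is None:
            blob = StoredBlob(gcs_key=new_gcs_key)
            self._blobs[key] = blob
        blob.refcount += 1
        return blob.gcs_key, is_alias

    def release(self, tenant_id: str, scope: str, sha256_hex: str) -> bool:
        """Drop one reference; True means the GCS object can now be deleted."""
        key = (tenant_id, scope, sha256_hex)
        blob = self._blobs[key]
        blob.refcount -= 1
        if blob.refcount == 0:
            del self._blobs[key]
            return True
        return False
```

Because the tenant and scope are part of the lookup key, identical bytes uploaded by two tenants are never shared, which keeps deduplication from crossing the tenant prefix.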

10. Sync Contract

file-storage-service does not replicate to the Electron desktop SQLite store. The desktop interacts via the same signed-URL workflow over the network when online. When offline, a small queue of pending uploads is held on disk and replayed when the device reconnects. See SYNC_CONTRACT for the offline upload queue protocol and idempotency rules.
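
The offline behavior described above amounts to a queue that is replayed in order and keyed so that retries do not create duplicates. The sketch below shows that idea only; it is not the protocol defined in SYNC_CONTRACT. It keeps the queue in memory rather than on disk, and `OfflineUploadQueue`, the idempotency key format, and the `send` callback are all hypothetical.

```python
import uuid
from collections import OrderedDict
from typing import Callable, List, Optional


class OfflineUploadQueue:
    """Ordered queue of pending uploads, keyed by idempotency key."""

    def __init__(self) -> None:
        self._pending: "OrderedDict[str, str]" = OrderedDict()  # key -> local path

    def enqueue(self, local_path: str, idempotency_key: Optional[str] = None) -> str:
        key = idempotency_key or str(uuid.uuid4())
        # Re-enqueueing an existing key is a no-op, so retries do not duplicate.
        self._pending.setdefault(key, local_path)
        return key

    def replay(self, send: Callable[[str, str], None]) -> List[str]:
        """Send pending uploads in order; stop at the first connection failure."""
        completed: List[str] = []
        for key, path in list(self._pending.items()):
            try:
                send(key, path)
            except ConnectionError:
                break  # still offline; keep this and later entries queued
            del self._pending[key]
            completed.append(key)
        return completed

    def __len__(self) -> int:
        return len(self._pending)
```

An entry is removed only after `send` returns, so a crash or dropped connection mid-replay leaves it queued, and the idempotency key lets the server side recognize a repeat.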

11. AI Touchpoints

All AI calls are routed through ai-orchestrator-service (no direct provider calls). Capabilities:

  • file.image.safety_classify — pre-scan classifier on uploaded photos to flag explicit / violent content (HITL queue for tenant-policy review).
  • file.image.alt_text_draft — accessibility alt text in tenant locales (HITL accept-only on property-service photos and theme-config-service assets).
  • file.image.exif_redact — strip GPS + device EXIF on pii_id_scan and any image flagged as contains_guest_likeness=true.
  • file.id_scan.ocr_redact — for guest ID scans, OCR + selective redaction of non-essential fields (HITL gated; required by reservation-service workflow).

Every AI artifact persists with AIProvenance. See AI_INTEGRATION.

12. Failure Posture

  • GCS unavailable → upload returns 503 MELMASTOON.FILE.PROVIDER_UNAVAILABLE with retryAfter. Producers must not block their own write paths on a file upload.
  • Scan worker backlog → scans queue with backpressure on Pub/Sub; reads of unscanned files block (409 MELMASTOON.FILE.SCAN_PENDING). Quarantined files stay quarantined until manual override.
  • Optimizer failure → the original full variant remains usable; thumb/hero requests fall back to full with on-the-fly CDN resizing as a degraded mode.
  • CDN invalidation failure → re-queued with exponential backoff; alert if backlog > 50.
  • Retention sweeper failure → alert immediately; this is a compliance boundary.
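
The CDN invalidation retry rule above (exponential backoff, alert when the backlog exceeds 50) can be sketched as two small functions. The base delay, the cap, and the absence of jitter are assumptions chosen for illustration; only the backlog threshold of 50 comes from this page.

```python
INVALIDATION_BACKLOG_ALERT = 50  # alert when backlog is strictly greater


def backoff_delay_seconds(attempt: int, base_s: float = 1.0,
                          cap_s: float = 300.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(cap_s, base_s * (2 ** attempt))


def should_alert(backlog_size: int) -> bool:
    """True once the re-queue backlog exceeds the alert threshold."""
    return backlog_size > INVALIDATION_BACKLOG_ALERT
```

With these defaults the delays run 1 s, 2 s, 4 s, 8 s and so on up to the 300 s cap. A production version would usually add random jitter so that many failed invalidations do not retry in lockstep.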

13. Consumers (SLO Perspective)

| Consumer | Read pattern | Latency target | Staleness tolerance |
| --- | --- | --- | --- |
| bff-tenant-booking-service (photo URLs) | hot read via CDN | p95 80 ms (CDN) | up to 8 h CDN TTL |
| bff-consumer-service (search photos) | hot read via CDN | p95 80 ms | up to 8 h CDN TTL |
| bff-backoffice-service (signed URLs) | per-request signed URL | p95 200 ms | n/a (signed) |
| notification-service (PDF attachments) | per-request signed URL | p95 200 ms | n/a |
| billing-service (invoice store + re-issue) | metadata + signed URL | p95 200 ms | n/a |
| reservation-service (ID scans) | per-request signed URL with strict scope | p95 200 ms | n/a |
| property-service (photo lifecycle) | event-driven on scan.passed.v1 | n/a | ≤ 30 s |

14. Ownership

  • Team: Platform Infrastructure squad.
  • Escalation: platform-infra-oncall → platform-oncall.
  • Reviewers: platform architect, security reviewer (PII handling, bucket IAM, signed URL design), compliance reviewer (retention windows, GDPR erasure path).