file-storage-service
Companion: README · 02 Enterprise Architecture §3 · 05 API Design · 06 Data Models · 07 Security & Tenancy · Bundle
1. Identity
| Field | Value |
|---|---|
| Service name | file-storage-service |
| Workspace package | @ghasi/service-file-storage |
| Bounded context | Storage (Generic) |
| Domain class | Generic / supporting |
| Owning team | Platform Infrastructure |
| On-call tier | Tier 2 (data-loss class) |
| Phase | Phase 0 (foundation; required for property photos, invoice PDFs, ID scans, theme assets) |
2. Purpose
file-storage-service is the single platform API for file upload, download, and lifecycle management. It encapsulates Google Cloud Storage (the byte store) behind a tenant-scoped, signed-URL workflow with virus scanning, image optimization, retention enforcement, GDPR erasure, audited access, and per-tenant prefix isolation. Every other service that handles binary content (property-service for photos, notification-service for invoice PDF attachments, billing-service for invoice and receipt scans, reservation-service for guest ID scans, lock-integration-service for vendor reports, tenant-service for logos, theme-config-service for theme assets) goes through this service rather than talking to GCS directly.
It does not generate documents (invoices come from billing-service, notifications from notification-service), it does not decide what to do with files (consumers own that), and it does not own theme tokens or property metadata. It owns bytes, metadata about bytes, and the policies that govern bytes.
3. Aggregates Owned
| Aggregate | Lifecycle | ID prefix |
|---|---|---|
| FileObject | initiated → uploading → uploaded → scanning → ready → quarantined / archived → purged | med_ |
| Bucket (logical, maps 1:1 to a GCS bucket + tenant prefix) | active → draining → archived | bkt_ |
| UploadSession | open → completed → expired / aborted | ups_ |
| Variant (image / video transcode) | pending → ready → failed | var_ |
| ScanResult | pending → passed / failed | scn_ |
| RetentionPolicy | draft → active → superseded | ret_ |
| AccessGrant (signed URL issuance audit row) | issued → expired / revoked | grt_ |
med_ is the canonical platform prefix for any media reference held by other services (declared in NAMING.md).
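To make the FileObject lifecycle concrete, here is a minimal sketch of a transition guard. The status names come from the table above; the transition table itself (which states branch where, and whether a quarantined file may be purged directly) is one reading of the lifecycle column and should be treated as an assumption, as are the names `FileStatus`, `ALLOWED`, `can_transition`, and `transition`.

```python
from enum import Enum


class FileStatus(str, Enum):
    INITIATED = "initiated"
    UPLOADING = "uploading"
    UPLOADED = "uploaded"
    SCANNING = "scanning"
    READY = "ready"
    QUARANTINED = "quarantined"
    ARCHIVED = "archived"
    PURGED = "purged"


S = FileStatus

# Assumed transition table: the branch points are one interpretation of the
# lifecycle column, not the service's confirmed rules.
ALLOWED = {
    S.INITIATED: {S.UPLOADING},
    S.UPLOADING: {S.UPLOADED},
    S.UPLOADED: {S.SCANNING},
    S.SCANNING: {S.READY, S.QUARANTINED},  # scan passed / scan failed
    S.READY: {S.ARCHIVED},                 # soft delete
    S.QUARANTINED: {S.READY, S.PURGED},    # manual override, or purge (assumed)
    S.ARCHIVED: {S.READY, S.PURGED},       # restore, or retention expiry
    S.PURGED: set(),                       # terminal state
}


def can_transition(current: FileStatus, target: FileStatus) -> bool:
    """Return True if the assumed table allows current -> target."""
    return target in ALLOWED[current]


def transition(current: FileStatus, target: FileStatus) -> FileStatus:
    """Return the new status, or raise ValueError on an illegal move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than scattered `if` checks makes it easy to review against the lifecycle column when either one changes.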
4. Primary APIs
REST under /api/v1 (full surface in API_CONTRACTS):
- `POST /api/v1/files/uploads`: initiate upload, returns signed PUT URL
- `POST /api/v1/files/uploads/{ups}/confirm`: confirm bytes uploaded; triggers scan + optimization
- `POST /api/v1/files/uploads/{ups}/abort`: abandon partial upload (cleanup)
- `GET /api/v1/files/{med}`: metadata
- `POST /api/v1/files/{med}/download-url`: issue scoped, time-bound signed download URL
- `GET /api/v1/files/{med}/variants`: list image/video variants
- `DELETE /api/v1/files/{med}`: soft delete (status → `archived`)
- `POST /api/v1/files/{med}/restore`: restore within retention window
- `GET /api/v1/files/{med}/access-log`: audited access history
- `POST /api/v1/files/erasure`: GDPR erasure for a guest or tenant scope
- `GET /api/v1/files/quotas`: per-tenant usage and quota
- `POST /internal/v1/files/scan-callback`: internal callback from scan workers (mTLS)
- `POST /internal/v1/files/optimize-callback`: internal callback from Cloud Run optimizer (mTLS)
- `POST /internal/v1/files/cdn/invalidate`: internal CDN invalidation (mTLS)
5. Events
Published (melmastoon.file.*.v1):
- `melmastoon.file.upload.initiated.v1` · `melmastoon.file.upload.completed.v1` · `melmastoon.file.upload.failed.v1`
- `melmastoon.file.scan.requested.v1` · `melmastoon.file.scan.passed.v1` · `melmastoon.file.scan.failed.v1` (file is quarantined)
- `melmastoon.file.optimization.completed.v1`
- `melmastoon.file.deleted.v1`
- `melmastoon.file.access.denied.v1` (cross-tenant or revoked URL attempt)
- `melmastoon.file.retention.expired.v1`
- `melmastoon.file.erasure.completed.v1`
- `melmastoon.file.bucket.quota_warning.v1`
Consumed:
- `melmastoon.tenant.guest.erasure_requested.v1`: purge all files tagged with the guest's `gst_` ID.
- `melmastoon.tenant.deleted.v1`: cascade purge of every file under the tenant's prefix (after legal retention window).
- `melmastoon.property.photo.removed.v1`: soft-delete the underlying `FileObject` for that photo.
- `melmastoon.billing.invoice.issued.v1`: apply the long-term `tax_compliance` retention policy to the generated PDF (informational; the producer also passes the policy hint at upload time).
- `melmastoon.reservation.checked_out.v1`: ID scans transition into the redaction-eligible bucket per policy.
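The consumed events can be pictured as a routing table in front of an idempotent inbox. The sketch below is illustrative only: the event type strings are the ones listed above, but the action names, the `id` and `type` envelope fields, and the in-memory `Inbox` (standing in for the `inbox` table) are assumptions.

```python
from typing import Dict, Optional, Set

# Event type -> hypothetical action name. The real handlers live in the
# service; these strings only label what each event is documented to trigger.
ROUTES: Dict[str, str] = {
    "melmastoon.tenant.guest.erasure_requested.v1": "purge_guest_files",
    "melmastoon.tenant.deleted.v1": "schedule_tenant_purge",
    "melmastoon.property.photo.removed.v1": "soft_delete_file",
    "melmastoon.billing.invoice.issued.v1": "apply_tax_compliance_retention",
    "melmastoon.reservation.checked_out.v1": "mark_id_scans_redaction_eligible",
}


class Inbox:
    """In-memory stand-in for an inbox table keyed by event id (assumed)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def first_delivery(self, event_id: str) -> bool:
        """Record the id; return False if it was already processed."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


def route(event: dict, inbox: Inbox) -> Optional[str]:
    """Return the action for an event, or None if unknown or a redelivery."""
    action = ROUTES.get(event["type"])
    if action is None:
        return None  # not an event this service consumes
    if not inbox.first_delivery(event["id"]):
        return None  # duplicate delivery: already handled
    return action
```

For example, routing the same `melmastoon.tenant.deleted.v1` envelope twice yields `schedule_tenant_purge` the first time and `None` the second, which is the behavior an at-least-once broker requires of its consumers.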
6. Storage
- Primary blob store: Cloud Storage. One bucket per environment per data class (`melmastoon-media-prod`, `melmastoon-private-prod`, `melmastoon-archive-prod`). Tenant isolation by mandatory key prefix `tenants/{tenantId}/<scope>/...`. Signed URLs issued only with that exact prefix.
- Metadata + index + access control: Cloud SQL Postgres (`file_storage` schema). Tables: `file_objects`, `upload_sessions`, `variants`, `scan_results`, `retention_policies`, `access_grants`, `quotas`, `outbox`, `inbox`, `audit_events`. Every multi-tenant table carries `tenant_id` + an RLS policy `<table>_tenant_isolation`.
- CDN: Google Cloud CDN in front of `melmastoon-media-prod` for `public-read` content (property photos, theme assets, tenant logos). Cache key includes tenant prefix; invalidation triggered on `file.deleted.v1` and `file.optimization.completed.v1`.
- Hot lookups: Memorystore (Redis 7) for signed URL caching, dedupe-by-hash lookups, and per-tenant quota counters.
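The mandatory key prefix is easiest to enforce if object keys are only ever produced by one builder. The following is a minimal sketch of such a builder; the function names, the allowed character set for path segments, and the example scope `photos` are assumptions, while the `tenants/{tenantId}/<scope>/...` layout and the UUID tenant ID come from this document.

```python
import re
import uuid

# Assumed segment alphabet; the real service may allow a different set.
_SEGMENT = re.compile(r"[A-Za-z0-9._-]+")


def build_object_key(tenant_id: str, scope: str, *parts: str) -> str:
    """Build tenants/{tenantId}/<scope>/... and reject unsafe segments."""
    tenant = str(uuid.UUID(tenant_id))  # canonical form; ValueError if not a UUID
    segments = [scope, *parts]
    for segment in segments:
        # Reject empty segments, slashes, and path traversal attempts.
        if segment in (".", "..") or not _SEGMENT.fullmatch(segment):
            raise ValueError(f"unsafe key segment: {segment!r}")
    return "/".join(["tenants", tenant, *segments])


def belongs_to_tenant(key: str, tenant_id: str) -> bool:
    """True only if the key sits under the tenant's exact prefix."""
    return key.startswith(f"tenants/{uuid.UUID(tenant_id)}/")
```

Normalizing the tenant ID through `uuid.UUID` before comparing prefixes avoids a case-sensitivity mismatch between an upper-case ID in a request and the lower-case prefix in the bucket.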
7. Multi-Tenancy
- Every row carries `tenant_id`; RLS policies named `<table>_tenant_isolation` filter against `current_setting('app.tenant_id')::uuid` and are `FORCE`d.
- Every GCS object key starts with `tenants/{tenantId}/`. Upload signed URLs lock the object name pattern; a leaked URL cannot target a different tenant's prefix because the URL is signed against the exact resource path.
- Download signed URLs include a tenant-scoped resource path and a 5-minute TTL by default (max 1 hour). Revoked URLs are blacklisted in Redis until expiry.
- Cross-tenant access attempts produce `melmastoon.file.access.denied.v1` and a `MELMASTOON.GENERAL.RESOURCE_NOT_FOUND` (404), never 403, to avoid leaking existence.
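The download rules above reduce to three small checks: pick a TTL, consult the revocation list, and answer cross-tenant requests with a 404. The sketch below illustrates them with an in-memory dictionary in place of Redis. The 5-minute default, 1-hour maximum, and error code are from this document; clamping an over-long request (rather than rejecting it), and all function and class names, are assumptions.

```python
from typing import Dict, Optional, Tuple

DEFAULT_TTL_S = 5 * 60   # 5-minute default
MAX_TTL_S = 60 * 60      # 1-hour ceiling


def effective_ttl(requested_s: Optional[int] = None) -> int:
    """Resolve the TTL for a download URL, in seconds."""
    if requested_s is None:
        return DEFAULT_TTL_S
    if requested_s <= 0:
        raise ValueError("TTL must be positive")
    return min(requested_s, MAX_TTL_S)  # clamping is an assumption


class RevocationList:
    """In-memory stand-in for the Redis blacklist of URL signatures."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}  # signature -> URL expiry time

    def revoke(self, signature: str, url_expires_at: float) -> None:
        self._revoked[signature] = url_expires_at

    def is_revoked(self, signature: str, now: float) -> bool:
        expires_at = self._revoked.get(signature)
        if expires_at is None:
            return False
        if now >= expires_at:
            # The URL has expired on its own, so the entry can be dropped
            # (Redis would do this with a key TTL).
            del self._revoked[signature]
            return False
        return True


def check_access(file_tenant_id: str, caller_tenant_id: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, error code); cross-tenant reads look like a miss."""
    if file_tenant_id != caller_tenant_id:
        return 404, "MELMASTOON.GENERAL.RESOURCE_NOT_FOUND"
    return 200, None
```

Storing each revoked signature with the URL's own expiry keeps the blacklist bounded: no entry needs to outlive the URL it blocks.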
8. Hotel-Specific Behaviors
- Property photos are the highest-volume class (10–50 per property × N properties). Server-side optimization to WebP + AVIF in three sizes (`thumb` 320 px, `hero` 1280 px, `full` 1920 px) is mandatory before the photo flips to `ready`, because bandwidth in target markets is constrained.
- Guest ID scans uploaded at check-in (`reservation-service`) are highly sensitive PII. They land in the `private` bucket with at-rest CMEK + a `pii_id_scan` retention policy that enforces redaction after the configured jurisdiction window (default 30 days for AF, 90 days for FR, configurable per tenant).
- Tenant logos and theme assets are CDN-cached with an 8 h TTL and explicitly invalidated on update.
- Generated invoices carry a long-term `tax_compliance` retention policy (default 7 years; per-jurisdiction override).
- Vendor lock reports (`lock-integration-service`) carry a 12-month rolling retention policy.
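The retention windows above are simple date arithmetic, sketched here under stated assumptions. The window lengths (30 and 90 days, 7 years, 12 months) come from this document; the function names, the choice of anchor date for each window, and the month-end clamping rule are illustrative. Jurisdictions other than AF and FR deliberately raise `KeyError`, since no defaults are given for them.

```python
import calendar
from datetime import date, timedelta
from typing import Optional

ID_SCAN_REDACTION_DAYS = {"AF": 30, "FR": 90}  # documented defaults only
TAX_COMPLIANCE_YEARS = 7
LOCK_REPORT_MONTHS = 12


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping to the last day of a shorter month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def id_scan_redaction_due(anchor: date, jurisdiction: str,
                          tenant_override_days: Optional[int] = None) -> date:
    """Date an ID scan becomes due for redaction.

    `anchor` is whichever date the policy counts from (an assumption here).
    """
    days = tenant_override_days
    if days is None:
        days = ID_SCAN_REDACTION_DAYS[jurisdiction]  # KeyError if unknown
    return anchor + timedelta(days=days)


def invoice_retention_ends(issued: date, years: int = TAX_COMPLIANCE_YEARS) -> date:
    """End of the tax_compliance window; `years` is the jurisdiction override."""
    return add_months(issued, 12 * years)


def lock_report_retention_ends(created: date) -> date:
    """End of the 12-month rolling window for a vendor lock report."""
    return add_months(created, LOCK_REPORT_MONTHS)
```

The clamping in `add_months` matters for documents issued on 29 February: seven years later the window ends on 28 February rather than failing on a non-existent date.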
9. Edge Cases (top-of-mind)
- Download before scan completes → block with `409 MELMASTOON.FILE.SCAN_PENDING`. The producer service receives `file.scan.passed.v1` to flip its own status (e.g., property `Photo.uploaded → ready`).
- Quota exceeded at initiate → `402 MELMASTOON.FILE.QUOTA_EXCEEDED`; emit `bucket.quota_warning.v1` at 80 % and 95 %.
- Resumable upload abandoned (no confirm within session TTL of 1 h) → cleanup job deletes the partial GCS object and emits `upload.failed.v1` with `reason: 'session_expired'`.
- CDN cache lag after delete → invalidate request issued synchronously; responses carry `Cache-Control: max-age=300` so that copies held in client caches expire within 5 minutes, which limits the blast radius. Sensitive content is never served via CDN.
- Signed URL leak → audit row `AccessGrant` records every issuance with caller, tenant, IP, UA, and resource path; revocation is implemented via Redis blacklist of the URL signature until natural expiry.
- Cross-tenant signed URL theft → impossible by construction: the signature is bound to a tenant-prefixed resource path, and the URL's `Host` validates against the bucket. A theft attempt fails at GCS, and the resulting access log is exported to BigQuery for SIEM correlation.
- Large file > 50 MB → must use resumable / chunked upload flow (`uploadType=resumable`); the API surfaces a chunk size of 8 MiB.
- Duplicate detection → SHA-256 of the bytes is required at confirm; if a file with the same hash already exists for the tenant + scope, the new `FileObject` row is created as an alias pointing to the same GCS object (refcount).
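Two of the edge cases above, the resumable-upload threshold and hash-based duplicate detection, can be sketched in a few lines. Everything here is illustrative: the class and function names are hypothetical, "50 MB" is read as 50 MiB (an assumption), and the in-memory dictionary stands in for whatever index the service actually uses.

```python
import hashlib
from typing import Dict, List, Tuple

RESUMABLE_THRESHOLD = 50 * 1024 * 1024  # "50 MB" read as 50 MiB (assumption)
CHUNK_SIZE = 8 * 1024 * 1024            # 8 MiB chunk size


def upload_plan(size_bytes: int) -> Tuple[str, int]:
    """Return (upload type, number of requests) for a file of this size."""
    if size_bytes <= RESUMABLE_THRESHOLD:
        return ("single", 1)
    chunks = -(-size_bytes // CHUNK_SIZE)  # ceiling division
    return ("resumable", chunks)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class DedupeIndex:
    """Maps (tenant, scope, hash) to one stored object plus a refcount."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str, str], List] = {}

    def confirm(self, tenant_id: str, scope: str, digest: str,
                new_key: str) -> Tuple[str, bool]:
        """Return (object key to reference, whether this row is an alias)."""
        entry = self._entries.get((tenant_id, scope, digest))
        if entry is None:
            self._entries[(tenant_id, scope, digest)] = [new_key, 1]
            return new_key, False
        entry[1] += 1  # another FileObject row now points at the same bytes
        return entry[0], True

    def refcount(self, tenant_id: str, scope: str, digest: str) -> int:
        entry = self._entries.get((tenant_id, scope, digest))
        return entry[1] if entry else 0
```

Because the index key includes the tenant and scope, identical bytes uploaded by two tenants are stored twice, which keeps deduplication from becoming a cross-tenant side channel.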
10. Sync Contract
file-storage-service does not replicate to the Electron desktop SQLite store. The desktop interacts via the same signed-URL workflow over the network when online. When offline, a small queue of pending uploads is held on disk and replayed when the device reconnects. See SYNC_CONTRACT for the offline upload queue protocol and idempotency rules.
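As a rough illustration of the offline queue described above, the sketch below persists pending uploads to a JSON file and replays them in order once a sender succeeds. It is not the protocol defined in SYNC_CONTRACT: the class name, the item fields, the use of a UUID as idempotency key, and the choice of `ConnectionError` as the "still offline" signal are all assumptions.

```python
import json
import os
import uuid
from typing import Callable, Dict, List


class OfflineUploadQueue:
    """Disk-backed FIFO of pending uploads, each with an idempotency key."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.items: List[Dict[str, str]] = []
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.items = json.load(fh)

    def _save(self) -> None:
        # Write to a temp file and rename so a crash never leaves a torn file.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.items, fh)
        os.replace(tmp, self.path)

    def enqueue(self, local_file: str, scope: str) -> str:
        """Queue a file; the key is fixed now so retries reuse it."""
        key = str(uuid.uuid4())
        self.items.append(
            {"idempotency_key": key, "local_file": local_file, "scope": scope}
        )
        self._save()
        return key

    def replay(self, send: Callable[[Dict[str, str]], None]) -> int:
        """Send queued items in order; stop at the first connection failure."""
        sent = 0
        while self.items:
            try:
                send(self.items[0])
            except ConnectionError:
                break  # still offline: keep the item for the next attempt
            self.items.pop(0)
            self._save()
            sent += 1
        return sent
```

Assigning the idempotency key at enqueue time, not at send time, is what lets the server treat a replayed request as a retry if the device lost connectivity after the first attempt reached it.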
11. AI Touchpoints
All AI calls are routed through ai-orchestrator-service (no direct provider calls). Capabilities:
- `file.image.safety_classify`: pre-scan classifier on uploaded photos to flag explicit / violent content (HITL queue for tenant-policy review).
- `file.image.alt_text_draft`: accessibility alt text in tenant locales (HITL accept-only on `property-service` photos and `theme-config-service` assets).
- `file.image.exif_redact`: strip GPS + device EXIF on `pii_id_scan` and any image flagged as `contains_guest_likeness=true`.
- `file.id_scan.ocr_redact`: for guest ID scans, OCR + selective redaction of non-essential fields (HITL gated; required by `reservation-service` workflow).
Every AI artifact persists with AIProvenance. See AI_INTEGRATION.
12. Failure Posture
- GCS unavailable → upload returns `503 MELMASTOON.FILE.PROVIDER_UNAVAILABLE` with `retryAfter`. Producers must not block their own write paths on a file upload.
- Scan worker backlog → scans queue with backpressure on Pub/Sub; reads of unscanned files block (`409 SCAN_PENDING`). Quarantined files stay quarantined until manual override.
- Optimizer failure → the original `full` variant remains usable; `thumb` / `hero` requests fall back to `full` with on-the-fly CDN resizing as a degraded mode.
- CDN invalidation failure → re-queued with exponential backoff; alert if backlog > 50.
- Retention sweeper failure → alert immediately; this is a compliance boundary.
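The CDN invalidation retry rule can be sketched as exponential backoff with jitter plus a backlog check. Only the "exponential backoff" behavior and the backlog threshold of 50 come from this document; the 1-second base, 5-minute cap, full-jitter strategy, and function names are assumptions chosen for illustration.

```python
import random
from typing import Callable

BASE_DELAY_S = 1.0             # assumed starting delay
MAX_DELAY_S = 300.0            # assumed cap
BACKLOG_ALERT_THRESHOLD = 50   # alert if backlog > 50


def backoff_ceiling(attempt: int) -> float:
    """Upper bound of the delay before retry number `attempt` (0-based)."""
    return min(MAX_DELAY_S, BASE_DELAY_S * (2 ** attempt))


def backoff_delay(attempt: int, rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay: a random fraction of the exponential ceiling."""
    return backoff_ceiling(attempt) * rng()


def should_alert(backlog_size: int) -> bool:
    """True once the re-queue backlog exceeds the documented threshold."""
    return backlog_size > BACKLOG_ALERT_THRESHOLD
```

Jitter spreads retries out so that a burst of failed invalidations does not hit the CDN API again in lockstep; passing `rng` explicitly keeps the function deterministic under test.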
13. Consumers (SLO Perspective)
| Consumer | Read pattern | Latency target | Staleness tolerance |
|---|---|---|---|
| bff-tenant-booking-service (photo URLs) | hot read via CDN | p95 80 ms (CDN) | up to 8 h CDN TTL |
| bff-consumer-service (search photos) | hot read via CDN | p95 80 ms | up to 8 h CDN TTL |
| bff-backoffice-service (signed URLs) | per-request signed URL | p95 200 ms | n/a (signed) |
| notification-service (PDF attachments) | per-request signed URL | p95 200 ms | n/a |
| billing-service (invoice store + re-issue) | metadata + signed URL | p95 200 ms | n/a |
| reservation-service (ID scans) | per-request signed URL with strict scope | p95 200 ms | n/a |
| property-service (photo lifecycle) | event-driven on scan.passed.v1 | n/a | ≤ 30 s |
14. Ownership
- Team: Platform Infrastructure squad.
- Escalation: platform-infra-oncall → platform-oncall.
- Reviewers: platform architect, security reviewer (PII handling, bucket IAM, signed URL design), compliance reviewer (retention windows, GDPR erasure path).
15. Related Documents
- SERVICE_OVERVIEW, DOMAIN_MODEL, APPLICATION_LOGIC, API_CONTRACTS, EVENT_SCHEMAS, DATA_MODEL, SYNC_CONTRACT, AI_INTEGRATION, SECURITY_MODEL, OBSERVABILITY, TESTING_STRATEGY, DEPLOYMENT_TOPOLOGY, FAILURE_MODES, LOCAL_DEV_SETUP, SERVICE_READINESS, SERVICE_RISK_REGISTER, MIGRATION_PLAN.
- 02 Enterprise Architecture, 04 Event-Driven Architecture, 05 API Design, 06 Data Models, 07 Security & Tenancy, Naming, Error Codes.