file-storage-service — AI_INTEGRATION

Companion: SERVICE_OVERVIEW · APPLICATION_LOGIC §3.2 · SECURITY_MODEL · 08 AI Architecture · .cursor/rules/95-ai.mdc

This service performs no AI inference itself. Every AI call is delegated to ai-orchestrator-service over an internal HTTP port (AIClient); the orchestrator handles model routing (Vertex AI Gemini for cloud, ONNX Runtime for edge), prompt safety, output schema validation, budget enforcement, HITL queues, and AIProvenance stamping. file-storage-service consumes orchestrated results, persists them on the relevant FileObject row, and emits the corresponding domain events.

The four AI surfaces used by file-storage-service are: (1) image safety classification, (2) alt-text generation, (3) PII redaction on ID scans, (4) EXIF / metadata redaction validation. All four are assistive: they may fail, refuse, or be skipped without breaking the upload flow. The only AI-gated state transition is the move to quarantined on a hard safety violation (a borderline verdict holds the file for HITL review, see §5).

1. AI surfaces in this service

| # | Purpose | Trigger | Model class | HITL | Requires AIProvenance |
|---|---|---|---|---|---|
| 1 | Image safety classifier | After `scan.passed.v1` for `property_photo`, `theme_asset`, `tenant_logo`, `notification_attachment` | Vision classifier (Gemini 2.x or ONNX fallback) | Auto-quarantine on `unsafe`; HITL for `borderline` | yes |
| 2 | Alt-text generation | After safety pass for `property_photo` | Vision-language LLM | Only below threshold (auto-apply if confidence ≥ 0.75; otherwise queued for staff review) | yes |
| 3 | OCR + PII redaction | Retention sweep for `pii_id_scan` policy at `redactionAfterDays` | DLP / Gemini multimodal | yes (Tenant.Compliance preview) | yes |
| 4 | EXIF / metadata sanitization | Optimizer pipeline | Deterministic library (sharp); LLM only to flag suspicious EXIF leaks | none | no (deterministic) |

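The sanitization in surface 4 is deterministic and makes no AI call; it is listed for completeness because the optimizer worker rejects images with embedded GPS or camera-owner metadata in PII-sensitive scopes. The only AI involvement is the advisory describeExifLeak call (§3.4), which changes no state.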

2. The orchestrator port

// application/ports/ai-client.ts
export interface AIClient {
  classifyImageSafety(req: ClassifyImageSafetyReq): Promise<ClassifyImageSafetyRes>;
  generateAltText(req: GenerateAltTextReq): Promise<GenerateAltTextRes>;
  ocrRedact(req: OcrRedactReq): Promise<OcrRedactRes>;
  describeExifLeak(req: DescribeExifLeakReq): Promise<DescribeExifLeakRes>;
}

export type ClassifyImageSafetyReq = {
  tenantId: TenantId;
  fileObjectId: FileObjectId;
  signedReadUrl: string;          // short-lived; orchestrator fetches bytes
  scope: Scope;
  preferredLocale: Locale;
};
export type ClassifyImageSafetyRes = {
  verdict: 'safe' | 'borderline' | 'unsafe';
  categories: { name: 'sexual' | 'violent' | 'hate' | 'illegal' | 'gore' | 'self_harm'; score: number }[];
  reason: string | null;          // localized
  hitlDecisionId: DecisionId | null;
  provenance: AIProvenance;
};

The orchestrator returns AIProvenance shaped per @ghasi/contracts-melmastoon:

export type AIProvenance = {
  decisionId?: DecisionId;        // dec_<ULID>
  modelRef: string;               // 'vertex/gemini-2.5-flash@2026-04-01'
  promptHash: string;             // sha256 of (system + user template id)
  inputTokens: number;
  outputTokens: number;
  costMicroUsd: number;
  latencyMs: number;
  safety: { preCheck: 'pass' | 'fail'; postCheck: 'pass' | 'fail' };
  hitl: { required: boolean; status?: 'pending' | 'approved' | 'rejected' };
  routing: { tier: 'edge' | 'cloud'; reason: string };
  createdAt: string;
};

Persisted as file_objects.ai_provenance jsonb. Missing provenance on an AI-derived attribute fails the MELMASTOON.AI.PROVENANCE_MISSING guard at the ORM layer.
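
As a rough illustration of the guard described above, the sketch below rejects an AI-derived write that carries no provenance. The `ProvenanceMissingError` class, the `assertProvenance` helper, and the trimmed `AIProvenanceLike` type are hypothetical names for this example. Only the error code string and the field names come from this document, and the real guard sits in the ORM layer rather than in a free function.

```typescript
// Trimmed stand-in for AIProvenance: only the fields this sketch inspects.
type AIProvenanceLike = {
  modelRef: string;
  promptHash: string;
  routing: { tier: 'edge' | 'cloud'; reason: string };
  createdAt: string;
};

// Hypothetical error type carrying the guard code named in this document.
class ProvenanceMissingError extends Error {
  readonly code = 'MELMASTOON.AI.PROVENANCE_MISSING';
  constructor(attribute: string) {
    super(`AI-derived attribute "${attribute}" has no provenance`);
    this.name = 'ProvenanceMissingError';
  }
}

// Throws unless provenance is present and names a model and a prompt hash.
// What counts as "minimally well-formed" is an assumption of this sketch.
function assertProvenance(
  attribute: string,
  provenance: AIProvenanceLike | null | undefined,
): AIProvenanceLike {
  if (
    !provenance ||
    provenance.modelRef.trim() === '' ||
    provenance.promptHash.trim() === ''
  ) {
    throw new ProvenanceMissingError(attribute);
  }
  return provenance;
}
```

A caller would run `assertProvenance('altText', res.provenance)` before persisting the attribute, so the write and its provenance either land together or not at all.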

3. Prompt templates

Templates live in services/file-storage-service/contracts/ai/prompts/*.md and are versioned by hash. Each carries a template_id that becomes part of promptHash.

3.1 `image-safety.v1`

SYSTEM: You are a hotel-asset image-safety classifier. Categorize the image strictly into:
  safe       — appropriate for a hotel public listing or tenant booking site.
  borderline — minor concerns (e.g., partial nudity that may be artistic, ambiguous violence).
  unsafe     — sexual, gore, hate symbols, illegal activity, child safety violations.

Respond with ONLY a JSON object matching the schema:
  { "verdict": "safe"|"borderline"|"unsafe",
    "categories": [{"name": <enum>, "score": <0..1>}],
    "reason": <short string in {locale}> }

Hotel context: this image is being uploaded as a property photo for a hotel listing. Bedrooms, bathrooms, lobbies, food, exterior, and pools are expected.

Routing: cloud (Gemini 2.x vision); fallback to edge ONNX safety model on cloud unavailability with a degraded_mode=true hint propagated to the consumer.

3.2 `alt-text.v1`

SYSTEM: You write a single-sentence accessible alt text for a hotel listing photo.
Constraints:
  - Max 140 characters.
  - Describe the visual subject: room type, view, time of day, distinctive feature.
  - Do NOT mention the brand, the photographer, or non-visible facts.
  - Output language: {locale}.

Output ONLY a JSON object: { "altText": <string>, "confidence": <0..1>, "tags": [<short keyword>...] }

Routing: cloud preferred (better captions); edge fallback yields shorter generic captions.

The auto-apply threshold is confidence ≥ 0.75. Below the threshold, the row is queued in ai-orchestrator-service's HITL queue with decisionRef stamped on the FileObject; staff review it in backoffice, and on approval the alt text is written via applyAIAltText.
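
The threshold rule can be sketched as a small pure function. The `AltTextResult` and `AltTextDecision` types and the `decideAltText` name are illustrative assumptions; only the 0.75 threshold and the apply-or-queue behaviour come from this document. Treating a non-numeric confidence as "queue for review" is a choice made for this sketch.

```typescript
// Subset of the alt-text.v1 output used by this sketch.
type AltTextResult = { altText: string; confidence: number };

type AltTextDecision =
  | { action: 'apply'; altText: string }
  | { action: 'queue_for_review' };

// Threshold stated in this document.
const AUTO_APPLY_THRESHOLD = 0.75;

// Applies the alt text when confidence meets the threshold; otherwise the
// result goes to staff review. NaN fails the comparison and is queued.
function decideAltText(res: AltTextResult): AltTextDecision {
  if (res.confidence >= AUTO_APPLY_THRESHOLD) {
    return { action: 'apply', altText: res.altText };
  }
  return { action: 'queue_for_review' };
}
```

Keeping the decision separate from the write makes the boundary case (exactly 0.75) easy to pin down in a unit test.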

3.3 `id-scan-redact.v1`

SYSTEM: You receive an ID document scan (passport / national ID / driver licence). Identify regions containing PII fields:
  - given name, family name, date of birth, document number, MRZ (machine-readable zone), address, photo, signature.

Output a redaction map: list of bounding boxes (x,y,width,height in pixels) and the field type for each. Do NOT transcribe the values themselves.
Schema: { "boxes": [ { "field": <enum>, "x": int, "y": int, "w": int, "h": int, "confidence": <0..1> } ] }

The redaction map is then applied deterministically by the optimizer worker (sharp + black rectangle composite). The redacted bytes become a new FileObject with aliasOfFileId = original.id and tags += ['redacted','pii_redacted']. The original is hard-purged (subject to legal hold). The downstream reservation-service consumer receives a melmastoon.file.optimization.completed.v1 event and updates Guest.idScanRef to the redacted file id.
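
To make the deterministic step concrete, here is a minimal sketch that blacks out each box of a redaction map. It deliberately swaps the sharp composite named above for direct writes into a raw RGBA buffer, so the example stays dependency-free. The `RedactionBox` type mirrors the schema in the template, while `applyRedactionBoxes` and the clamping behaviour for out-of-bounds boxes are assumptions of this sketch.

```typescript
// One entry of the redaction map, mirroring the id-scan-redact.v1 schema.
type RedactionBox = {
  field: string;
  x: number;
  y: number;
  w: number;
  h: number;
  confidence: number;
};

// Paints every box opaque black, in place, on a tightly packed RGBA buffer.
// Boxes are clamped to the image bounds. Returns the number of pixel writes
// (overlapping boxes count the shared pixels more than once).
function applyRedactionBoxes(
  pixels: Uint8Array,
  width: number,
  height: number,
  boxes: RedactionBox[],
): number {
  if (pixels.length !== width * height * 4) {
    throw new RangeError('buffer size does not match image dimensions');
  }
  let writes = 0;
  for (const box of boxes) {
    const x0 = Math.max(0, Math.floor(box.x));
    const y0 = Math.max(0, Math.floor(box.y));
    const x1 = Math.min(width, Math.ceil(box.x + box.w));
    const y1 = Math.min(height, Math.ceil(box.y + box.h));
    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        const i = (y * width + x) * 4;
        pixels[i] = 0;       // R
        pixels[i + 1] = 0;   // G
        pixels[i + 2] = 0;   // B
        pixels[i + 3] = 255; // A: fully opaque so nothing shows through
        writes++;
      }
    }
  }
  return writes;
}
```

Because the model only returns coordinates and field types, never the values, the step that actually destroys PII is this plain pixel overwrite, which is reproducible and auditable.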

Routing: cloud only (Cloud DLP image redaction API + Vertex Gemini multimodal as fallback). Never edge — model size and PII sensitivity preclude on-device.

3.4 `exif-leak-describe.v1` (advisory only)

Used to localize a human-readable warning in backoffice when the optimizer detects GPS coordinates or owner metadata in a theme_asset or property_photo. Output is a localized string; no state change.

4. Routing policy (what the orchestrator decides)

The orchestrator chooses model tier via the ai-orchestrator-service policy engine; this service's call sites pass the following hints:

| Surface | Tier preference | Edge fallback acceptable? |
|---|---|---|
| Image safety | cloud | yes (with `degraded_mode` event attribute; stricter quarantine threshold on edge: borderline → quarantined) |
| Alt-text | cloud | yes (lower-quality acceptable) |
| OCR redact | cloud | no; fail fast, retry next sweep window |
| EXIF describe | cloud | yes (no state change either way) |

A tenant on a low-bandwidth POP that has degraded cloud connectivity gets the edge route automatically; the resulting AIProvenance.routing.tier='edge' makes the choice auditable.
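
The hints in the table above, together with the stricter edge rule for image safety, could be encoded roughly as follows. The `ROUTING_HINTS` constant, the `FileOutcome` type, and both function names are hypothetical; the surface keys reuse the `ai.purpose` values listed in §8, and the actual tier choice remains with the orchestrator's policy engine.

```typescript
type Surface = 'image_safety' | 'alt_text' | 'ocr_redact' | 'exif_describe';
type Tier = 'edge' | 'cloud';
type Verdict = 'safe' | 'borderline' | 'unsafe';
type FileOutcome = 'proceed' | 'hold_for_hitl' | 'quarantine';

// Per-surface hints passed by this service's call sites.
const ROUTING_HINTS: Record<Surface, { preferred: Tier; edgeFallbackOk: boolean }> = {
  image_safety: { preferred: 'cloud', edgeFallbackOk: true },
  alt_text: { preferred: 'cloud', edgeFallbackOk: true },
  ocr_redact: { preferred: 'cloud', edgeFallbackOk: false }, // fail fast instead
  exif_describe: { preferred: 'cloud', edgeFallbackOk: true },
};

function canFallBackToEdge(surface: Surface): boolean {
  return ROUTING_HINTS[surface].edgeFallbackOk;
}

// Maps a safety verdict to a file outcome. On the edge tier a borderline
// verdict is escalated straight to quarantine, per the table above.
function outcomeForVerdict(verdict: Verdict, tier: Tier): FileOutcome {
  if (verdict === 'unsafe') return 'quarantine';
  if (verdict === 'borderline') {
    return tier === 'edge' ? 'quarantine' : 'hold_for_hitl';
  }
  return 'proceed';
}
```

Reading the tier from `AIProvenance.routing.tier` rather than from the request keeps the escalation tied to the route that was actually taken.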

5. Safety controls

Pre-call safety: orchestrator denies if the request payload itself trips its content classifier (e.g., a malicious filename containing prompt-injection text). Surfaces as MELMASTOON.AI.REFUSED_SAFETY.
Post-call safety: orchestrator re-runs a safety classifier on the model output (e.g., alt text containing slurs). Surfaces as MELMASTOON.AI.REFUSED_SAFETY and the alt text is dropped (file remains ready without alt text).
Budget: per-tenant per-purpose budget tracked by orchestrator; exceeded → MELMASTOON.AI.REFUSED_BUDGET (file remains ready, no alt text or safety verdict; emits a degraded_mode log line).
HITL: required for surface 1 on a borderline verdict and always for surface 3; the HITL queue lives in ai-orchestrator-service. The file is held in scanning status (gated reads) until HITL resolves. The review SLO is 30 min during business hours; a hold still unresolved after 24 h is auto-quarantined.
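
The 24 h auto-quarantine rule for held files can be sketched as a pure function over the HITL status and the hold start time. The names `resolveHold` and `HoldResolution` are hypothetical, and the mapping of `approved` to release and `rejected` to quarantine is an assumption of this sketch; this document only states that the file is held until HITL resolves.

```typescript
type HitlStatus = 'pending' | 'approved' | 'rejected';
type HoldResolution = 'keep_holding' | 'release' | 'quarantine';

// 24 h limit stated in this document.
const HITL_AUTO_QUARANTINE_MS = 24 * 60 * 60 * 1000;

// Decides what to do with a file held for HITL review.
// Assumed mapping: approved -> release, rejected -> quarantine.
function resolveHold(status: HitlStatus, heldSince: Date, now: Date): HoldResolution {
  if (status === 'approved') return 'release';
  if (status === 'rejected') return 'quarantine';
  const heldForMs = now.getTime() - heldSince.getTime();
  return heldForMs > HITL_AUTO_QUARANTINE_MS ? 'quarantine' : 'keep_holding';
}
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the deadline logic testable without timers.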

6. Sequence — image safety + alt text on upload

[scan worker] scan.passed.v1
   │
   ▼
[file-storage-service] InternalScanCallbackUseCase
   ├─ recordScanPassed → status=ready
   ├─ enqueue ImageOptimizerPort (variants)
   └─ AIClient.classifyImageSafety  ──┐
                                       │ orchestrator → Gemini vision
                                       ▼
                                 verdict + provenance
   ├─ if 'safe'      → AIClient.generateAltText → applyAIAltText if conf>=0.75
   ├─ if 'borderline'→ holdFile=true; orchestrator opens HITL; stays 'scanning'
   └─ if 'unsafe'    → recordScanFailed (synthetic ai_safety scanResult); status=quarantined
                                                                                   ⤷ emits scan.failed.v1

The synthetic ScanResult row carries scanner='cloud_dlp' and threats=['ai_safety:<category>'] so SIEM treats it identically to a malware quarantine.
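
A sketch of how the synthetic row might be assembled from the classifier's categories follows. The `scanner` value and the `ai_safety:<category>` threat format come from this document; the `buildSyntheticScanResult` name, the `minScore` cut-off of 0.5, and the fallback to the single highest-scoring category are assumptions made for illustration.

```typescript
type SafetyCategory = {
  name: 'sexual' | 'violent' | 'hate' | 'illegal' | 'gore' | 'self_harm';
  score: number;
};

// Trimmed stand-in for the ScanResult row: only the two fields named above.
type SyntheticScanResult = { scanner: 'cloud_dlp'; threats: string[] };

// Builds the threat list for an 'unsafe' verdict. Categories at or above
// minScore are listed highest first; if none qualify, the top category is
// used so the row is not empty. Both rules are assumptions of this sketch.
function buildSyntheticScanResult(
  categories: SafetyCategory[],
  minScore = 0.5,
): SyntheticScanResult {
  const sorted = [...categories].sort((a, b) => b.score - a.score);
  const flagged = sorted.filter((c) => c.score >= minScore);
  const chosen = flagged.length > 0 ? flagged : sorted.slice(0, 1);
  return {
    scanner: 'cloud_dlp',
    threats: chosen.map((c) => `ai_safety:${c.name}`),
  };
}
```

For example, categories scored `sexual: 0.9`, `hate: 0.6`, `gore: 0.2` would yield `threats: ['ai_safety:sexual', 'ai_safety:hate']`.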

7. Failure modes

| Failure | Effect | Detection |
|---|---|---|
| Orchestrator unreachable | Image stays `ready` without alt text or safety verdict; AIProvenance not written; `degraded_mode` log + dashboard alert | `ai_calls_failed_total{purpose='image_safety'}` |
| `MELMASTOON.AI.REFUSED_BUDGET` | Same as above; alt text skipped this period | budget dashboard |
| OCR redact fails repeatedly (3 attempts) | Original ID scan retained; sweeper retries next window; alert at 24 h backlog | `ai_redact_failed_total` |
| HITL backlog > 24 h | Auto-quarantine borderline images | dashboard `hitl_oldest_age_seconds` |
| Model roll (modelRef change) | Next batch carries new modelRef in `AIProvenance`; tooling diffs results | nightly drift report |
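
The OCR-redact row above can be read as a small planning rule: retry up to three times, then defer to the next sweep window while keeping the original, and raise an alert once the backlog reaches 24 h. The sketch below encodes that reading. `planAfterRedactFailure`, `RedactPlan`, and the notion of a `firstDueAt` timestamp for measuring backlog age are hypothetical.

```typescript
// Limits stated in the failure-modes table.
const MAX_REDACT_ATTEMPTS = 3;
const BACKLOG_ALERT_MS = 24 * 60 * 60 * 1000;

type RedactPlan = {
  action: 'retry' | 'defer_to_next_window';
  retainOriginal: true; // the original ID scan is never purged on failure
  alert: boolean;
};

// attemptsSoFar counts failed attempts in the current sweep window,
// including the one that just failed. firstDueAt is when redaction first
// became due (an assumed field used here to measure backlog age).
function planAfterRedactFailure(
  attemptsSoFar: number,
  firstDueAt: Date,
  now: Date,
): RedactPlan {
  const backlogMs = now.getTime() - firstDueAt.getTime();
  return {
    action: attemptsSoFar < MAX_REDACT_ATTEMPTS ? 'retry' : 'defer_to_next_window',
    retainOriginal: true,
    alert: backlogMs >= BACKLOG_ALERT_MS,
  };
}
```

Typing `retainOriginal` as the literal `true` documents in the type itself that no failure path may purge the unredacted scan.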

8. Observability hooks

Per AI call, this service records:

span attribute ai.purpose (image_safety / alt_text / ocr_redact / exif_describe)
span attribute ai.modelRef, ai.cost_micro_usd, ai.tier
counter file_storage_ai_calls_total{purpose,outcome}
histogram file_storage_ai_latency_seconds{purpose,tier}
gauge file_storage_ai_hitl_pending

Dashboard panels in OBSERVABILITY §4.
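
The span attributes listed above map directly onto fields of the returned AIProvenance. A minimal sketch of that mapping is below; the attribute keys are the ones named in this section, while `toSpanAttributes`, `latencySeconds`, and the trimmed `ProvenanceForTelemetry` type are hypothetical helpers that do not depend on any particular tracing library.

```typescript
type Purpose = 'image_safety' | 'alt_text' | 'ocr_redact' | 'exif_describe';

// Only the AIProvenance fields that feed telemetry.
type ProvenanceForTelemetry = {
  modelRef: string;
  costMicroUsd: number;
  latencyMs: number;
  routing: { tier: 'edge' | 'cloud' };
};

// Builds the per-call span attributes from the purpose and provenance.
function toSpanAttributes(
  purpose: Purpose,
  p: ProvenanceForTelemetry,
): Record<string, string | number> {
  return {
    'ai.purpose': purpose,
    'ai.modelRef': p.modelRef,
    'ai.cost_micro_usd': p.costMicroUsd,
    'ai.tier': p.routing.tier,
  };
}

// The latency histogram is in seconds; provenance reports milliseconds.
function latencySeconds(p: ProvenanceForTelemetry): number {
  return p.latencyMs / 1000;
}
```

The resulting record can be handed to whatever span or metrics API the service uses, with `ai.tier` doubling as the `tier` label on the latency histogram.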

9. Privacy boundary with the orchestrator

For image safety / alt text on public_media scopes: the orchestrator may fetch bytes via a short-lived signed URL (TTL 60 s) but never stores them.
For OCR redact on pii_id_scan: the orchestrator runs in a dedicated PII-cleared project (melmastoon-ai-pii-prod) with stricter logging suppression. Bytes are processed in-memory only; no transient bucket.
All AI calls cross the boundary over mTLS; each request carries the caller's tenantId and purpose in the bearer JWT claims so that the orchestrator's budgeting and audit can attribute it correctly.

10. Test anchors

application/use-cases/internal-scan-callback.spec.ts — verifies AI call sequencing on scan.passed.v1.
application/use-cases/ocr-redact.saga.spec.ts — verifies that redaction creates a new FileObject with tags=['redacted'] and that the original is hard-purged unless held.
infrastructure/ai/ai-client-http.adapter.spec.ts — contract test against orchestrator OpenAPI.
domain/file-object/ai-provenance.spec.ts — refuses to apply alt text without provenance.
