file-storage-service — AI_INTEGRATION
Companion: SERVICE_OVERVIEW · APPLICATION_LOGIC §3.2 · SECURITY_MODEL · 08 AI Architecture · .cursor/rules/95-ai.mdc
This service performs no AI inference itself. Every AI call is delegated to ai-orchestrator-service over an internal HTTP port (AIClient); the orchestrator handles model routing (Vertex AI Gemini for cloud, ONNX Runtime for edge), prompt safety, output schema validation, budget enforcement, HITL queues, and AIProvenance stamping. file-storage-service consumes orchestrated results, persists them on the relevant FileObject row, and emits the corresponding domain events.
The four AI surfaces used by file-storage-service are: (1) image safety classification, (2) alt-text generation, (3) PII redaction on ID scans, (4) EXIF / metadata redaction validation. All four are assistive — they may fail, refuse, or be skipped without breaking the upload flow; the only AI-gated state transition is quarantined on a hard safety violation.
1. AI surfaces in this service
| # | Purpose | Trigger | Model class | HITL | Required AIProvenance |
|---|---|---|---|---|---|
| 1 | Image safety classifier | After scan.passed.v1 for property_photo, theme_asset, tenant_logo, notification_attachment | Vision classifier (Gemini 2.x or onnx fallback) | Auto-quarantine on unsafe; HITL for borderline | yes |
| 2 | Alt-text generation | After safety pass for property_photo | Vision-language LLM | Only below threshold (auto-apply if confidence ≥ 0.75; otherwise queue for staff review) | yes |
| 3 | OCR + PII redaction | Retention sweep for pii_id_scan policy at redactionAfterDays | DLP / Gemini multimodal | yes (Tenant.Compliance preview) | yes |
| 4 | EXIF / metadata sanitization | Optimizer pipeline | Deterministic library (sharp); LLM only to flag suspicious EXIF leaks | none | no (deterministic) |
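The sanitization in surface 4 is deterministic, not an AI call; it is included for completeness because the optimizer worker rejects images with embedded GPS / camera-owner metadata in PII-sensitive scopes. The only AI involvement is the advisory exif-leak-describe.v1 call (§3.4), which changes no state.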
2. The orchestrator port
// application/ports/ai-client.ts
export interface AIClient {
  classifyImageSafety(req: ClassifyImageSafetyReq): Promise<ClassifyImageSafetyRes>;
  generateAltText(req: GenerateAltTextReq): Promise<GenerateAltTextRes>;
  ocrRedact(req: OcrRedactReq): Promise<OcrRedactRes>;
  describeExifLeak(req: DescribeExifLeakReq): Promise<DescribeExifLeakRes>;
}

export type ClassifyImageSafetyReq = {
  tenantId: TenantId;
  fileObjectId: FileObjectId;
  signedReadUrl: string; // short-lived; orchestrator fetches bytes
  scope: Scope;
  preferredLocale: Locale;
};

export type ClassifyImageSafetyRes = {
  verdict: 'safe' | 'borderline' | 'unsafe';
  categories: { name: 'sexual' | 'violent' | 'hate' | 'illegal' | 'gore' | 'self_harm'; score: number }[];
  reason: string | null; // localized
  hitlDecisionId: DecisionId | null;
  provenance: AIProvenance;
};
The orchestrator returns AIProvenance shaped per @ghasi/contracts-melmastoon:
export type AIProvenance = {
  decisionId?: DecisionId; // dec_<ULID>
  modelRef: string; // 'vertex/gemini-2.5-flash@2026-04-01'
  promptHash: string; // sha256 of (system + user template id)
  inputTokens: number;
  outputTokens: number;
  costMicroUsd: number;
  latencyMs: number;
  safety: { preCheck: 'pass' | 'fail'; postCheck: 'pass' | 'fail' };
  hitl: { required: boolean; status?: 'pending' | 'approved' | 'rejected' };
  routing: { tier: 'edge' | 'cloud'; reason: string };
  createdAt: string;
};
Persisted as file_objects.ai_provenance jsonb. Missing provenance on an AI-derived attribute fails the MELMASTOON.AI.PROVENANCE_MISSING guard at the ORM layer.
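The guard described above could look roughly like the following. This is a minimal sketch, not the service's actual ORM hook: the `FileObjectPatch` shape, the `AI_DERIVED_ATTRIBUTES` list, and the `assertProvenance` name are assumptions introduced for illustration, and `AIProvenance` is trimmed to three fields.

```typescript
// Trimmed stand-in for the contract type (assumption for this sketch).
type AIProvenance = { modelRef: string; promptHash: string; createdAt: string };

// Hypothetical shape of a pending write to a file_objects row.
type FileObjectPatch = {
  altText?: string;
  aiProvenance?: AIProvenance | null;
};

class ProvenanceMissingError extends Error {
  readonly code = "MELMASTOON.AI.PROVENANCE_MISSING";
  constructor(attribute: string) {
    super(`AI-derived attribute "${attribute}" written without AIProvenance`);
    this.name = "ProvenanceMissingError";
  }
}

// Attributes treated as AI-derived here (hypothetical list).
const AI_DERIVED_ATTRIBUTES = ["altText"] as const;

// Throws when the patch sets an AI-derived attribute but carries no provenance.
function assertProvenance(patch: FileObjectPatch): void {
  for (const attr of AI_DERIVED_ATTRIBUTES) {
    if (patch[attr] !== undefined && !patch.aiProvenance) {
      throw new ProvenanceMissingError(attr);
    }
  }
}
```

Calling such a check immediately before the row is persisted would keep the rule in one place regardless of which use case produced the attribute.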
3. Prompt templates
Templates live in services/file-storage-service/contracts/ai/prompts/*.md and are versioned by hash. Each carries a template_id that becomes part of promptHash.
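As an illustration of how a template_id can feed into promptHash, the sketch below hashes the system prompt together with the template id using Node's built-in crypto module. The exact concatenation order and the newline separator are assumptions; the source only states that the hash covers "system + user template id".

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: the byte layout (system prompt, newline, template id)
// is an assumption of this sketch, not the orchestrator's documented format.
function computePromptHash(systemPrompt: string, templateId: string): string {
  return createHash("sha256")
    .update(systemPrompt, "utf8")
    .update("\n", "utf8") // separator so "ab" + "c" and "a" + "bc" hash differently
    .update(templateId, "utf8")
    .digest("hex");
}
```

Because the template id is part of the hashed material, bumping image-safety.v1 to a later version yields a different promptHash even if the system text is unchanged.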
3.1 image-safety.v1
SYSTEM: You are a hotel-asset image-safety classifier. Categorize the image strictly into:
safe — appropriate for a hotel public listing or tenant booking site.
borderline — minor concerns (e.g., partial nudity that may be artistic, ambiguous violence).
unsafe — sexual, gore, hate symbols, illegal activity, child safety violations.
Respond with ONLY a JSON object matching the schema:
{ "verdict": "safe"|"borderline"|"unsafe",
"categories": [{"name": <enum>, "score": <0..1>}],
"reason": <short string in {locale}> }
Hotel context: this image is being uploaded as a property photo for a hotel listing. Bedrooms, bathrooms, lobbies, food, exterior, and pools are expected.
Routing: cloud (Gemini 2.x vision); fallback to edge ONNX safety model on cloud unavailability with a degraded_mode=true hint propagated to the consumer.
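Output schema validation happens in the orchestrator, so this service never parses raw model text. Purely to illustrate what checking the image-safety.v1 schema involves, here is a hedged sketch of a strict parser; `parseSafetyOutput` and its reject-on-anything-unexpected behavior are assumptions.

```typescript
type Verdict = "safe" | "borderline" | "unsafe";
type CategoryName = "sexual" | "violent" | "hate" | "illegal" | "gore" | "self_harm";
type SafetyOutput = {
  verdict: Verdict;
  categories: { name: CategoryName; score: number }[];
  reason: string | null;
};

const VERDICTS: readonly string[] = ["safe", "borderline", "unsafe"];
const CATEGORY_NAMES: readonly string[] = [
  "sexual", "violent", "hate", "illegal", "gore", "self_harm",
];

// Returns the parsed object, or null if the text is not valid JSON
// or does not match the schema (hypothetical helper).
function parseSafetyOutput(raw: string): SafetyOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const o = data as Record<string, unknown>;
  if (typeof o.verdict !== "string" || VERDICTS.indexOf(o.verdict) === -1) return null;
  if (!Array.isArray(o.categories)) return null;
  for (const c of o.categories) {
    if (typeof c !== "object" || c === null) return null;
    const { name, score } = c as Record<string, unknown>;
    if (typeof name !== "string" || CATEGORY_NAMES.indexOf(name) === -1) return null;
    if (typeof score !== "number" || !(score >= 0 && score <= 1)) return null;
  }
  if (o.reason !== null && typeof o.reason !== "string") return null;
  return o as unknown as SafetyOutput;
}
```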
3.2 alt-text.v1
SYSTEM: You write a single-sentence accessible alt text for a hotel listing photo.
Constraints:
- Max 140 characters.
- Describe the visual subject: room type, view, time of day, distinctive feature.
- Do NOT mention the brand, the photographer, or non-visible facts.
- Output language: {locale}.
Output ONLY a JSON object: { "altText": <string>, "confidence": <0..1>, "tags": [<short keyword>...] }
Routing: cloud preferred (better captions); edge fallback yields shorter generic captions.
Auto-apply threshold = confidence ≥ 0.75. Below threshold → row is queued in ai-orchestrator-service's HITL queue with decisionRef stamped on the FileObject; staff review in backoffice → on approval the alt text is written via applyAIAltText.
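The threshold rule can be expressed as a small pure function. The sketch below is illustrative only: `decideAltText` is a hypothetical name, and routing over-length captions to review (rather than truncating or rejecting them) is an assumption, since the source specifies the 140-character limit only as a prompt constraint.

```typescript
const AUTO_APPLY_THRESHOLD = 0.75; // from alt-text.v1
const MAX_ALT_TEXT_CHARS = 140;    // prompt constraint

type AltTextResult = { altText: string; confidence: number };
type AltTextDecision =
  | { action: "apply"; altText: string }
  | { action: "queue_for_review"; altText: string };

// Applies the caption automatically only when the model is confident enough.
// Assumption: captions over the length limit also go to staff review.
function decideAltText(res: AltTextResult): AltTextDecision {
  // Note: .length counts UTF-16 code units, which is an approximation of "characters".
  const withinLimit = res.altText.length <= MAX_ALT_TEXT_CHARS;
  if (res.confidence >= AUTO_APPLY_THRESHOLD && withinLimit) {
    return { action: "apply", altText: res.altText };
  }
  return { action: "queue_for_review", altText: res.altText };
}
```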
3.3 id-scan-redact.v1
SYSTEM: You receive an ID document scan (passport / national ID / driver licence). Identify regions containing PII fields:
- given name, family name, date of birth, document number, machine-readable zone (MRZ), address, photo, signature.
Output a redaction map: list of bounding boxes (x,y,width,height in pixels) and the field type for each. Do NOT transcribe the values themselves.
Schema: { "boxes": [ { "field": <enum>, "x": int, "y": int, "w": int, "h": int, "confidence": <0..1> } ] }
The redaction map is then applied deterministically by the optimizer worker (sharp + black rectangle composite). The redacted bytes become a new FileObject with aliasOfFileId = original.id and tags += ['redacted','pii_redacted']. The original is hard-purged (subject to legal hold). The downstream reservation-service consumer receives a melmastoon.file.optimization.completed.v1 and updates the Guest.idScanRef to the redacted file id.
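Before the boxes are composited, they need to be clamped to the image bounds, because a model can return coordinates that overshoot the edges. The sketch below shows one way to do that; `toOverlays` and the `Overlay` shape are hypothetical, and the comment about sharp describes how the result might be fed to it rather than what the optimizer worker actually does.

```typescript
type RedactionBox = { field: string; x: number; y: number; w: number; h: number; confidence: number };
type Overlay = { left: number; top: number; width: number; height: number };

// Clamps each box to the image and drops boxes that end up empty.
// Rounding is outward (floor/ceil) so that partial pixels are still covered.
function toOverlays(boxes: RedactionBox[], imageWidth: number, imageHeight: number): Overlay[] {
  const overlays: Overlay[] = [];
  for (const b of boxes) {
    const left = Math.max(0, Math.floor(b.x));
    const top = Math.max(0, Math.floor(b.y));
    const right = Math.min(imageWidth, Math.ceil(b.x + b.w));
    const bottom = Math.min(imageHeight, Math.ceil(b.y + b.h));
    if (right > left && bottom > top) {
      overlays.push({ left, top, width: right - left, height: bottom - top });
    }
  }
  return overlays;
}

// Each overlay could then become one entry of sharp's composite() input, e.g.
// { input: { create: { width, height, channels: 4,
//     background: { r: 0, g: 0, b: 0, alpha: 1 } } }, left, top }
// (assumed wiring; not taken from the optimizer worker's code).
```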
Routing: cloud only (Cloud DLP image redaction API + Vertex Gemini multimodal as fallback). Never edge — model size and PII sensitivity preclude on-device.
3.4 exif-leak-describe.v1 (advisory only)
Used to localize a human-readable warning in backoffice when the optimizer detects GPS coordinates or owner metadata in a theme_asset or property_photo. Output is a localized string; no state change.
4. Routing policy (what the orchestrator decides)
The orchestrator chooses model tier via the ai-orchestrator-service policy engine; this service's call sites pass the following hints:
| Surface | Tier preference | Edge fallback acceptable? |
|---|---|---|
| Image safety | cloud | yes (with degraded_mode event attribute; stricter quarantine threshold on edge: borderline → quarantined) |
| Alt-text | cloud | yes (lower-quality acceptable) |
| OCR redact | cloud | no — fail fast, retry next sweep window |
| EXIF describe | cloud | yes (no state change either way) |
A tenant on a low-bandwidth POP that has degraded cloud connectivity gets the edge route automatically; the resulting AIProvenance.routing.tier='edge' makes the choice auditable.
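The hints table and the stricter edge rule for image safety can be captured as data plus one mapping function. This is a sketch under assumptions: the `ROUTING_HINTS` constant, `canRoute`, and `outcomeFor` are illustrative names, and the real decision is made by the orchestrator's policy engine.

```typescript
type Surface = "image_safety" | "alt_text" | "ocr_redact" | "exif_describe";
type Tier = "edge" | "cloud";
type Verdict = "safe" | "borderline" | "unsafe";
type FileOutcome = "ready" | "hold_for_hitl" | "quarantined";

// Mirrors the hints table: every surface prefers cloud; only OCR redact forbids edge.
const ROUTING_HINTS: Record<Surface, { preferredTier: Tier; edgeFallbackAllowed: boolean }> = {
  image_safety: { preferredTier: "cloud", edgeFallbackAllowed: true },
  alt_text: { preferredTier: "cloud", edgeFallbackAllowed: true },
  ocr_redact: { preferredTier: "cloud", edgeFallbackAllowed: false },
  exif_describe: { preferredTier: "cloud", edgeFallbackAllowed: true },
};

function canRoute(surface: Surface, tier: Tier): boolean {
  return tier === "cloud" || ROUTING_HINTS[surface].edgeFallbackAllowed;
}

// On the edge tier a borderline verdict is treated as a quarantine,
// because the fallback model is less trusted than the cloud classifier.
function outcomeFor(verdict: Verdict, tier: Tier): FileOutcome {
  if (verdict === "unsafe") return "quarantined";
  if (verdict === "borderline") return tier === "edge" ? "quarantined" : "hold_for_hitl";
  return "ready";
}
```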
5. Safety controls
- Pre-call safety: orchestrator denies if the request payload itself trips its content classifier (e.g., a malicious filename containing prompt-injection text). Surfaces as MELMASTOON.AI.REFUSED_SAFETY.
- Post-call safety: orchestrator re-runs a safety classifier on the model output (e.g., alt text containing slurs). Surfaces as MELMASTOON.AI.REFUSED_SAFETY and the alt text is dropped (file remains ready without alt text).
- Budget: per-tenant per-purpose budget tracked by orchestrator; exceeded → MELMASTOON.AI.REFUSED_BUDGET (file remains ready, no alt text or safety verdict; emits a degraded_mode log line).
- HITL: for surface 1 borderline and surface 3 always; HITL queue lives in ai-orchestrator-service. The file is held in scanning status (gated reads) until HITL resolves; SLO 30 min business-hours, otherwise auto-quarantine after 24 h.
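The HITL hold and its 24 h timeout can be sketched as a pure evaluation over the decision status and its age. The function name, the outcome labels, and the treatment of a rejected decision as a quarantine are assumptions; the business-hours calendar for the 30 min SLO is deliberately ignored here.

```typescript
const HITL_SLO_MS = 30 * 60 * 1000;                 // 30 min (business-hours nuance omitted)
const HITL_AUTO_QUARANTINE_MS = 24 * 60 * 60 * 1000; // 24 h

type HitlStatus = "pending" | "approved" | "rejected";
type HoldOutcome = "release" | "quarantine" | "keep_holding";

// Decides what to do with a file held for human review.
// Assumption: a rejected decision quarantines the file.
function evaluateHold(
  status: HitlStatus,
  openedAtMs: number,
  nowMs: number,
): { outcome: HoldOutcome; sloBreached: boolean } {
  if (status === "approved") return { outcome: "release", sloBreached: false };
  if (status === "rejected") return { outcome: "quarantine", sloBreached: false };
  const ageMs = nowMs - openedAtMs;
  return {
    outcome: ageMs >= HITL_AUTO_QUARANTINE_MS ? "quarantine" : "keep_holding",
    sloBreached: ageMs > HITL_SLO_MS,
  };
}
```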
6. Sequence — image safety + alt text on upload
[scan worker] scan.passed.v1
      │
      ▼
[file-storage-service] InternalScanCallbackUseCase
   ├─ recordScanPassed → status=ready
   ├─ enqueue ImageOptimizerPort (variants)
   └─ AIClient.classifyImageSafety ──┐
                                     │ orchestrator → Gemini vision
                                     ▼
                            verdict + provenance
                              ├─ if 'safe'       → AIClient.generateAltText → applyAIAltText if conf>=0.75
                              ├─ if 'borderline' → holdFile=true; orchestrator opens HITL; stays 'scanning'
                              └─ if 'unsafe'     → recordScanFailed (synthetic ai_safety scanResult); status=quarantined
                                                   ⤷ emits scan.failed.v1
The synthetic ScanResult row carries scanner='cloud_dlp' and threats=['ai_safety:<category>'] so SIEM treats it identically to a malware quarantine.
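A sketch of how the synthetic row might be built from a classifier response follows. The source does not say which category ends up in the threat string, so picking the highest-scoring one is an assumption, as are the helper name, the trimmed `SyntheticScanResult` shape, and the `unknown` fallback for an empty category list.

```typescript
type CategoryScore = { name: string; score: number };
type SyntheticScanResult = { scanner: "cloud_dlp"; threats: string[] };

// Builds the synthetic scan result for an 'unsafe' verdict.
// Assumption: the threat label uses the highest-scoring category.
function buildAiSafetyScanResult(categories: CategoryScore[]): SyntheticScanResult {
  let top: CategoryScore | null = null;
  for (const c of categories) {
    if (top === null || c.score > top.score) top = c;
  }
  const category = top ? top.name : "unknown"; // hypothetical fallback label
  return { scanner: "cloud_dlp", threats: [`ai_safety:${category}`] };
}
```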
7. Failure modes
| Failure | Effect | Detection |
|---|---|---|
| Orchestrator unreachable | image stays ready without alt text or safety verdict; AIProvenance not written; degraded_mode log + dashboard alert | ai_calls_failed_total{purpose='image_safety'} |
| MELMASTOON.AI.REFUSED_BUDGET | same as above; alt text skipped this period | budget dashboard |
| OCR redact fails repeatedly (3 attempts) | original ID scan retained; sweeper re-tries next window; alert at 24 h backlog | ai_redact_failed_total |
| HITL backlog > 24 h | auto-quarantine borderline images | dashboard hitl_oldest_age_seconds |
| Model rolls (modelRef change) | next batch carries new modelRef in AIProvenance — tooling diffs results | nightly drift report |
8. Observability hooks
Per AI call, this service records:
- span attribute ai.purpose (image_safety / alt_text / ocr_redact / exif_describe)
- span attributes ai.modelRef, ai.cost_micro_usd, ai.tier
- counter file_storage_ai_calls_total{purpose,outcome}
- histogram file_storage_ai_latency_seconds{purpose,tier}
- gauge file_storage_ai_hitl_pending
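The span attributes listed above map directly onto fields of AIProvenance. A minimal sketch of that mapping, independent of any tracing library, is shown below; `aiSpanAttributes` and the trimmed `ProvenanceSubset` type are hypothetical.

```typescript
type Purpose = "image_safety" | "alt_text" | "ocr_redact" | "exif_describe";

// Only the AIProvenance fields needed for span attributes (trimmed for this sketch).
type ProvenanceSubset = {
  modelRef: string;
  costMicroUsd: number;
  routing: { tier: "edge" | "cloud" };
};

// Produces a flat attribute map that a tracing SDK could attach to the span.
function aiSpanAttributes(purpose: Purpose, p: ProvenanceSubset): Record<string, string | number> {
  return {
    "ai.purpose": purpose,
    "ai.modelRef": p.modelRef,
    "ai.cost_micro_usd": p.costMicroUsd,
    "ai.tier": p.routing.tier,
  };
}
```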
Dashboard panels in OBSERVABILITY §4.
9. Privacy boundary with the orchestrator
- For image safety / alt text on public_media scopes: the orchestrator may fetch bytes via short-lived signed URL (TTL 60 s), never store them.
- For OCR redact on pii_id_scan: the orchestrator runs in a dedicated PII-cleared project (melmastoon-ai-pii-prod) with stricter logging suppression. Bytes are processed in-memory only; no transient bucket.
- All AI calls cross the boundary over mTLS; the orchestrator carries the caller tenantId + purpose in the bearer JWT claims so its budgeting and audit can attribute correctly.
10. Test anchors
- application/use-cases/internal-scan-callback.spec.ts: verifies AI call sequencing on scan.passed.v1.
- application/use-cases/ocr-redact.saga.spec.ts: verifies redaction creates a new FileObject with tags=['redacted'], original is hard-purged unless held.
- infrastructure/ai/ai-client-http.adapter.spec.ts: contract test against orchestrator OpenAPI.
- domain/file-object/ai-provenance.spec.ts: refuses to apply alt text without provenance.