
AI_INTEGRATION — theme-config-service

Sibling: APPLICATION_LOGIC · DOMAIN_MODEL · SECURITY_MODEL

Platform anchors: docs/08-ai-architecture.md

theme-config-service is an AI-assisted authoring service but never an AI-autonomous service. Every AI surface is gated by a human in the loop (HITL): the model drafts, a human approves, and the system applies. This document specifies the integration contract, prompt structure, model routing, redaction, provenance, evaluation, and rollback behaviour.


1. AI surfaces

| Surface | Purpose | Trigger | Output | Auto-applied? |
|---|---|---|---|---|
| Palette suggestion | Derive secondary/accent/status tokens from one primary color + brand keywords | POST /themes/:id/ai-suggest-palette | Partial<ColorTokens> | No (HITL) |
| Translation drafting | Draft missing locale entries from the default-locale source | POST /themes/:id/ai-draft-translations | Map<key, translatedText> | No (HITL) |
| Content drafting | Draft a content block in a target locale (e.g. "About us" in ar-SA from en-US source) | POST /content-blocks/:id/ai-draft | MarkupEntry | No (HITL) |
| Contrast remediation suggestion | When a token pair fails WCAG AA, propose an adjusted hex that passes | Inline editor button | Partial<ColorTokens> | No (HITL) |
| Layout preset recommendation | Suggest a layout preset based on tenant's property type + photos | Backoffice "Suggest layout" affordance | LayoutSelections | No (HITL) |
| Copy proofreading | Detect typos, tone inconsistencies, and brand-voice drift in copy strings | Inline editor lint | Suggestion[] | No (HITL) |

There is no streaming AI surface in this service — all calls are request/response and bounded by an orchestrator-side budget.


2. AI orchestration

All AI calls are routed through ai-orchestrator-service. theme-config-service does not hold model API keys; it speaks only to the orchestrator over mTLS using its workload identity:

interface AIClient {
  suggestPalette(input: SuggestPaletteInput): Promise<{ tokens: Partial<ColorTokens>; provenance: AIProvenance }>;
  draftTranslations(input: DraftTranslationsInput): Promise<{ entries: ReadonlyMap<string, string>; provenance: AIProvenance }>;
  draftContent(input: DraftContentInput): Promise<{ markup: MarkupEntry; provenance: AIProvenance }>;
  suggestContrastFix(input: SuggestContrastFixInput): Promise<{ tokens: Partial<ColorTokens>; provenance: AIProvenance }>;
  recommendLayout(input: RecommendLayoutInput): Promise<{ selections: Partial<LayoutSelections>; provenance: AIProvenance }>;
  proofreadCopy(input: ProofreadCopyInput): Promise<{ suggestions: ProofreadSuggestion[]; provenance: AIProvenance }>;
}

The orchestrator chooses the model per docs/08-ai-architecture.md §5:

| Surface | Default model | Fallback | Rationale |
|---|---|---|---|
| Palette suggestion | claude-3-5-sonnet (latest) | gpt-4.1-mini | Color reasoning is design-heavy; Sonnet's tool-use is reliable for emitting strict JSON |
| Translation drafting (high-resource locales: en, fr, ar, fa) | gpt-4.1-mini | gemini-2.0-flash | Cost-optimized; we still HITL |
| Translation drafting (low-resource: ps-AF, dialectal Dari) | claude-3-5-sonnet | local Pashto MT model on Vertex | Quality-critical for our home market |
| Content drafting | claude-3-5-sonnet | gpt-4.1 | Long-form quality + brand-safe output |
| Contrast remediation | deterministic algorithm + gpt-4.1-mini for naming | none | Algorithm is exact; LLM only names the new shade |
| Layout recommendation | gemini-2.0-pro (vision) | claude-3-5-sonnet (vision) | Vision input from property photos |
| Copy proofreading | gpt-4.1-mini | claude-3-5-haiku | Cheap, high-throughput |

Routing is the orchestrator's responsibility; this service treats it as a black box.


3. Prompt + response shapes

All prompts are versioned in services/theme-config-service/prompts/ and referenced by the orchestrator via a promptId + promptVersion envelope. Example shapes follow.

3.1 Palette suggestion

System prompt (id: theme.palette.system, v3):

You are a brand designer for a multi-tenant hotel SaaS. Given a primary brand color and optional brand keywords, derive a complete tenant palette that:

  1. Passes WCAG 2.1 AA contrast on every pair documented in the schema.
  2. Respects warm/cool harmony with the primary.
  3. Uses sober status colors (success / warning / error / info) that hotel guests will read as neutral and trustworthy.
  4. Outputs only the JSON object that conforms to the provided schema. Do not explain.

User prompt template:

Primary color: {{primaryColor}}
Brand keywords (optional): {{brandKeywordsJoined}}
Tenant market context (optional, for cultural color reading): {{marketContext}}
Schema: {{schemaSnippet}}

Response constraint: the orchestrator forces JSON-mode (response_format: json_schema) against theme.palette.suggestion.schema.v1.json. Any non-conforming response is rejected at the orchestrator and surfaced as MELMASTOON.AI.OUTPUT_INVALID.
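As an illustration of how the user prompt template above might be rendered before being handed to the orchestrator, here is a minimal sketch. The renderTemplate helper, the PromptEnvelope shape, and the renderedUserPrompt field are assumptions made for this example; only promptId and promptVersion are named in this document, and the real envelope may differ.

```typescript
// Hypothetical envelope: only promptId and promptVersion come from this document.
interface PromptEnvelope {
  promptId: string;
  promptVersion: string;
  renderedUserPrompt: string; // assumed field name
}

// Replaces {{name}} tokens; a missing optional variable renders as an empty string.
function renderTemplate(template: string, vars: Record<string, string | undefined>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => vars[name] ?? "");
}

const PALETTE_USER_TEMPLATE = [
  "Primary color: {{primaryColor}}",
  "Brand keywords (optional): {{brandKeywordsJoined}}",
  "Tenant market context (optional, for cultural color reading): {{marketContext}}",
  "Schema: {{schemaSnippet}}",
].join("\n");

function buildPaletteEnvelope(input: {
  primaryColor: string;
  brandKeywords?: string[];
  marketContext?: string;
  schemaSnippet: string;
}): PromptEnvelope {
  return {
    promptId: "theme.palette.system",
    promptVersion: "v3",
    renderedUserPrompt: renderTemplate(PALETTE_USER_TEMPLATE, {
      primaryColor: input.primaryColor,
      brandKeywordsJoined: input.brandKeywords?.join(", "),
      marketContext: input.marketContext,
      schemaSnippet: input.schemaSnippet,
    }),
  };
}
```

Rendering optional variables as empty strings keeps the line labels stable, so the model always sees the same four-line structure.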

3.2 Translation drafting

System prompt (id: theme.translate.system, v4):

You are translating UI strings for a hotel booking flow. Preserve ICU placeholders ({name}, {count, plural, one {…} other {…}}) exactly. Match the formality the hospitality industry uses in the target locale. Do not add commentary. Output only the JSON map of key → translated string.

User prompt template:

Source locale: {{sourceLocale}}
Target locale: {{targetLocale}}
Brand voice: {{brandVoiceShort}}
Hotel name: {{hotelName}}
Entries:
{{#each keys}}
- key: {{this.key}}
  source: {{this.sourceText}}
  context: {{this.context}}
{{/each}}

Response is JSON { "<key>": "<translated>" }. Validation: every input key must be present in the response, and the set of ICU placeholders in each translated string must equal the set in its source string.
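The two validation rules (key presence and placeholder parity) could be checked along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the placeholder scanner only tracks brace nesting, and it ignores ICU apostrophe quoting, which a production validator would need to handle.

```typescript
// Collects ICU argument names, including those nested inside plural/select branches.
// Assumption: no apostrophe-quoted braces appear in the message.
function icuPlaceholders(message: string): Set<string> {
  const names = new Set<string>();
  // "msg" = message text, "arg" = inside an argument's braces.
  const stack: Array<"msg" | "arg"> = ["msg"];
  for (let i = 0; i < message.length; i++) {
    const ch = message.charAt(i);
    const top = stack[stack.length - 1];
    if (ch === "{") {
      if (top === "msg") {
        // An argument starts here; its name runs up to ',' or '}'.
        const match = /^\s*([^\s,{}]+)/.exec(message.slice(i + 1));
        const name = match ? match[1] : undefined;
        if (name) names.add(name);
        stack.push("arg");
      } else {
        stack.push("msg"); // sub-message of a plural/select branch
      }
    } else if (ch === "}" && stack.length > 1) {
      stack.pop();
    }
  }
  return names;
}

function sameSet(a: Set<string>, b: Set<string>): boolean {
  if (a.size !== b.size) return false;
  let equal = true;
  a.forEach((value) => {
    if (!b.has(value)) equal = false;
  });
  return equal;
}

// Returns a list of problems; an empty list means the draft passes both rules.
function validateTranslationDraft(
  source: Record<string, string>,
  draft: Record<string, string>,
): string[] {
  const problems: string[] = [];
  Object.keys(source).forEach((key) => {
    const sourceText = source[key];
    const translated = draft[key];
    if (sourceText === undefined) return;
    if (translated === undefined) {
      problems.push(`missing key: ${key}`);
      return;
    }
    if (!sameSet(icuPlaceholders(sourceText), icuPlaceholders(translated))) {
      problems.push(`placeholder mismatch: ${key}`);
    }
  });
  return problems;
}
```

Comparing placeholder sets rather than raw substrings lets translators reorder arguments, which many target locales require, while still catching renamed or dropped ones.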

3.3 Content drafting (e.g. about-us in target locale)

The system prompt enforces brand voice and safety rails (no medical claims, no exaggerated guarantees). The user prompt carries property context (name, location, amenities, source markup). The output is a MarkupEntry, which is sanitised after generation through the same dompurify allow-list used for human-authored markup.

3.4 Contrast remediation

The deterministic component computes the minimum L* shift on the failing token to reach AA, in the direction (lighter or darker) that preserves the hue best. The LLM only produces a human-readable name like "Deeper Hazara Turquoise" for the new shade.
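The target that the deterministic component has to reach is the WCAG 2.1 AA contrast threshold (4.5:1 for normal text, 3:1 for large text). The check itself can be sketched as below, using the WCAG relative-luminance formula. The function names are illustrative, and the L* search that picks the adjusted shade is not shown because this document does not specify it.

```typescript
// Converts an 8-bit sRGB channel to linear light, per the WCAG 2.1 definition.
function channelToLinear(channel8bit: number): number {
  const c = channel8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex color such as "#1a2b3c".
function relativeLuminance(hex: string): number {
  const h = hex.replace(/^#/, "");
  if (!/^[0-9a-fA-F]{6}$/.test(h)) {
    throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  }
  const r = channelToLinear(parseInt(h.slice(0, 2), 16));
  const g = channelToLinear(parseInt(h.slice(2, 4), 16));
  const b = channelToLinear(parseInt(h.slice(4, 6), 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio is symmetric: the lighter color always goes in the numerator.
function contrastRatio(hexA: string, hexB: string): number {
  const la = relativeLuminance(hexA);
  const lb = relativeLuminance(hexB);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

function passesAA(foreground: string, background: string, largeText = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

For example, #767676 on white passes AA for normal text while the slightly lighter #777777 does not, which shows how small the remediation shift can be.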


4. Redaction & data minimisation

Per docs/08-ai-architecture.md §7, only the minimum necessary payload crosses the model trust boundary:

| Surface | Sent to model | Never sent to model |
|---|---|---|
| Palette suggestion | Primary color, brand keywords, market context (e.g. AF) | Tenant ID, tenant name, user ID, IP, full revenue / booking data |
| Translation drafting | Locale codes, source strings, brand voice short, hotel name | Tenant ID, guest data, booking data |
| Content drafting | Property name, location label, amenity short list, source markup | Guest data, internal pricing, revenue |
| Layout recommendation | Property type label, ≤ 6 hero photos | Personally identifying info in photos (faces auto-blurred by file-storage-service before send) |
| Copy proofreading | Locale, copy strings, brand voice short | Tenant ID, user ID, surrounding business data |

The orchestrator additionally:

  • Strips request headers (no Authorization, no X-Tenant-Id reach the model).
  • Replaces tenant identifiers with stable opaque tokens for the request span.
  • Logs a redacted prompt hash for audit; the raw prompt is encrypted and held for 30 days for incident response only.
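The first two orchestrator-side steps can be pictured with the sketch below. It is an assumption-laden illustration rather than the orchestrator's code: the function and class names and the token format (tenant_1, tenant_2, ...) are hypothetical, and only the two header names come from this document.

```typescript
// Header names from this document, compared case-insensitively.
const BLOCKED_HEADERS = ["authorization", "x-tenant-id"];

// Returns a copy of the headers without the ones that must never reach the model.
function stripSensitiveHeaders(headers: Record<string, string>): Record<string, string> {
  const kept: Record<string, string> = {};
  Object.keys(headers).forEach((name) => {
    const value = headers[name];
    if (value !== undefined && BLOCKED_HEADERS.indexOf(name.toLowerCase()) === -1) {
      kept[name] = value;
    }
  });
  return kept;
}

// One instance per request span: the same tenant ID always maps to the same
// opaque token within the span, and a new span starts with a fresh mapping.
class SpanTokenizer {
  private readonly tokens = new Map<string, string>();

  tokenFor(tenantId: string): string {
    let token = this.tokens.get(tenantId);
    if (token === undefined) {
      token = `tenant_${this.tokens.size + 1}`; // hypothetical token format
      this.tokens.set(tenantId, token);
    }
    return token;
  }
}
```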

5. Provenance

Every aggregate that carries AI-drafted content stores aiProvenance:

interface AIProvenance {
  model: string;                 // e.g. 'claude-3-5-sonnet-2026-01'
  modelProvider: 'anthropic' | 'openai' | 'google' | 'self_hosted';
  promptId: string;              // 'theme.palette.system'
  promptVersion: string;         // 'v3'
  promptHash: string;            // sha256 of the rendered prompt (post-redaction)
  responseHash: string;          // sha256 of the raw response
  tokensIn: number;
  tokensOut: number;
  latencyMs: number;
  costUsd: number;               // orchestrator-attributed
  createdAt: ISODate;
  redactionApplied: boolean;
  approverUserId?: UserId;       // set on apply
  approvedAt?: ISODate;
  approverNote?: string;
}

This data is:

  • Persisted on the aggregate (theme_versions.ai_provenance, content_blocks.ai_provenance, locale_packs.ai_provenance).
  • Echoed in the published bundle's meta.aiProvenance for downstream auditability.
  • Surfaced in theme.published.v1.aiProvenance for the audit service.
  • Used by HITL gating: any aggregate with non-null aiProvenance.model and null approverUserId blocks publish.
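The publish gate in the last bullet reduces to a small predicate. The sketch below assumes a narrowed view of AIProvenance with just the two fields the rule mentions; the blocksPublish name and the ProvenanceGateFields type are hypothetical.

```typescript
// Only the fields the gating rule inspects; the full shape is AIProvenance.
interface ProvenanceGateFields {
  model: string | null;
  approverUserId?: string | null;
}

// True when the aggregate carries AI-drafted content that no human has approved yet.
function blocksPublish(provenance: ProvenanceGateFields | null | undefined): boolean {
  if (provenance == null) return false; // no AI-drafted content on this aggregate
  return provenance.model != null && provenance.approverUserId == null;
}
```

A publish use case would evaluate this for every aggregate in the bundle and refuse to publish if any of them returns true.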

6. HITL approval flow

Author opens "AI suggest palette"
        │
        ▼
POST /themes/:id/ai-suggest-palette
        │
        ▼ draftedTokens + provenance persisted in palette_suggestions (status='pending')
        │
        ▼ hitlTaskUrl returned
        │
        ▼
HITL approver opens task in backoffice
        │
        ▼ reviews drafted tokens; can edit before approving
        │
        ▼
POST /themes/:id/ai-suggest-palette/:suggestionId/apply
        │   { themeVersionId, approverNote }
        ▼
ApplyAiPaletteSuggestionUseCase
        │   merges tokens onto target draft
        │   attaches aiProvenance with approverUserId + approvedAt
        │   marks suggestion status='approved'
        ▼
PatchThemeVersionUseCase emits theme.draft_updated.v1

Reject path: POST .../reject sets status='rejected'; suggestion is retained for audit but never applied.

Eligibility: only roles with theme:approve_ai (typically tenant_admin, brand_owner) can call apply. The author of the suggestion cannot approve it (separation of duties).
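A minimal sketch of the eligibility rules, assuming the actor's permissions are available as a list of strings and the suggestion records its author. The Actor and PendingSuggestion shapes and the checkApplyEligibility name are hypothetical; the permission name and error codes are the ones used elsewhere in this document.

```typescript
interface Actor {
  userId: string;
  permissions: string[]; // assumed representation of the actor's granted permissions
}

interface PendingSuggestion {
  authorUserId: string;
}

type ApplyDecision =
  | { ok: true }
  | { ok: false; status: 403; code: string };

function checkApplyEligibility(actor: Actor, suggestion: PendingSuggestion): ApplyDecision {
  // Rule 1: the actor must hold theme:approve_ai.
  if (actor.permissions.indexOf("theme:approve_ai") === -1) {
    return { ok: false, status: 403, code: "MELMASTOON.AI.HITL_INELIGIBLE_APPROVER" };
  }
  // Rule 2: separation of duties, the author cannot approve their own suggestion.
  if (actor.userId === suggestion.authorUserId) {
    return { ok: false, status: 403, code: "MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN" };
  }
  return { ok: true };
}
```

Checking the permission before the separation-of-duties rule means an author without theme:approve_ai sees the ineligibility error first; the ordering is a design choice of this sketch, not something the document prescribes.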


7. Evaluation harness

Per docs/08-ai-architecture.md §10, every AI surface ships with an offline evaluation harness in services/theme-config-service/evals/:

| Eval | Dataset size | Pass criteria | Run frequency |
|---|---|---|---|
| palette.contrast | 200 primary colors × 5 keyword sets | ≥ 99 % of generated palettes pass WCAG AA on every documented pair | per prompt change + nightly |
| palette.harmony | 100 hand-graded palettes | ≥ 90 % graded "acceptable" or higher by 2/3 designers | per prompt change |
| translate.placeholders | 500 ICU strings × 8 target locales | 100 % placeholder parity | per prompt change + nightly |
| translate.fluency | 200 strings × 4 locales (ps, fa, ar, fr), graded | ≥ 95 % "fluent" by native graders (sampling) | weekly |
| content.brand_safety | 50 prompts × 3 sensitive contexts | 0 unsafe outputs | per prompt change |
| content.hallucination | 50 prompts (no real amenity for property) | 0 fabricated amenities | per prompt change |
| contrast.fix.minimum_shift | 200 failing pairs | algorithmic exact match in 100 % of cases | per change to deterministic engine |
| proofread.precision | 200 strings, hand-labelled | ≥ 90 % precision @ recall 0.7 | weekly |

Evals run in CI for the prompt directory; failures block deployment of new prompts.


8. Cost & rate budgets

The orchestrator enforces per-tenant monthly budgets (default $20 / month for AI surfaces; configurable per tenant plan). On budget exhaustion, requests return 429 MELMASTOON.AI.BUDGET_EXCEEDED.

Per-request rate limits (in addition to the API rate limits in API_CONTRACTS §7):

| Surface | Per-tenant rpm | Per-actor rpm |
|---|---|---|
| Palette | 10 | 5 |
| Translations | 30 | 15 (batch up to 200 keys per call) |
| Content draft | 10 | 5 |
| Contrast fix | 60 | 30 |
| Layout recommend | 5 | 5 |
| Proofread | 60 | 30 |
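To make the two-level limit concrete, here is a sketch of a fixed one-minute window limiter that applies the per-tenant and per-actor caps together. The limits are read from the table above as tenant/actor pairs (for example 10 and 5 for Palette). The class name, key format, and the choice of a fixed window are assumptions; this document does not say which algorithm or which component enforces the limits.

```typescript
type Surface = "palette" | "translations" | "contentDraft" | "contrastFix" | "layoutRecommend" | "proofread";

const RATE_LIMITS: Record<Surface, { tenantRpm: number; actorRpm: number }> = {
  palette: { tenantRpm: 10, actorRpm: 5 },
  translations: { tenantRpm: 30, actorRpm: 15 },
  contentDraft: { tenantRpm: 10, actorRpm: 5 },
  contrastFix: { tenantRpm: 60, actorRpm: 30 },
  layoutRecommend: { tenantRpm: 5, actorRpm: 5 },
  proofread: { tenantRpm: 60, actorRpm: 30 },
};

const WINDOW_MS = 60000;

class MinuteWindowLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();

  // Returns true and records the request only if both caps have headroom.
  allow(surface: Surface, tenantId: string, actorId: string, nowMs: number): boolean {
    const limits = RATE_LIMITS[surface];
    const tenantKey = `${surface}|tenant|${tenantId}`;
    const actorKey = `${surface}|actor|${tenantId}|${actorId}`;
    if (this.currentCount(tenantKey, nowMs) >= limits.tenantRpm) return false;
    if (this.currentCount(actorKey, nowMs) >= limits.actorRpm) return false;
    this.increment(tenantKey, nowMs);
    this.increment(actorKey, nowMs);
    return true;
  }

  private currentCount(key: string, nowMs: number): number {
    const entry = this.windows.get(key);
    if (entry === undefined || nowMs - entry.windowStart >= WINDOW_MS) return 0;
    return entry.count;
  }

  private increment(key: string, nowMs: number): void {
    const entry = this.windows.get(key);
    if (entry === undefined || nowMs - entry.windowStart >= WINDOW_MS) {
      this.windows.set(key, { windowStart: nowMs, count: 1 });
    } else {
      entry.count += 1;
    }
  }
}
```

Passing the clock in as nowMs keeps the limiter deterministic in tests. A rejected request is not counted against either cap in this sketch.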

9. Failure modes

| Failure | Symptom | Service behaviour |
|---|---|---|
| Orchestrator unreachable | gRPC UNAVAILABLE | Return 503 MELMASTOON.AI.UNAVAILABLE with Retry-After |
| Output schema invalid | Orchestrator-side validation fails | Return 502 MELMASTOON.AI.OUTPUT_INVALID; suggestion not persisted |
| Output unsafe (safety filter) | Orchestrator returns safety-blocked | Return 422 MELMASTOON.AI.OUTPUT_BLOCKED; suggestion logged for review |
| Budget exhausted | Orchestrator returns 429 | Pass through; surface in UI |
| Actor without theme:approve_ai attempts to apply | Use case checks role | 403 MELMASTOON.AI.HITL_INELIGIBLE_APPROVER |
| Approver is also author | Use case checks separation of duties | 403 MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN |
| Stale suggestion (target draft changed since suggestion was created) | Apply detects version mismatch | 409 MELMASTOON.AI.SUGGESTION_STALE; UI invites refresh |
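The orchestrator-facing and staleness rows above can be expressed as a single mapping from failure kind to HTTP response. In this sketch the AiFailure union, the toHttpError name, and the default Retry-After of 30 seconds are assumptions; the status codes and error codes are taken from this document.

```typescript
type AiFailure =
  | "ORCHESTRATOR_UNAVAILABLE"
  | "OUTPUT_INVALID"
  | "OUTPUT_BLOCKED"
  | "BUDGET_EXCEEDED"
  | "SUGGESTION_STALE";

interface HttpError {
  status: number;
  code: string;
  headers: Record<string, string>;
}

// retryAfterSeconds is only used for the 503 case; 30 is a placeholder default.
function toHttpError(failure: AiFailure, retryAfterSeconds = 30): HttpError {
  switch (failure) {
    case "ORCHESTRATOR_UNAVAILABLE":
      return {
        status: 503,
        code: "MELMASTOON.AI.UNAVAILABLE",
        headers: { "Retry-After": String(retryAfterSeconds) },
      };
    case "OUTPUT_INVALID":
      return { status: 502, code: "MELMASTOON.AI.OUTPUT_INVALID", headers: {} };
    case "OUTPUT_BLOCKED":
      return { status: 422, code: "MELMASTOON.AI.OUTPUT_BLOCKED", headers: {} };
    case "BUDGET_EXCEEDED":
      return { status: 429, code: "MELMASTOON.AI.BUDGET_EXCEEDED", headers: {} };
    case "SUGGESTION_STALE":
      return { status: 409, code: "MELMASTOON.AI.SUGGESTION_STALE", headers: {} };
  }
}
```

Because the switch covers every member of the union, the TypeScript compiler flags any failure kind added later without a mapping.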

10. Auditability & opt-out

  • Every AI request and every HITL decision is logged to audit-service via theme.ai.suggestion_created.v1 / …approved.v1 / …rejected.v1 (sibling stream of the main theme events).
  • Tenants can opt out of AI assistance at the tenant config level (tenant.config.featureFlags.ai_assistance = false); the AI endpoints then return 403 MELMASTOON.AI.DISABLED_FOR_TENANT.
  • No AI-generated content is automatically included in any published bundle — every byte that ships through the CDN passed a human review.

11. References