AI_INTEGRATION — theme-config-service
Sibling: APPLICATION_LOGIC · DOMAIN_MODEL · SECURITY_MODEL
Platform anchors:
docs/08-ai-architecture.md
theme-config-service is an AI-assisted authoring service but never an AI-autonomous service. Every AI surface is HITL-gated: the model drafts; a human approves; the system applies. This document specifies the integration contract, prompt structure, model routing, redaction, provenance, evaluation, and rollback behaviour.
1. AI surfaces
| Surface | Purpose | Trigger | Output | Auto-applied? |
|---|---|---|---|---|
| Palette suggestion | Derive secondary/accent/status tokens from one primary color + brand keywords | POST /themes/:id/ai-suggest-palette | Partial<ColorTokens> | No (HITL) |
| Translation drafting | Draft missing locale entries from the default-locale source | POST /themes/:id/ai-draft-translations | Map<key, translatedText> | No (HITL) |
| Content drafting | Draft a content block in a target locale (e.g. "About us" in ar-SA from en-US source) | POST /content-blocks/:id/ai-draft | MarkupEntry | No (HITL) |
| Contrast remediation suggestion | When a token pair fails WCAG AA, propose an adjusted hex that passes | Inline editor button | Partial<ColorTokens> | No (HITL) |
| Layout preset recommendation | Suggest a layout preset based on tenant's property type + photos | Backoffice "Suggest layout" affordance | LayoutSelections | No (HITL) |
| Copy proofreading | Detect typos, tone inconsistencies, and brand-voice drift in copy strings | Inline editor lint | Suggestion[] | No (HITL) |
There is no streaming AI surface in this service — all calls are request/response and bounded by an orchestrator-side budget.
2. AI orchestration
All AI calls are routed through ai-orchestrator-service. theme-config-service does not hold model API keys; it speaks only to the orchestrator over mTLS using its workload identity:
interface AIClient {
  suggestPalette(input: SuggestPaletteInput): Promise<{ tokens: Partial<ColorTokens>; provenance: AIProvenance }>;
  draftTranslations(input: DraftTranslationsInput): Promise<{ entries: ReadonlyMap<string, string>; provenance: AIProvenance }>;
  draftContent(input: DraftContentInput): Promise<{ markup: MarkupEntry; provenance: AIProvenance }>;
  suggestContrastFix(input: SuggestContrastFixInput): Promise<{ tokens: Partial<ColorTokens>; provenance: AIProvenance }>;
  recommendLayout(input: RecommendLayoutInput): Promise<{ selections: Partial<LayoutSelections>; provenance: AIProvenance }>;
  proofreadCopy(input: ProofreadCopyInput): Promise<{ suggestions: ProofreadSuggestion[]; provenance: AIProvenance }>;
}
The orchestrator chooses the model per docs/08-ai-architecture.md §5:
| Surface | Default model | Fallback | Rationale |
|---|---|---|---|
| Palette suggestion | claude-3-5-sonnet (latest) | gpt-4.1-mini | Color reasoning is design-heavy; Sonnet's tool-use is reliable for emitting strict JSON |
| Translation drafting (high-resource locales: en, fr, ar, fa) | gpt-4.1-mini | gemini-2.0-flash | Cost-optimized; we still HITL |
| Translation drafting (low-resource: ps-AF, dialectal Dari) | claude-3-5-sonnet | local Pashto MT model on Vertex | Quality-critical for our home market |
| Content drafting | claude-3-5-sonnet | gpt-4.1 | Long-form quality + brand-safe output |
| Contrast remediation | deterministic algorithm + gpt-4.1-mini for naming | none | Algorithm is exact; LLM only names the new shade |
| Layout recommendation | gemini-2.0-pro (vision) | claude-3-5-sonnet (vision) | Vision input from property photos |
| Copy proofreading | gpt-4.1-mini | claude-3-5-haiku | Cheap, high-throughput |
Routing is the orchestrator's responsibility; this service treats it as a black box.
3. Prompt + response shapes
All prompts are versioned in services/theme-config-service/prompts/ and referenced by the orchestrator via a promptId + promptVersion envelope. Example shapes follow.
3.1 Palette suggestion
System prompt (id: theme.palette.system, v3):
You are a brand designer for a multi-tenant hotel SaaS. Given a primary brand color and optional brand keywords, derive a complete tenant palette that:
- Passes WCAG 2.1 AA contrast on every pair documented in the schema.
- Respects warm/cool harmony with the primary.
- Uses sober status colors (success / warning / error / info) that hotel guests will read as neutral and trustworthy.
- Outputs only the JSON object that conforms to the provided schema. Do not explain.
User prompt template:
Primary color: {{primaryColor}}
Brand keywords (optional): {{brandKeywordsJoined}}
Tenant market context (optional, for cultural color reading): {{marketContext}}
Schema: {{schemaSnippet}}
Response constraint: the orchestrator forces JSON-mode (response_format: json_schema) against theme.palette.suggestion.schema.v1.json. Any non-conforming response is rejected at the orchestrator and surfaced as MELMASTOON.AI.OUTPUT_INVALID.
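To make the template mechanics concrete, the following is a minimal sketch of how the palette user prompt could be rendered before it is handed to the orchestrator. The helper `renderPrompt`, the `PromptVars` type, and the sample values are illustrative assumptions, not the service's actual renderer; the sketch only handles simple `{{name}}` slots, not block helpers such as `{{#each}}`.

```typescript
// Hypothetical helper (not part of the documented API): fills {{name}} slots.
// Optional variables that are missing render as an empty string.
type PromptVars = Record<string, string | undefined>;

function renderPrompt(template: string, vars: PromptVars): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => vars[name] ?? "");
}

// The user prompt template from this section, one line per field.
const paletteUserTemplate = [
  "Primary color: {{primaryColor}}",
  "Brand keywords (optional): {{brandKeywordsJoined}}",
  "Tenant market context (optional, for cultural color reading): {{marketContext}}",
  "Schema: {{schemaSnippet}}",
].join("\n");

// Sample values are made up for illustration; marketContext is omitted on purpose.
const renderedPalettePrompt = renderPrompt(paletteUserTemplate, {
  primaryColor: "#0E7C86",
  brandKeywordsJoined: "calm, heritage",
  schemaSnippet: "<schema goes here>",
});
```

Rendering missing optional fields as empty strings keeps the prompt shape stable across tenants, which matters if the rendered prompt is hashed for provenance.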
3.2 Translation drafting
System prompt (id: theme.translate.system, v4):
You are translating UI strings for a hotel booking flow. Preserve ICU placeholders ({name}, {count, plural, one {…} other {…}}) exactly. Match the formality the hospitality industry uses in the target locale. Do not add commentary. Output only the JSON map of key → translated string.
User prompt template:
Source locale: {{sourceLocale}}
Target locale: {{targetLocale}}
Brand voice: {{brandVoiceShort}}
Hotel name: {{hotelName}}
Entries:
{{#each keys}}
- key: {{this.key}}
  source: {{this.sourceText}}
  context: {{this.context}}
{{/each}}
Response is JSON { "<key>": "<translated>" }. Validation: every input key must be present; placeholder set must equal source.
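The two validation rules above (key completeness and placeholder parity) can be sketched as follows. This is an illustrative simplification, not the service's validator: `topLevelPlaceholders` and `validateDraft` are hypothetical names, only top-level ICU argument names are compared (arguments nested inside plural or select branches are ignored), and ICU apostrophe quoting is not handled.

```typescript
// Collects the names of top-level ICU arguments, e.g. "name" and "count".
function topLevelPlaceholders(message: string): Set<string> {
  const names = new Set<string>();
  let depth = 0;
  for (let i = 0; i < message.length; i++) {
    const ch = message.charAt(i);
    if (ch === "{") {
      if (depth === 0) {
        // The argument name runs up to the first "," or "}".
        const rest = message.slice(i + 1);
        const end = rest.search(/[,}]/);
        const name = end > 0 ? rest.slice(0, end).trim() : "";
        if (name !== "") names.add(name);
      }
      depth++;
    } else if (ch === "}") {
      depth = Math.max(0, depth - 1);
    }
  }
  return names;
}

// Returns one error string per violated rule; an empty array means the draft is valid.
function validateDraft(
  source: ReadonlyMap<string, string>,
  draft: Readonly<Record<string, string | undefined>>,
): string[] {
  const errors: string[] = [];
  source.forEach((sourceText, key) => {
    const translated = draft[key];
    if (translated === undefined) {
      errors.push(`missing key: ${key}`);
      return;
    }
    const expected = topLevelPlaceholders(sourceText);
    const actual = topLevelPlaceholders(translated);
    const same =
      expected.size === actual.size && Array.from(expected).every((n) => actual.has(n));
    if (!same) errors.push(`placeholder mismatch: ${key}`);
  });
  return errors;
}
```

A production validator would more plausibly parse messages with a real ICU MessageFormat parser so that nested arguments and quoting are covered.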
3.3 Content drafting (e.g. about-us in target locale)
System prompt enforces brand-voice + safety rails (no medical claims, no exaggerated guarantees). User prompt carries property context (name, location, amenities, source markup). Output is MarkupEntry; we sanitise post-hoc through the same dompurify allow-list used for human-authored markup.
3.4 Contrast remediation
The deterministic component computes the minimum L* shift on the failing token to reach AA, in the direction (lighter or darker) that preserves the hue best. The LLM only produces a human-readable name like "Deeper Hazara Turquoise" for the new shade.
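For reference, the pass/fail target that the deterministic component works toward is the WCAG 2.1 contrast ratio. The sketch below shows only that check, using the relative-luminance formula from WCAG 2.1; it does not implement the minimum L* shift search itself, and the function names are illustrative rather than taken from the service.

```typescript
// Linearised sRGB channel per the WCAG 2.1 relative-luminance definition.
function linearChannel(hex: string, start: number): number {
  const c = parseInt(hex.slice(start, start + 2), 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color, in the range 0 (black) to 1 (white).
function relativeLuminance(hex: string): number {
  if (!/^#[0-9a-f]{6}$/i.test(hex)) throw new Error(`expected #rrggbb, got ${hex}`);
  return (
    0.2126 * linearChannel(hex, 1) +
    0.7152 * linearChannel(hex, 3) +
    0.0722 * linearChannel(hex, 5)
  );
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// WCAG AA requires 4.5:1 for normal text and 3:1 for large text.
function passesAA(foreground: string, background: string, largeText = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

A remediation engine along the lines described above would adjust the failing token's lightness step by step and stop at the first value for which `passesAA` returns true.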
4. Redaction & data minimisation
Per docs/08-ai-architecture.md §7, only the minimum necessary payload crosses the model trust boundary:
| Surface | Sent to model | Never sent to model |
|---|---|---|
| Palette suggestion | Primary color, brand keywords, market context (e.g. AF) | Tenant ID, tenant name, user ID, IP, full revenue / booking data |
| Translation drafting | Locale codes, source strings, brand voice short, hotel name | Tenant ID, guest data, booking data |
| Content drafting | Property name, location label, amenity short list, source markup | Guest data, internal pricing, revenue |
| Layout recommendation | Property type label, ≤ 6 hero photos | Personally identifying info in photos (faces auto-blurred by file-storage-service before send) |
| Copy proofreading | Locale, copy strings, brand voice short | Tenant ID, user ID, surrounding business data |
The orchestrator additionally:
- Strips request headers (no Authorization, no X-Tenant-Id reach the model).
- Replaces tenant identifiers with stable opaque tokens for the request span.
- Logs a redacted prompt hash for audit; the raw prompt is encrypted and held for 30 days for incident response only.
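The data-minimisation rule for the palette surface can be sketched as an allow-list mapping. The types `PaletteRequestContext` and `PaletteModelPayload` and the function `toPaletteModelPayload` are hypothetical, introduced only for illustration; the field split follows the table above.

```typescript
// Hypothetical request context held by theme-config-service.
interface PaletteRequestContext {
  tenantId: string;
  tenantName: string;
  userId: string;
  clientIp: string;
  primaryColor: string;
  brandKeywords: string[];
  marketContext?: string;
}

// Only these fields may cross the model trust boundary for palette suggestion.
interface PaletteModelPayload {
  primaryColor: string;
  brandKeywords: string[];
  marketContext?: string;
}

function toPaletteModelPayload(ctx: PaletteRequestContext): PaletteModelPayload {
  // Build the payload field by field (allow-list) instead of deleting sensitive
  // fields from a copy, so any field added to the context later stays private
  // unless it is explicitly listed here.
  const payload: PaletteModelPayload = {
    primaryColor: ctx.primaryColor,
    brandKeywords: ctx.brandKeywords.slice(),
  };
  if (ctx.marketContext !== undefined) payload.marketContext = ctx.marketContext;
  return payload;
}
```

The allow-list direction is a design choice of this sketch: a deny-list would silently leak any new context field that nobody remembered to exclude.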
5. Provenance
Every aggregate that carries AI-drafted content stores aiProvenance:
interface AIProvenance {
  model: string;            // e.g. 'claude-3-5-sonnet-2026-01'
  modelProvider: 'anthropic' | 'openai' | 'google' | 'self_hosted';
  promptId: string;         // 'theme.palette.system'
  promptVersion: string;    // 'v3'
  promptHash: string;       // sha256 of the rendered prompt (post-redaction)
  responseHash: string;     // sha256 of the raw response
  tokensIn: number;
  tokensOut: number;
  latencyMs: number;
  costUsd: number;          // orchestrator-attributed
  createdAt: ISODate;
  redactionApplied: boolean;
  approverUserId?: UserId;  // set on apply
  approvedAt?: ISODate;
  approverNote?: string;
}
This data is:
- Persisted on the aggregate (theme_versions.ai_provenance, content_blocks.ai_provenance, locale_packs.ai_provenance).
- Echoed in the published bundle's meta.aiProvenance for downstream auditability.
- Surfaced in theme.published.v1.aiProvenance for the audit service.
- Used by HITL gating: any aggregate with non-null aiProvenance.model and null approverUserId blocks publish.
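The HITL gating rule in the last bullet can be expressed as a small predicate. This is a sketch under assumptions: `ProvenanceGateFields`, `AiDraftedAggregate`, `blocksPublish`, and `unapprovedAggregates` are hypothetical names, and only the two provenance fields the rule depends on are modelled.

```typescript
// The subset of provenance fields the publish gate needs (hypothetical shape).
interface ProvenanceGateFields {
  model: string | null;
  approverUserId?: string | null;
}

interface AiDraftedAggregate {
  id: string;
  aiProvenance: ProvenanceGateFields | null;
}

// True when the aggregate carries AI-drafted content that no human has approved.
function blocksPublish(aggregate: AiDraftedAggregate): boolean {
  const p = aggregate.aiProvenance;
  if (p === null || p.model === null) return false; // no AI-drafted content
  return p.approverUserId === undefined || p.approverUserId === null;
}

// Ids of every aggregate that would currently block a publish.
function unapprovedAggregates(all: AiDraftedAggregate[]): string[] {
  return all.filter(blocksPublish).map((a) => a.id);
}
```

A publish use case could call `unapprovedAggregates` over the theme version, content blocks, and locale packs in the bundle and refuse to proceed when the result is non-empty.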
6. HITL approval flow
Author opens "AI suggest palette"
│
▼
POST /themes/:id/ai-suggest-palette
│
▼ draftedTokens + provenance persisted in palette_suggestions (status='pending')
│
▼ hitlTaskUrl returned
│
▼
HITL approver opens task in backoffice
│
▼ reviews drafted tokens; can edit before approving
│
▼
POST /themes/:id/ai-suggest-palette/:suggestionId/apply
│ { themeVersionId, approverNote }
▼
ApplyAiPaletteSuggestionUseCase
│ merges tokens onto target draft
│ attaches aiProvenance with approverUserId + approvedAt
│ marks suggestion status='approved'
▼
PatchThemeVersionUseCase emits theme.draft_updated.v1
Reject path: POST .../reject sets status='rejected'; suggestion is retained for audit but never applied.
Eligibility: only roles with theme:approve_ai (typically tenant_admin, brand_owner) can call apply. The author of the suggestion cannot approve it (separation of duties).
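The eligibility rules above can be sketched as a guard that runs before the apply use case merges anything. The `Actor` and `PendingSuggestion` shapes, the `authorUserId` field, and the order of the two checks are assumptions for illustration; the error codes are the ones listed in the failure-mode table in §9.

```typescript
type ApplyDenied =
  | "MELMASTOON.AI.HITL_INELIGIBLE_APPROVER"
  | "MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN";

// Hypothetical shapes; the real aggregates live in DOMAIN_MODEL.
interface Actor {
  userId: string;
  permissions: ReadonlySet<string>;
}

interface PendingSuggestion {
  id: string;
  authorUserId: string;
  status: "pending" | "approved" | "rejected";
}

// Returns null when the actor may apply the suggestion, otherwise the error code.
function checkApplyEligibility(actor: Actor, suggestion: PendingSuggestion): ApplyDenied | null {
  // Role check first: callers without the permission learn nothing about authorship.
  if (!actor.permissions.has("theme:approve_ai")) {
    return "MELMASTOON.AI.HITL_INELIGIBLE_APPROVER";
  }
  // Separation of duties: the author of a suggestion cannot approve it.
  if (actor.userId === suggestion.authorUserId) {
    return "MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN";
  }
  return null;
}
```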
7. Evaluation harness
Per docs/08-ai-architecture.md §10, every AI surface ships with an offline evaluation harness in services/theme-config-service/evals/:
| Eval | Dataset size | Pass criteria | Run frequency |
|---|---|---|---|
| palette.contrast | 200 primary colors × 5 keyword sets | ≥ 99 % of generated palettes pass WCAG AA on every documented pair | per prompt change + nightly |
| palette.harmony | 100 hand-graded palettes | ≥ 90 % graded "acceptable" or higher by 2/3 designers | per prompt change |
| translate.placeholders | 500 ICU strings × 8 target locales | 100 % placeholder parity | per prompt change + nightly |
| translate.fluency | 200 strings × 4 locales (ps, fa, ar, fr), graded | ≥ 95 % "fluent" by native graders (sampling) | weekly |
| content.brand_safety | 50 prompts × 3 sensitive contexts | 0 unsafe outputs | per prompt change |
| content.hallucination | 50 prompts (no real amenity for property) | 0 fabricated amenities | per prompt change |
| contrast.fix.minimum_shift | 200 failing pairs | algorithmic exact match in 100 % of cases | per change to deterministic engine |
| proofread.precision | 200 strings, hand-labelled | ≥ 90 % precision @ recall 0.7 | weekly |
Evals run in CI for the prompt directory; failures block deployment of new prompts.
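As a rough illustration of how a CI gate might turn the pass criteria into a deploy decision, the sketch below checks per-eval pass rates against minimum thresholds. `EvalResult`, `minPassRate`, and `failingEvals` are hypothetical; only evals whose criterion reduces to a simple pass rate are listed, and unknown eval names fail closed.

```typescript
interface EvalResult {
  name: string;
  passed: number; // number of cases meeting the criterion
  total: number;  // number of cases evaluated
}

// Minimum pass rates derived from the table above (subset, for illustration).
const minPassRate: Record<string, number> = {
  "palette.contrast": 0.99,
  "palette.harmony": 0.9,
  "translate.placeholders": 1.0,
  "content.brand_safety": 1.0,  // "0 unsafe outputs" means every case must pass
  "content.hallucination": 1.0, // "0 fabricated amenities" likewise
};

// Names of evals that should block deployment of a new prompt.
function failingEvals(results: EvalResult[]): string[] {
  return results
    .filter((r) => {
      const threshold = minPassRate[r.name];
      // Fail closed: unknown evals and empty runs count as failures.
      if (threshold === undefined || r.total <= 0) return true;
      return r.passed / r.total < threshold;
    })
    .map((r) => r.name);
}
```

A CI step could exit non-zero whenever `failingEvals` returns a non-empty list.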
8. Cost & rate budgets
The orchestrator enforces per-tenant monthly budgets (default $20 / month for AI surfaces; configurable per tenant plan). On budget exhaustion, requests return 429 MELMASTOON.AI.BUDGET_EXCEEDED.
Per-request rate limits (in addition to the API rate limits in API_CONTRACTS §7):
| Surface | Per-tenant rpm | Per-actor rpm |
|---|---|---|
| Palette | 10 | 5 |
| Translations | 30 | 15 (batch up to 200 keys per call) |
| Content draft | 10 | 5 |
| Contrast fix | 60 | 30 |
| Layout recommend | 5 | 5 |
| Proofread | 60 | 30 |
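One way to read the table is as two independent counters per surface, one keyed by tenant and one keyed by actor, where a call must fit under both. The sketch below uses a simple fixed one-minute window; the document does not say which limiting algorithm or which component enforces these limits, so the class, the key layout, and the `Surface` names are all assumptions.

```typescript
type Surface = "palette" | "translations" | "contentDraft" | "contrastFix" | "layoutRecommend" | "proofread";

// Requests per minute, copied from the table above.
const rateLimits: Record<Surface, { perTenantRpm: number; perActorRpm: number }> = {
  palette: { perTenantRpm: 10, perActorRpm: 5 },
  translations: { perTenantRpm: 30, perActorRpm: 15 },
  contentDraft: { perTenantRpm: 10, perActorRpm: 5 },
  contrastFix: { perTenantRpm: 60, perActorRpm: 30 },
  layoutRecommend: { perTenantRpm: 5, perActorRpm: 5 },
  proofread: { perTenantRpm: 60, perActorRpm: 30 },
};

const WINDOW_MS = 60000;

class FixedWindowLimiter {
  private readonly counts = new Map<string, { windowStart: number; count: number }>();

  // Returns true and records the call if both the tenant and actor limits allow it.
  tryAcquire(surface: Surface, tenantId: string, actorId: string, nowMs: number): boolean {
    const limit = rateLimits[surface];
    const tenantKey = `t:${surface}:${tenantId}`;
    const actorKey = `a:${surface}:${tenantId}:${actorId}`;
    // Check both counters before incrementing either, so a rejected call consumes nothing.
    if (this.current(tenantKey, nowMs) >= limit.perTenantRpm) return false;
    if (this.current(actorKey, nowMs) >= limit.perActorRpm) return false;
    this.increment(tenantKey, nowMs);
    this.increment(actorKey, nowMs);
    return true;
  }

  private current(key: string, nowMs: number): number {
    const entry = this.counts.get(key);
    return entry !== undefined && nowMs - entry.windowStart < WINDOW_MS ? entry.count : 0;
  }

  private increment(key: string, nowMs: number): void {
    const entry = this.counts.get(key);
    if (entry !== undefined && nowMs - entry.windowStart < WINDOW_MS) entry.count += 1;
    else this.counts.set(key, { windowStart: nowMs, count: 1 });
  }
}
```

Passing the clock in as `nowMs` keeps the limiter deterministic under test; a sliding window or token bucket would smooth the burst a fixed window allows at window boundaries.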
9. Failure modes
| Failure | Symptom | Service behaviour |
|---|---|---|
| Orchestrator unreachable | gRPC UNAVAILABLE | Return 503 MELMASTOON.AI.UNAVAILABLE with Retry-After |
| Output schema invalid | Orchestrator-side validation fails | Return 502 MELMASTOON.AI.OUTPUT_INVALID; suggestion not persisted |
| Output unsafe (safety filter) | Orchestrator returns safety-blocked | Return 422 MELMASTOON.AI.OUTPUT_BLOCKED; suggestion logged for review |
| Budget exhausted | Orchestrator returns 429 | Pass through; surface in UI |
| Actor without theme:approve_ai attempts to apply | Use case checks role | 403 MELMASTOON.AI.HITL_INELIGIBLE_APPROVER |
| Approver is also author | Use case checks separation of duties | 403 MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN |
| Stale suggestion (target draft changed since suggestion was created) | Apply detects version mismatch | 409 MELMASTOON.AI.SUGGESTION_STALE; UI invites refresh |
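The table above maps naturally onto a typed lookup, which keeps status codes and error codes in one place. The `AiFailure` discriminants, the `toHttpResponse` helper, and the 30-second `Retry-After` default are illustrative assumptions; the status and error codes themselves are taken from the table.

```typescript
type AiFailure =
  | "orchestrator_unreachable"
  | "output_invalid"
  | "output_blocked"
  | "budget_exceeded"
  | "ineligible_approver"
  | "same_actor"
  | "suggestion_stale";

const failureResponses: Record<AiFailure, { status: number; code: string }> = {
  orchestrator_unreachable: { status: 503, code: "MELMASTOON.AI.UNAVAILABLE" },
  output_invalid: { status: 502, code: "MELMASTOON.AI.OUTPUT_INVALID" },
  output_blocked: { status: 422, code: "MELMASTOON.AI.OUTPUT_BLOCKED" },
  budget_exceeded: { status: 429, code: "MELMASTOON.AI.BUDGET_EXCEEDED" },
  ineligible_approver: { status: 403, code: "MELMASTOON.AI.HITL_INELIGIBLE_APPROVER" },
  same_actor: { status: 403, code: "MELMASTOON.AI.HITL_SAME_ACTOR_FORBIDDEN" },
  suggestion_stale: { status: 409, code: "MELMASTOON.AI.SUGGESTION_STALE" },
};

interface HttpErrorResponse {
  status: number;
  headers: Record<string, string>;
  body: { code: string };
}

// retryAfterSeconds is a made-up default; the document only says Retry-After is sent on 503.
function toHttpResponse(failure: AiFailure, retryAfterSeconds = 30): HttpErrorResponse {
  const { status, code } = failureResponses[failure];
  const headers: Record<string, string> = {};
  if (failure === "orchestrator_unreachable") headers["Retry-After"] = String(retryAfterSeconds);
  return { status, headers, body: { code } };
}
```

Because `failureResponses` is typed as `Record<AiFailure, …>`, adding a new failure kind without a mapping becomes a compile-time error.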
10. Auditability & opt-out
- Every AI request and every HITL decision is logged to audit-service via theme.ai.suggestion_created.v1 / …approved.v1 / …rejected.v1 (sibling stream of the main theme events).
- Tenants can opt out of AI assistance at the tenant config level (tenant.config.featureFlags.ai_assistance = false); the AI endpoints then return 403 MELMASTOON.AI.DISABLED_FOR_TENANT.
- No AI-generated content is automatically included in any published bundle: every byte that ships through the CDN has passed a human review.
11. References
- Platform AI architecture: docs/08-ai-architecture.md
- Use cases: APPLICATION_LOGIC §2.11–2.12
- Provenance shape: DOMAIN_MODEL §6.2
- HITL RBAC: SECURITY_MODEL