ai-orchestrator-service — Domain Model
Companion to: SERVICE_OVERVIEW.md · APPLICATION_LOGIC.md · DATA_MODEL.md · Canonical AI thesis: 08 AI Architecture
The domain layer must not import NestJS, Drizzle, the Pub/Sub client, fetch, or any provider SDK. All types below live under src/domain/. Branded IDs follow @ghasi/domain-primitives conventions.
1. Aggregates
| Aggregate | Identity | Invariants | Lifecycle |
|---|---|---|---|
Capability | cap_<ulid> | Bound to exactly one active PromptVersion; declared default Model exists in catalog; fallback chain non-empty; HITL config present (even if disabled) | draft → active → deprecated → retired |
Prompt | prm_<ulid> | One active PromptVersion per (domain, ordinal); domain is one of the canonical domain codes | created with first draft version |
PromptVersion | pmv_<ulid> | Immutable once created; references its EvalSuite; output_schema is valid JSON Schema; never overwrites prior versions | draft → active → deprecated → retired |
Model | mdl_<ulid> | Provider × model unique; modality + cost class + latency class declared; deprecation requires migration plan | available → deprecated → retired |
ModelDeployment | mdp_<ulid> | Per-region; traffic share ∈ [0, 100]%; sum of traffic shares per (model, region) ≤ 100%; target latency declared | pending → active → draining → retired |
Provider | prv_<ulid> | One row per `('vertex' | 'anthropic' |
InferenceRequest | ifr_<ulid> | Carries tenantId, capabilityId, inputHash (PII-redacted); never stores raw user content unless redacted | append-only |
InferenceResult | ifs_<ulid> | 1:1 with InferenceRequest; carries provenanceId and structured output | append-only |
Provenance | prv_p_<ulid> | All fields populated; cacheHit implies tokensIn = 0, tokensOut = 0, costUsd = 0; local: true implies provider = onnx-edge | append-only |
EvalSuite | eva_<ulid> | Versioned in companion repo melmastoon-ai-evals; non-empty golden set; declared scoring rubric | created → updated (versioned) |
EvalRun | evr_<ulid> | Bound to (EvalSuite, PromptVersion, Model); emits eval.run_completed on finish; scores within rubric range | queued → running → completed or failed |
RAGCorpus | rag_<ulid> | Tenant-scoped namespace; declared chunk strategy; declared embedding model | provisioning → active → archived |
Embedding | composite (corpusId, chunkId) | 768-dim (cloud) or 384-dim (edge); orphaned chunks flagged; tenant_id matches corpus | append + replace-on-reingest |
BudgetCounter | bdg_<ulid> | One per (tenantId, scope, periodKey); tokensUsed ≤ tokensCap enforced at increment time; soft-cap warning fired exactly once per period | reset at period boundary |
HitlGate | hgt_<ulid> | Open until decided or timed_out; SLA timer non-negative; reviewer role asserted from iam-service claims | open → decided → closed or open → timed_out → closed |
HitlDecision | dec_<ulid> | Linked 1:1 with a HitlGate; outcome ∈ {accepted, modified, rejected}; rejection requires justification | append-only |
EdgeModelManifest | emm_<ulid> | Signed by KMS; non-empty model list; SHA-256 per model; semver bumped on every change | draft → published → superseded |
2. Value Objects
import type { Branded } from '@ghasi/domain-primitives';
export type CapabilityId = Branded<string, 'CapabilityId'>; // 'cap_01H...'
export type PromptId = Branded<string, 'PromptId'>; // 'prm_01H...'
export type PromptVersionId = Branded<string, 'PromptVersionId'>; // 'pmv_01H...'
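export type ModelId = Branded<string, 'ModelId'>; // 'mdl_01H...'
export type ModelDeploymentId = Branded<string, 'ModelDeploymentId'>; // 'mdp_01H...' (used by ModelDeployment in §5)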
export type ProviderId = Branded<string, 'ProviderId'>; // 'prv_01H...'
export type InferenceRequestId = Branded<string, 'InferenceRequestId'>; // 'ifr_01H...'
export type InferenceResultId = Branded<string, 'InferenceResultId'>; // 'ifs_01H...'
export type ProvenanceId = Branded<string, 'ProvenanceId'>; // 'prv_p_01H...'
export type EvalSuiteId = Branded<string, 'EvalSuiteId'>; // 'eva_01H...'
export type EvalRunId = Branded<string, 'EvalRunId'>; // 'evr_01H...'
export type RAGCorpusId = Branded<string, 'RAGCorpusId'>; // 'rag_01H...'
export type BudgetCounterId = Branded<string, 'BudgetCounterId'>; // 'bdg_01H...'
export type HitlGateId = Branded<string, 'HitlGateId'>; // 'hgt_01H...'
export type HitlDecisionId = Branded<string, 'HitlDecisionId'>; // 'dec_01H...'
export type EdgeManifestId = Branded<string, 'EdgeManifestId'>; // 'emm_01H...'
export type ProviderName = 'vertex' | 'anthropic' | 'openai' | 'onnx-edge';
export type Modality = 'llm' | 'embedding' | 'vision' | 'speech' | 'classifier';
export type LatencyClass = 'very_low' | 'low' | 'medium' | 'high'; // mapped to numeric p95 budget
export type CostClass = 'free' | 'very_low' | 'low' | 'medium' | 'high';
export type CapabilityStatus = 'draft' | 'active' | 'deprecated' | 'retired';
export type PromptVersionStatus = 'draft' | 'active' | 'deprecated' | 'retired';
export type ModelStatus = 'available' | 'deprecated' | 'retired';
export type ProviderHealth = 'healthy' | 'degraded' | 'unhealthy' | 'recovering';
export type HitlGateStatus = 'open' | 'decided' | 'timed_out' | 'closed';
export type HitlOutcome = 'accepted' | 'modified' | 'rejected';
export type SafetyVerdict = 'pass' | 'flag_low' | 'flag_high' | 'block';
export type DegradationReason = 'budget_hard_cap' | 'all_providers_unhealthy' | 'edge_signature_invalid' | 'moderation_block' | 'schema_invalid';
export interface PromptDomain { // canonical codes
readonly value:
| 'PRICING' | 'HK' | 'STAFF' | 'ANOMALY' | 'UPSELL'
| 'MSG' | 'REVIEW' | 'BOOKING' | 'TUTOR' | 'DESC'
| 'TRANSLATE' | 'OCR' | 'STT' | 'VISION';
}
export interface PromptIdentity { // 'PRMP_PRICING_001_v3' canonical form
readonly domain: PromptDomain['value'];
readonly ordinal: number; // zero-padded to 3 in display
readonly version: number; // semver-major
}
export interface ModelRef { // human-readable, used in provenance
readonly provider: ProviderName;
readonly name: string; // 'gemini-1.5-flash' / 'phi-3-mini-4k-instruct' / 'all-MiniLM-L6-v2'
readonly version?: string;
}
export interface FallbackChain {
readonly steps: readonly ModelRef[]; // ordered; first = default; last = deterministic fallback
}
export interface HitlConfig {
readonly required: boolean;
readonly trigger?: // when `required`, when does the gate open?
| { kind: 'always' }
| { kind: 'threshold'; field: string; comparator: 'gt' | 'lt' | 'gte' | 'lte' | 'eq'; value: number }
| { kind: 'risk_score'; min: number };
readonly slaSeconds: number; // gate auto-closes on timeout
readonly defaultOnTimeout: HitlOutcome; // 'rejected' is the conservative default
readonly reviewerRoles: readonly string[]; // RBAC role names from tenant-service
}
export interface CostUsd { // money in micros (1 USD = 1_000_000 micros)
readonly micros: number; // never floats
}
export interface TokenCount {
readonly input: number;
readonly output: number;
}
export interface InputHash { // SHA-256 of redacted input + capability + promptVersion + tenantId
readonly value: string; // 'sha256:...'
}
export interface AIProvenance { // mirrors docs/08 §6
readonly id: ProvenanceId;
readonly promptId: PromptVersionId;
readonly promptVersion: number;
readonly model: ModelRef;
readonly traceId: string; // W3C traceparent
readonly occurredAt: string; // ISO-8601
readonly tokens: TokenCount;
readonly cost: CostUsd;
readonly local: boolean;
readonly cacheHit: boolean;
readonly safety: { input: SafetyVerdict; output: SafetyVerdict };
readonly reviewedBy?: string; // usr_...
readonly reviewedAt?: string;
readonly decision?: HitlOutcome;
}
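The canonical form noted on PromptIdentity can be derived mechanically from its three fields. The following is a minimal sketch, assuming the layout is PRMP_<domain>_<ordinal padded to 3 digits>_v<version> as in the 'PRMP_PRICING_001_v3' example. The helper names formatCanonicalCode and parseCanonicalCode are hypothetical, and the identity shape is restated locally (with domain widened to string) so the snippet stands alone.

```typescript
// Local restatement of PromptIdentity; `domain` is widened to string for brevity.
interface PromptIdentitySketch {
  readonly domain: string;
  readonly ordinal: number;
  readonly version: number;
}

// Hypothetical helper: renders e.g. 'PRMP_PRICING_001_v3' from its parts.
function formatCanonicalCode(id: PromptIdentitySketch): string {
  const ordinal = String(id.ordinal).padStart(3, '0');
  return `PRMP_${id.domain}_${ordinal}_v${id.version}`;
}

// Hypothetical inverse: returns null when the input is not in canonical form.
function parseCanonicalCode(code: string): PromptIdentitySketch | null {
  const match = /^PRMP_([A-Z]+)_(\d{3,})_v(\d+)$/.exec(code);
  if (match === null) return null;
  return {
    domain: match[1],
    ordinal: Number(match[2]),
    version: Number(match[3]),
  };
}
```

Keeping the code derived (rather than stored independently) avoids drift between PromptVersion.canonicalCode and the fields it is built from.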
3. Aggregate — Capability
export interface Capability {
readonly id: CapabilityId;
readonly key: string; // 'pricing.suggest'
readonly displayName: string;
readonly status: CapabilityStatus;
readonly defaultPromptVersionId?: PromptVersionId; // unset until first activation
readonly defaultModel: ModelRef;
readonly fallbackChain: FallbackChain;
readonly latencyClass: LatencyClass;
readonly costClass: CostClass;
readonly outputSchemaJson: unknown; // JSON Schema
readonly hitl: HitlConfig;
readonly evalSuiteId: EvalSuiteId;
readonly cacheTtlSeconds: number | null; // null disables cache
readonly tenantOptOutAllowed: boolean; // some capabilities are mandatory (e.g., moderation)
readonly version: number; // optimistic concurrency
readonly createdAt: string;
readonly updatedAt: string;
}
Invariants:
- status === 'active' ⇒ defaultPromptVersionId is non-null AND points to a PromptVersion whose status === 'active'.
- fallbackChain.steps includes a deterministic terminal step (cost = free) for hard-cap degradation.
- outputSchemaJson MUST validate the empty object case (used by deterministic fallback when no structured output is required).
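The three Capability invariants can be checked by a pure function that receives its lookups as parameters, which keeps the domain layer free of catalog or JSON Schema library imports. This is a sketch under stated assumptions: the CapabilityChecks port and the function name capabilityViolations are hypothetical, and the shapes are trimmed local copies of the interface above.

```typescript
type CostClassSketch = 'free' | 'very_low' | 'low' | 'medium' | 'high';

interface ModelRefSketch {
  readonly provider: string;
  readonly name: string;
}

interface CapabilitySketch {
  readonly status: 'draft' | 'active' | 'deprecated' | 'retired';
  readonly defaultPromptVersionId?: string;
  readonly fallbackChain: { readonly steps: readonly ModelRefSketch[] };
  readonly outputSchemaJson: unknown;
}

// Hypothetical ports, implemented outside the domain layer.
interface CapabilityChecks {
  promptVersionStatus(id: string): string | undefined;
  costClassOf(ref: ModelRefSketch): CostClassSketch | undefined;
  schemaAcceptsEmptyObject(schema: unknown): boolean;
}

// Returns one message per violated invariant; an empty array means the capability is valid.
function capabilityViolations(cap: CapabilitySketch, checks: CapabilityChecks): string[] {
  const violations: string[] = [];
  if (cap.status === 'active') {
    const id = cap.defaultPromptVersionId;
    if (id === undefined || checks.promptVersionStatus(id) !== 'active') {
      violations.push('active capability needs an active default PromptVersion');
    }
  }
  const steps = cap.fallbackChain.steps;
  const last = steps.length > 0 ? steps[steps.length - 1] : undefined;
  if (last === undefined || checks.costClassOf(last) !== 'free') {
    violations.push('fallback chain must end in a free deterministic step');
  }
  if (!checks.schemaAcceptsEmptyObject(cap.outputSchemaJson)) {
    violations.push('output schema must accept the empty object');
  }
  return violations;
}
```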
4. Aggregate — Prompt and PromptVersion
export interface Prompt {
readonly id: PromptId;
readonly domain: PromptDomain['value'];
readonly ordinal: number;
readonly displayName: string;
readonly ownerUserId: string; // 'usr_...'
readonly capabilityKey: string; // 'pricing.suggest'
readonly activeVersionId: PromptVersionId | null;
readonly createdAt: string;
}
export interface PromptVersion {
readonly id: PromptVersionId;
readonly promptId: PromptId;
readonly version: number; // increments per draft
readonly canonicalCode: string; // 'PRMP_PRICING_001_v3' — derived
readonly status: PromptVersionStatus;
readonly systemPrompt: string; // assembled centrally; never composed from user input
readonly userTemplate: string; // mustache-style placeholders
readonly outputSchemaJson: unknown; // JSON Schema
readonly defaultModel: ModelRef;
readonly evalSuiteId: EvalSuiteId;
readonly notes?: string;
readonly createdAt: string;
readonly activatedAt?: string;
readonly deprecatedAt?: string; // must be ≥ 14 days before retirement
readonly retiredAt?: string;
}
Lifecycle rules:
- New version is created in draft. Cannot be served by production traffic.
- Promotion draft → active requires green EvalRun against the current active and ≥ 7 days of A/B traffic at 5%.
- The previously active row flips to deprecated. After ≥ 14 days, it can be retired. retired rows are kept for audit but never served.
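The promotion and retirement gates reduce to two date comparisons. The sketch below makes two assumptions that the rules do not state: "A/B traffic at 5%" is read as at least 5%, and a day is exactly 24 hours. The PromotionEvidence shape and both function names are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical summary of the evidence gathered for a draft version.
interface PromotionEvidence {
  readonly evalRunGreen: boolean; // EvalRun against the current active passed
  readonly abTrafficPct: number;  // share of traffic the draft received in A/B
  readonly abStartedAt: string;   // ISO-8601
}

// draft → active: green eval plus at least 7 days of A/B traffic at (assumed: at least) 5%.
function canPromoteToActive(evidence: PromotionEvidence, now: Date): boolean {
  const abDays = (now.getTime() - Date.parse(evidence.abStartedAt)) / DAY_MS;
  return evidence.evalRunGreen && evidence.abTrafficPct >= 5 && abDays >= 7;
}

// deprecated → retired: only after at least 14 days in deprecated.
function canRetire(deprecatedAt: string, now: Date): boolean {
  return (now.getTime() - Date.parse(deprecatedAt)) / DAY_MS >= 14;
}
```

Passing now as a parameter keeps both checks deterministic and easy to test.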
5. Aggregate — Model and ModelDeployment
export interface Model {
readonly id: ModelId;
readonly ref: ModelRef;
readonly modality: Modality;
readonly contextWindowTokens: number | null; // null for non-LLM
readonly costClass: CostClass;
readonly latencyClass: LatencyClass;
readonly status: ModelStatus;
readonly perTokenCostMicrosIn?: number; // billing factor (cloud only)
readonly perTokenCostMicrosOut?: number;
readonly perCallCostMicros?: number; // for vision / OCR / STT pricing
readonly notes?: string;
readonly addedAt: string;
}
export interface ModelDeployment {
readonly id: ModelDeploymentId;
readonly modelId: ModelId;
readonly region: string; // 'me-central1' | 'europe-west4' | ...
readonly trafficSharePct: number; // 0..100
readonly status: 'pending' | 'active' | 'draining' | 'retired';
readonly activatedAt?: string;
readonly drainStartedAt?: string;
readonly notes?: string;
}
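The ModelDeployment invariants from §1 (each share within 0..100, and shares per (model, region) summing to at most 100) can be checked over a list of deployments. This is a sketch only: the function name overAllocated is hypothetical, and excluding retired deployments from the sum is an assumption, not something the table states.

```typescript
interface DeploymentShare {
  readonly modelId: string;
  readonly region: string;
  readonly trafficSharePct: number;
  readonly status: 'pending' | 'active' | 'draining' | 'retired';
}

// Returns the 'modelId@region' keys that break either traffic-share rule, sorted.
function overAllocated(deployments: readonly DeploymentShare[]): string[] {
  const totals = new Map<string, number>();
  const bad = new Set<string>();
  for (const d of deployments) {
    const key = `${d.modelId}@${d.region}`;
    if (d.trafficSharePct < 0 || d.trafficSharePct > 100) bad.add(key);
    if (d.status === 'retired') continue; // assumption: retired deployments carry no traffic
    totals.set(key, (totals.get(key) ?? 0) + d.trafficSharePct);
  }
  totals.forEach((total, key) => {
    if (total > 100) bad.add(key);
  });
  return Array.from(bad).sort();
}
```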
6. Aggregate — Provider
export interface Provider {
readonly id: ProviderId;
readonly name: ProviderName;
readonly health: ProviderHealth;
readonly consecutiveErrors: number; // resets on success
readonly lastErrorAt?: string;
readonly lastSuccessAt?: string;
readonly circuitOpenedAt?: string; // when health → 'unhealthy'
readonly probeIntervalMs: number; // half-open probe cadence
readonly notes?: string;
}
Circuit-breaker rules:
- consecutiveErrors >= 5 ⇒ health → unhealthy; circuit opened.
- After probeIntervalMs, the next call is a probe; success → recovering → healthy; failure → reset timer.
- Health changes emit melmastoon.ai_orchestrator.model.deployment_changed.v1 with { provider, healthBefore, healthAfter }.
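The circuit-breaker rules can be expressed as pure transitions on a trimmed Provider state. The sketch below is one possible reading: it assumes a probe success moves unhealthy to recovering and the next success moves recovering to healthy, and it leaves 'degraded' and failures during 'recovering' to the ordinary threshold rule because the rules above do not cover them. Event emission is omitted, and all function names are hypothetical.

```typescript
type HealthSketch = 'healthy' | 'degraded' | 'unhealthy' | 'recovering';

interface ProviderStateSketch {
  readonly health: HealthSketch;
  readonly consecutiveErrors: number;
  readonly circuitOpenedAt?: string; // ISO-8601; doubles as the probe timer start
}

const ERROR_THRESHOLD = 5; // from the first rule

function onProviderFailure(p: ProviderStateSketch, nowIso: string): ProviderStateSketch {
  const consecutiveErrors = p.consecutiveErrors + 1;
  if (p.health === 'unhealthy') {
    // A failed probe restarts the probe timer.
    return { health: 'unhealthy', consecutiveErrors, circuitOpenedAt: nowIso };
  }
  if (consecutiveErrors >= ERROR_THRESHOLD) {
    return { health: 'unhealthy', consecutiveErrors, circuitOpenedAt: nowIso };
  }
  return { health: p.health, consecutiveErrors };
}

function onProviderSuccess(p: ProviderStateSketch): ProviderStateSketch {
  if (p.health === 'unhealthy') return { health: 'recovering', consecutiveErrors: 0 };
  if (p.health === 'recovering') return { health: 'healthy', consecutiveErrors: 0 };
  return { health: p.health, consecutiveErrors: 0 };
}

// True when the circuit is open and the half-open probe interval has elapsed.
function probeDue(p: ProviderStateSketch, probeIntervalMs: number, now: Date): boolean {
  return (
    p.health === 'unhealthy' &&
    p.circuitOpenedAt !== undefined &&
    now.getTime() - Date.parse(p.circuitOpenedAt) >= probeIntervalMs
  );
}
```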
7. Aggregate — InferenceRequest, InferenceResult, Provenance
export interface InferenceRequest {
readonly id: InferenceRequestId;
readonly tenantId: string;
readonly capabilityKey: string;
readonly promptVersionId: PromptVersionId | null; // null when no prompt (e.g., embedding, vision)
readonly inputHash: InputHash;
readonly inputBytes: number; // pre-redaction byte size
readonly redactedInputHash: InputHash; // post-redaction
readonly correlation: { traceId: string; requestId: string };
readonly receivedAt: string;
readonly callerService: string; // 'pricing-service'
readonly callerSurface?: 'consumer' | 'tenant-booking' | 'backoffice';
readonly tenantRegionPin?: string;
}
export interface InferenceResult {
readonly id: InferenceResultId;
readonly requestId: InferenceRequestId;
readonly status: 'completed' | 'failed' | 'fallback_deterministic';
readonly provenanceId: ProvenanceId;
readonly outputJson?: unknown; // schema-validated structured output
readonly errorCode?: string; // 'MELMASTOON.AI.PROVIDER_UNAVAILABLE' etc.
readonly hitlGateId?: HitlGateId;
readonly cached: boolean;
readonly latencyMs: number;
readonly completedAt: string;
}
Provenance is defined in §2 (AIProvenance) and persisted via the provenances table (see DATA_MODEL.md).
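The Provenance invariants listed in §1 (a cache hit carries zero tokens and zero cost; local: true implies the onnx-edge provider) plus the integer-micros rule from CostUsd can be asserted before persistence. A minimal sketch on a trimmed copy of AIProvenance; the function name provenanceViolations is hypothetical.

```typescript
interface ProvenanceSketch {
  readonly model: { readonly provider: string };
  readonly tokens: { readonly input: number; readonly output: number };
  readonly cost: { readonly micros: number };
  readonly local: boolean;
  readonly cacheHit: boolean;
}

// Returns one message per violated invariant; an empty array means the record is consistent.
function provenanceViolations(p: ProvenanceSketch): string[] {
  const violations: string[] = [];
  if (!Number.isInteger(p.cost.micros) || p.cost.micros < 0) {
    violations.push('cost.micros must be a non-negative integer');
  }
  if (p.cacheHit && (p.tokens.input !== 0 || p.tokens.output !== 0 || p.cost.micros !== 0)) {
    violations.push('cacheHit implies zero tokens and zero cost');
  }
  if (p.local && p.model.provider !== 'onnx-edge') {
    violations.push('local implies provider onnx-edge');
  }
  return violations;
}
```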
8. Aggregate — BudgetCounter
export interface BudgetCounter {
readonly id: BudgetCounterId;
readonly tenantId: string;
readonly scope: // hierarchical
| { kind: 'tenant_total' }
| { kind: 'capability'; capabilityKey: string }
| { kind: 'feature'; featureKey: string };
readonly periodKey: string; // 'YYYY-MM' for monthly; 'YYYY-MM-DD' for daily
readonly tokensUsed: number;
readonly costMicrosUsed: number;
readonly tokensCap: number;
readonly costMicrosCap: number;
readonly softCapPct: number; // default 80
readonly hardCapPct: number; // default 100
readonly softCapWarnedAt?: string; // emitted exactly once per period
readonly hardCapTrippedAt?: string;
readonly resetsAt: string;
}
Increment rules:
- Atomic UPDATE returning the new used totals; if the increment would cross the soft cap, emit budget.warning.v1; if it would exceed the hard cap, the call routes to deterministic fallback and budget.exceeded.v1 is emitted.
- Cache hits do not increment tokens or cost.
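The decision logic behind the increment can be modelled as a pure function; in the service itself the read-modify-write would be the atomic UPDATE described above. This sketch tracks tokens only and assumes that a hard-cap refusal leaves the counter unchanged (consistent with tokensUsed ≤ tokensCap) and that "crossing" the soft cap means reaching softCapPct of tokensCap for the first time in the period. The names applyTokenIncrement and IncrementResult are hypothetical.

```typescript
interface BudgetCounterSketch {
  readonly tokensUsed: number;
  readonly tokensCap: number;
  readonly softCapPct: number;
  readonly hardCapPct: number;
  readonly softCapWarnedAt?: string;
}

interface IncrementResult {
  readonly counter: BudgetCounterSketch;
  readonly routedToFallback: boolean;
  readonly events: readonly string[];
}

function applyTokenIncrement(
  c: BudgetCounterSketch,
  tokens: number,
  cacheHit: boolean,
  nowIso: string,
): IncrementResult {
  // Cache hits do not increment tokens or cost.
  if (cacheHit) return { counter: c, routedToFallback: false, events: [] };

  const next = c.tokensUsed + tokens;
  if (next > (c.tokensCap * c.hardCapPct) / 100) {
    // Hard cap: refuse the increment and route to deterministic fallback.
    return { counter: c, routedToFallback: true, events: ['budget.exceeded.v1'] };
  }

  // Soft cap: warn at most once per period, tracked by softCapWarnedAt.
  const crossesSoft =
    c.softCapWarnedAt === undefined && next >= (c.tokensCap * c.softCapPct) / 100;
  return {
    counter: {
      ...c,
      tokensUsed: next,
      softCapWarnedAt: crossesSoft ? nowIso : c.softCapWarnedAt,
    },
    routedToFallback: false,
    events: crossesSoft ? ['budget.warning.v1'] : [],
  };
}
```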
9. Aggregate — HitlGate and HitlDecision
export interface HitlGate {
readonly id: HitlGateId;
readonly tenantId: string;
readonly capabilityKey: string;
readonly artifactRef: { kind: string; id: string }; // e.g., { kind: 'pricing-suggestion', id: 'ifs_...' }
readonly draftJson: unknown; // the AI artifact awaiting decision
readonly status: HitlGateStatus;
readonly openedAt: string;
readonly slaDeadline: string;
readonly reviewerRoles: readonly string[];
readonly notificationsSent: number;
readonly decisionId?: HitlDecisionId;
readonly closedAt?: string;
readonly correlation: { traceId: string; requestId: string };
}
export interface HitlDecision {
readonly id: HitlDecisionId;
readonly gateId: HitlGateId;
readonly outcome: HitlOutcome;
readonly modifiedJson?: unknown; // present when outcome === 'modified'
readonly justification?: string; // required on 'rejected'
readonly reviewerUserId: string; // 'usr_...'
readonly reviewerRole: string;
readonly decidedAt: string;
readonly auto: boolean; // true on timeout-default
}
State machine:
open ──── decision submitted ────▶ decided ────▶ closed
│
└── slaDeadline passed ────▶ timed_out ────▶ closed (auto: true, outcome: defaultOnTimeout)
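The state machine above has three legal transitions, which a small total function can enforce. This is a sketch: the event names (decision_submitted, sla_deadline_passed, close) and the function nextGateStatus are hypothetical labels for the arrows in the diagram, and the justification check mirrors the HitlDecision rule that rejection requires one.

```typescript
type GateStatusSketch = 'open' | 'decided' | 'timed_out' | 'closed';
type OutcomeSketch = 'accepted' | 'modified' | 'rejected';

type GateEvent =
  | { kind: 'decision_submitted'; outcome: OutcomeSketch; justification?: string }
  | { kind: 'sla_deadline_passed' }
  | { kind: 'close' };

// Returns the next status or throws on a transition the diagram does not allow.
function nextGateStatus(status: GateStatusSketch, event: GateEvent): GateStatusSketch {
  switch (event.kind) {
    case 'decision_submitted':
      if (status !== 'open') throw new Error(`cannot decide a gate in status ${status}`);
      if (event.outcome === 'rejected' && !event.justification) {
        throw new Error('rejection requires justification');
      }
      return 'decided';
    case 'sla_deadline_passed':
      if (status !== 'open') throw new Error(`cannot time out a gate in status ${status}`);
      return 'timed_out';
    case 'close':
      if (status !== 'decided' && status !== 'timed_out') {
        throw new Error(`cannot close a gate in status ${status}`);
      }
      return 'closed';
  }
}
```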
10. Aggregate — RAGCorpus and Embedding
export interface RAGCorpus {
readonly id: RAGCorpusId;
readonly tenantId: string;
readonly namespace: string; // 'policies' | 'faq' | 'sop' | 'amenity'
readonly chunkStrategy: { method: 'fixed' | 'semantic'; targetTokens: number; overlap: number };
readonly embeddingModel: ModelRef; // text-embedding-004 (cloud) or all-MiniLM-L6-v2 (edge)
readonly status: 'provisioning' | 'active' | 'archived';
readonly chunkCount: number;
readonly lastReindexAt?: string;
readonly createdAt: string;
}
export interface Embedding {
readonly corpusId: RAGCorpusId;
readonly chunkId: string; // ULID
readonly tenantId: string;
readonly vector: Float32Array; // 768 (cloud) or 384 (edge)
readonly chunkText: string; // PII-redacted; original kept in source bucket
readonly metadata: Record<string, string | number | boolean>;
readonly sourceUri?: string;
readonly createdAt: string;
}
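The Embedding invariants (vector length of 768 for cloud or 384 for edge, and tenantId matching the parent corpus) can be asserted in the domain layer before a row is written. A minimal sketch; the target parameter and the function name embeddingViolations are hypothetical, since the interfaces above do not say how cloud and edge corpora are distinguished.

```typescript
const EMBEDDING_DIMS = { cloud: 768, edge: 384 } as const;

interface EmbeddingSketch {
  readonly tenantId: string;
  readonly vector: Float32Array;
}

// Returns one message per violated invariant; an empty array means the embedding is acceptable.
function embeddingViolations(
  embedding: EmbeddingSketch,
  corpus: { readonly tenantId: string },
  target: keyof typeof EMBEDDING_DIMS,
): string[] {
  const violations: string[] = [];
  const expected = EMBEDDING_DIMS[target];
  if (embedding.vector.length !== expected) {
    violations.push(`expected ${expected} dimensions, got ${embedding.vector.length}`);
  }
  if (embedding.tenantId !== corpus.tenantId) {
    violations.push('embedding tenantId must match corpus tenantId');
  }
  return violations;
}
```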
11. Aggregate — EdgeModelManifest
export interface EdgeModelEntry {
readonly modelKey: string; // 'phi-3-mini-4k-instruct'
readonly fileName: string; // 'phi-3-mini-int4.onnx'
readonly sha256: string;
readonly bytes: number;
readonly minRamMb: number;
readonly idleUnloadMinutes: number;
readonly capabilities: readonly string[]; // capability keys this model serves
}
export interface EdgeModelManifest {
readonly id: EdgeManifestId;
readonly version: string; // semver, e.g. '2.4.1'
readonly status: 'draft' | 'published' | 'superseded';
readonly models: readonly EdgeModelEntry[];
readonly signature: { kmsKeyId: string; algorithm: 'RSASSA_PSS_SHA_256'; valueB64: string };
readonly publishedAt?: string;
readonly supersededAt?: string;
readonly supersedesId?: EdgeManifestId;
}
Verification rule (enforced both at Electron load time and inside the gateway when re-publishing): verifyKmsSignature(canonicalize(models, version), signature) MUST succeed.
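Signature verification needs a byte-stable payload, so canonicalize must not depend on array order or object key order. The sketch below shows one way to do that and keeps the cryptography behind an injected function so the domain layer imports no KMS client. The canonical form used here (models sorted by modelKey, fixed field order, JSON-encoded, with a reduced set of entry fields) is an assumption, not the documented format, and VerifyFn stands in for verifyKmsSignature.

```typescript
interface EdgeEntrySketch {
  readonly modelKey: string;
  readonly fileName: string;
  readonly sha256: string;
  readonly bytes: number;
}

interface SignatureSketch {
  readonly kmsKeyId: string;
  readonly algorithm: 'RSASSA_PSS_SHA_256';
  readonly valueB64: string;
}

// Hypothetical port standing in for verifyKmsSignature.
type VerifyFn = (payload: string, signature: SignatureSketch) => boolean;

// Assumed canonical form: entries sorted by modelKey, fields emitted in a fixed order.
function canonicalize(models: readonly EdgeEntrySketch[], version: string): string {
  const sorted = models
    .slice()
    .sort((a, b) => (a.modelKey < b.modelKey ? -1 : a.modelKey > b.modelKey ? 1 : 0))
    .map((m) => ({ modelKey: m.modelKey, fileName: m.fileName, sha256: m.sha256, bytes: m.bytes }));
  return JSON.stringify({ version, models: sorted });
}

// Throws when the manifest is empty or the signature does not verify.
function assertManifestVerified(
  manifest: { version: string; models: readonly EdgeEntrySketch[]; signature: SignatureSketch },
  verify: VerifyFn,
): void {
  if (manifest.models.length === 0) throw new Error('manifest must list at least one model');
  if (!verify(canonicalize(manifest.models, manifest.version), manifest.signature)) {
    throw new Error('EdgeManifestSignatureInvalidError');
  }
}
```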
12. Domain Events (catalog summary; full schemas in EVENT_SCHEMAS.md)
| Event | Aggregate | Trigger |
|---|---|---|
melmastoon.ai_orchestrator.inference.requested.v1 | InferenceRequest | accepted call |
melmastoon.ai_orchestrator.inference.completed.v1 | InferenceResult | structured output returned |
melmastoon.ai_orchestrator.inference.failed.v1 | InferenceResult | fallback chain exhausted |
melmastoon.ai_orchestrator.inference.cached_hit.v1 | InferenceResult | cache hit |
melmastoon.ai_orchestrator.suggestion.dynamic_pricing.v1 | capability-specific | pricing.suggest produced |
melmastoon.ai_orchestrator.suggestion.demand_forecast.v1 | capability-specific | pricing.demand_forecast produced |
melmastoon.ai_orchestrator.suggestion.housekeeping_routing.v1 | capability-specific | housekeeping.route produced |
melmastoon.ai_orchestrator.suggestion.shift_optimization.v1 | capability-specific | staff.shift_optimize produced |
melmastoon.ai_orchestrator.anomaly.detected.v1 | capability-specific | anomaly.detect flagged |
melmastoon.ai_orchestrator.upsell.recommended.v1 | capability-specific | upsell.recommend produced |
melmastoon.ai_orchestrator.message.drafted.v1 | capability-specific + HITL | always opens HITL gate |
melmastoon.ai_orchestrator.review.summarized.v1 | capability-specific | review.summarize produced |
melmastoon.ai_orchestrator.ocr.completed.v1 | capability-specific + HITL | always opens HITL gate |
melmastoon.ai_orchestrator.transcription.completed.v1 | capability-specific | stt.transcribe produced |
melmastoon.ai_orchestrator.description.drafted.v1 | capability-specific + HITL | tenant accepts before publish |
melmastoon.ai_orchestrator.translation.drafted.v1 | capability-specific + HITL | tenant per-locale review |
melmastoon.ai_orchestrator.hitl.gate_opened.v1 | HitlGate | gate created |
melmastoon.ai_orchestrator.hitl.gate_decided.v1 | HitlDecision | decision recorded |
melmastoon.ai_orchestrator.budget.warning.v1 | BudgetCounter | soft cap crossed |
melmastoon.ai_orchestrator.budget.exceeded.v1 | BudgetCounter | hard cap crossed |
melmastoon.ai_orchestrator.eval.run_completed.v1 | EvalRun | eval finished |
melmastoon.ai_orchestrator.prompt.version_published.v1 | PromptVersion | promoted to active |
melmastoon.ai_orchestrator.model.deployment_changed.v1 | ModelDeployment / Provider | traffic shift / health change |
melmastoon.ai_orchestrator.edge_model.manifest_updated.v1 | EdgeModelManifest | published |
melmastoon.ai_orchestrator.moderation.flagged.v1 | InferenceResult | pre or post moderation block |
13. Domain Errors
| Error | Surface as |
|---|---|
CrossTenantReferenceError | MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE (any aggregate constructed from a different tenant's primitive) |
BudgetExceededError | MELMASTOON.AI.REFUSED_BUDGET |
SafetyRefusedError | MELMASTOON.AI.REFUSED_SAFETY |
OutputSchemaInvalidError | MELMASTOON.AI.OUTPUT_INVALID (after one repair attempt) |
ProviderUnavailableError | MELMASTOON.AI.PROVIDER_UNAVAILABLE (after fallback chain exhausted) |
HitlRequiredError | MELMASTOON.AI.HITL_REQUIRED (caller attempted to bypass gate) |
ProvenanceMissingError | MELMASTOON.AI.PROVENANCE_MISSING (defensive; should never reach a sibling service) |
EdgeManifestSignatureInvalidError | desktop-side; refuses load |
PromptInjectionDetectedError | absorbed into MELMASTOON.AI.REFUSED_SAFETY with detail: 'injection_pattern' |
RAGCrossTenantError | MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE |
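The table above can be carried in code as a single lookup from domain error name to its surfaced code. This is a sketch of that mapping only; the SurfacedError shape and the surfaceError function are hypothetical. EdgeManifestSignatureInvalidError is omitted because the table lists it as desktop-side with no wire code.

```typescript
interface SurfacedError {
  readonly code: string;
  readonly detail?: string;
}

const SURFACED_ERRORS = new Map<string, SurfacedError>([
  ['CrossTenantReferenceError', { code: 'MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE' }],
  ['BudgetExceededError', { code: 'MELMASTOON.AI.REFUSED_BUDGET' }],
  ['SafetyRefusedError', { code: 'MELMASTOON.AI.REFUSED_SAFETY' }],
  ['OutputSchemaInvalidError', { code: 'MELMASTOON.AI.OUTPUT_INVALID' }],
  ['ProviderUnavailableError', { code: 'MELMASTOON.AI.PROVIDER_UNAVAILABLE' }],
  ['HitlRequiredError', { code: 'MELMASTOON.AI.HITL_REQUIRED' }],
  ['ProvenanceMissingError', { code: 'MELMASTOON.AI.PROVENANCE_MISSING' }],
  // Injection detection is absorbed into the safety refusal, with a detail marker.
  ['PromptInjectionDetectedError', { code: 'MELMASTOON.AI.REFUSED_SAFETY', detail: 'injection_pattern' }],
  ['RAGCrossTenantError', { code: 'MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE' }],
]);

// Returns undefined for errors that have no wire representation.
function surfaceError(errorName: string): SurfacedError | undefined {
  return SURFACED_ERRORS.get(errorName);
}
```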
14. Cross-Aggregate Invariants
- Every InferenceResult has a Provenance. There is no path to persist a result without one.
- Every HitlDecision references exactly one HitlGate. The reverse is enforced by FK + uniqueness.
- Every Capability active row has at least one PromptVersion active.
- Every Embedding row carries tenantId matching its parent RAGCorpus.tenantId. Both are RLS-enforced; the domain layer also asserts the match.
- Every EvalRun is bound to one (EvalSuite, PromptVersion, Model) tuple.
- BudgetCounter increments are idempotent on (tenantId, scope, periodKey, requestId).
- EdgeModelManifest.signature MUST verify against the KMS public key embedded in the Electron binary at load time.