ai-orchestrator-service — Domain Model

Companion to: SERVICE_OVERVIEW.md · APPLICATION_LOGIC.md · DATA_MODEL.md · Canonical AI thesis: 08 AI Architecture

The domain layer must not import NestJS, Drizzle, the Pub/Sub client, fetch, or any provider SDK. All types below live under src/domain/. Branded IDs follow @ghasi/domain-primitives conventions.

1. Aggregates

| Aggregate | Identity | Invariants | Lifecycle |
|---|---|---|---|
| Capability | cap_<ulid> | Bound to exactly one active PromptVersion; declared default Model exists in catalog; fallback chain non-empty; HITL config present (even if disabled) | draft → active → deprecated → retired |
| Prompt | prm_<ulid> | One active PromptVersion per (domain, ordinal); domain is one of the canonical domain codes | created with first draft version |
| PromptVersion | pmv_<ulid> | Immutable once created; references its EvalSuite; output_schema is valid JSON Schema; never overwrites prior versions | draft → active → deprecated → retired |
| Model | mdl_<ulid> | Provider × model unique; modality + cost class + latency class declared; deprecation requires migration plan | available → deprecated → retired |
| ModelDeployment | mdp_<ulid> | Per-region; traffic share ∈ [0, 100]%; sum of traffic shares per (model, region) ≤ 100%; target latency declared | pending → active → draining → retired |
| Provider | prv_<ulid> | One row per provider name ('vertex', 'anthropic', …) | |
| InferenceRequest | ifr_<ulid> | Carries tenantId, capabilityId, inputHash (PII-redacted); never stores raw user content unless redacted | append-only |
| InferenceResult | ifs_<ulid> | 1:1 with InferenceRequest; carries provenanceId and structured output | append-only |
| Provenance | prv_p_<ulid> | All fields populated; cacheHit implies tokensIn = 0, tokensOut = 0, costUsd = 0; local: true implies provider = onnx-edge | append-only |
| EvalSuite | eva_<ulid> | Versioned in companion repo melmastoon-ai-evals; non-empty golden set; declared scoring rubric | created → updated (versioned) |
| EvalRun | evr_<ulid> | Bound to (EvalSuite, PromptVersion, Model); emits eval.run_completed on finish; scores within rubric range | queued → running → completed → failed |
| RAGCorpus | rag_<ulid> | Tenant-scoped namespace; declared chunk strategy; declared embedding model | provisioning → active → archived |
| Embedding | composite (corpusId, chunkId) | 768-dim (cloud) or 384-dim (edge); orphaned chunks flagged; tenant_id matches corpus | append + replace-on-reingest |
| BudgetCounter | bdg_<ulid> | One per (tenantId, scope, periodKey); tokensUsed ≤ tokensCap enforced at increment time; soft-cap warning fired exactly once per period | reset at period boundary |
| HitlGate | hgt_<ulid> | Open until decided or timed_out; SLA timer non-negative; reviewer role asserted from iam-service claims | open → decided → closed or open → timed_out → closed |
| HitlDecision | dec_<ulid> | Linked 1:1 with a HitlGate; outcome ∈ {accepted, modified, rejected}; rejection requires justification | append-only |
| EdgeModelManifest | emm_<ulid> | Signed by KMS; non-empty model list; SHA-256 per model; semver bumped on every change | draft → published → superseded |
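
The lifecycle column can be read as a table of allowed transitions. The following is a minimal sketch for the draft → active → deprecated → retired lifecycle shared by Capability and PromptVersion. It assumes transitions are strictly forward-only; `NEXT`, `canTransition`, and `transition` are illustrative names that do not appear in the source.

```typescript
type LifecycleStatus = 'draft' | 'active' | 'deprecated' | 'retired';

// Allowed next states. Assumption: the lifecycle only moves forward, one step at a time.
const NEXT: Record<LifecycleStatus, readonly LifecycleStatus[]> = {
  draft: ['active'],
  active: ['deprecated'],
  deprecated: ['retired'],
  retired: [],
};

function canTransition(from: LifecycleStatus, to: LifecycleStatus): boolean {
  return NEXT[from].includes(to);
}

// Returns the new status, or throws when the move is not in the table.
function transition(from: LifecycleStatus, to: LifecycleStatus): LifecycleStatus {
  if (!canTransition(from, to)) {
    throw new Error(`illegal lifecycle transition: ${from} -> ${to}`);
  }
  return to;
}
```

The other lifecycles in the table (for example pending → active → draining → retired) could be expressed with the same shape by swapping the status union and the `NEXT` map.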

2. Value Objects

import type { Branded } from '@ghasi/domain-primitives';

export type CapabilityId = Branded<string, 'CapabilityId'>; // 'cap_01H...'
export type PromptId = Branded<string, 'PromptId'>; // 'prm_01H...'
export type PromptVersionId = Branded<string, 'PromptVersionId'>; // 'pmv_01H...'
export type ModelId = Branded<string, 'ModelId'>; // 'mdl_01H...'
export type ModelDeploymentId = Branded<string, 'ModelDeploymentId'>; // 'mdp_01H...'
export type ProviderId = Branded<string, 'ProviderId'>; // 'prv_01H...'
export type InferenceRequestId = Branded<string, 'InferenceRequestId'>; // 'ifr_01H...'
export type InferenceResultId = Branded<string, 'InferenceResultId'>; // 'ifs_01H...'
export type ProvenanceId = Branded<string, 'ProvenanceId'>; // 'prv_p_01H...'
export type EvalSuiteId = Branded<string, 'EvalSuiteId'>; // 'eva_01H...'
export type EvalRunId = Branded<string, 'EvalRunId'>; // 'evr_01H...'
export type RAGCorpusId = Branded<string, 'RAGCorpusId'>; // 'rag_01H...'
export type BudgetCounterId = Branded<string, 'BudgetCounterId'>; // 'bdg_01H...'
export type HitlGateId = Branded<string, 'HitlGateId'>; // 'hgt_01H...'
export type HitlDecisionId = Branded<string, 'HitlDecisionId'>; // 'dec_01H...'
export type EdgeManifestId = Branded<string, 'EdgeManifestId'>; // 'emm_01H...'

export type ProviderName = 'vertex' | 'anthropic' | 'openai' | 'onnx-edge';
export type Modality = 'llm' | 'embedding' | 'vision' | 'speech' | 'classifier';
export type LatencyClass = 'very_low' | 'low' | 'medium' | 'high'; // mapped to numeric p95 budget
export type CostClass = 'free' | 'very_low' | 'low' | 'medium' | 'high';
export type CapabilityStatus = 'draft' | 'active' | 'deprecated' | 'retired';
export type PromptVersionStatus = 'draft' | 'active' | 'deprecated' | 'retired';
export type ModelStatus = 'available' | 'deprecated' | 'retired';
export type ProviderHealth = 'healthy' | 'degraded' | 'unhealthy' | 'recovering';
export type HitlGateStatus = 'open' | 'decided' | 'timed_out' | 'closed';
export type HitlOutcome = 'accepted' | 'modified' | 'rejected';
export type SafetyVerdict = 'pass' | 'flag_low' | 'flag_high' | 'block';
export type DegradationReason = 'budget_hard_cap' | 'all_providers_unhealthy' | 'edge_signature_invalid' | 'moderation_block' | 'schema_invalid';

export interface PromptDomain { // canonical codes
readonly value:
| 'PRICING' | 'HK' | 'STAFF' | 'ANOMALY' | 'UPSELL'
| 'MSG' | 'REVIEW' | 'BOOKING' | 'TUTOR' | 'DESC'
| 'TRANSLATE' | 'OCR' | 'STT' | 'VISION';
}

export interface PromptIdentity { // 'PRMP_PRICING_001_v3' canonical form
readonly domain: PromptDomain['value'];
readonly ordinal: number; // zero-padded to 3 in display
readonly version: number; // semver-major
}

export interface ModelRef { // human-readable, used in provenance
readonly provider: ProviderName;
readonly name: string; // 'gemini-1.5-flash' / 'phi-3-mini-4k-instruct' / 'all-MiniLM-L6-v2'
readonly version?: string;
}

export interface FallbackChain {
readonly steps: readonly ModelRef[]; // ordered; first = default; last = deterministic fallback
}

export interface HitlConfig {
readonly required: boolean;
readonly trigger?: // when `required`, when does the gate open?
| { kind: 'always' }
| { kind: 'threshold'; field: string; comparator: 'gt' | 'lt' | 'gte' | 'lte' | 'eq'; value: number }
| { kind: 'risk_score'; min: number };
readonly slaSeconds: number; // gate auto-closes on timeout
readonly defaultOnTimeout: HitlOutcome; // 'rejected' is the conservative default
readonly reviewerRoles: readonly string[]; // RBAC role names from tenant-service
}

export interface CostUsd { // money in micros (1 USD = 1_000_000 micros)
readonly micros: number; // never floats
}

export interface TokenCount {
readonly input: number;
readonly output: number;
}

export interface InputHash { // SHA-256 of redacted input + capability + promptVersion + tenantId
readonly value: string; // 'sha256:...'
}

export interface AIProvenance { // mirrors docs/08 §6
readonly id: ProvenanceId;
readonly promptId: PromptVersionId;
readonly promptVersion: number;
readonly model: ModelRef;
readonly traceId: string; // W3C traceparent
readonly occurredAt: string; // ISO-8601
readonly tokens: TokenCount;
readonly cost: CostUsd;
readonly local: boolean;
readonly cacheHit: boolean;
readonly safety: { input: SafetyVerdict; output: SafetyVerdict };
readonly reviewedBy?: string; // usr_...
readonly reviewedAt?: string;
readonly decision?: HitlOutcome;
}
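
The comments on PromptIdentity describe a canonical display form ('PRMP_PRICING_001_v3') with the ordinal zero-padded to three digits. The sketch below shows one way to format and parse that form. `formatCanonicalCode` and `parseCanonicalCode` are hypothetical helpers, and the types are redeclared locally so the block stands alone.

```typescript
const DOMAIN_CODES = [
  'PRICING', 'HK', 'STAFF', 'ANOMALY', 'UPSELL',
  'MSG', 'REVIEW', 'BOOKING', 'TUTOR', 'DESC',
  'TRANSLATE', 'OCR', 'STT', 'VISION',
] as const;
type DomainCode = (typeof DOMAIN_CODES)[number];

interface PromptIdentitySketch {
  readonly domain: DomainCode;
  readonly ordinal: number;
  readonly version: number;
}

function isDomainCode(s: string): s is DomainCode {
  return (DOMAIN_CODES as readonly string[]).includes(s);
}

// Builds 'PRMP_<DOMAIN>_<ordinal padded to 3>_v<version>'.
function formatCanonicalCode(id: PromptIdentitySketch): string {
  return `PRMP_${id.domain}_${String(id.ordinal).padStart(3, '0')}_v${id.version}`;
}

// Returns null for strings that do not match the canonical form or use an unknown domain.
function parseCanonicalCode(code: string): PromptIdentitySketch | null {
  const m = /^PRMP_([A-Z]+)_(\d{3,})_v(\d+)$/.exec(code);
  if (m === null) return null;
  const [, domain = '', ordinal = '', version = ''] = m;
  if (!isDomainCode(domain)) return null;
  return { domain, ordinal: Number(ordinal), version: Number(version) };
}
```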

3. Aggregate — Capability

export interface Capability {
readonly id: CapabilityId;
readonly key: string; // 'pricing.suggest'
readonly displayName: string;
readonly status: CapabilityStatus;
readonly defaultPromptVersionId: PromptVersionId | null; // null until first activation
readonly defaultModel: ModelRef;
readonly fallbackChain: FallbackChain;
readonly latencyClass: LatencyClass;
readonly costClass: CostClass;
readonly outputSchemaJson: unknown; // JSON Schema
readonly hitl: HitlConfig;
readonly evalSuiteId: EvalSuiteId;
readonly cacheTtlSeconds: number | null; // null disables cache
readonly tenantOptOutAllowed: boolean; // some capabilities are mandatory (e.g., moderation)
readonly version: number; // optimistic concurrency
readonly createdAt: string;
readonly updatedAt: string;
}

Invariants:

  • status === 'active' ⇒ defaultPromptVersionId is non-null AND points to a PromptVersion whose status === 'active'.
  • fallbackChain.steps includes a deterministic terminal step (cost = free) for hard-cap degradation.
  • outputSchemaJson MUST accept the empty object {} (used by deterministic fallback when no structured output is required).
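
The first two invariants can be checked with a small pure function. This is a sketch under stated assumptions. The `CapabilityLookups` interface is hypothetical and stands in for whatever catalog access the application layer provides. "Deterministic terminal step" is interpreted as "the last step has cost class free". The JSON Schema invariant is omitted because it needs a schema validator.

```typescript
type CapStatus = 'draft' | 'active' | 'deprecated' | 'retired';
type CapCostClass = 'free' | 'very_low' | 'low' | 'medium' | 'high';

interface CapModelRef { readonly provider: string; readonly name: string }

interface CapabilityView {
  readonly status: CapStatus;
  readonly defaultPromptVersionId: string | null;
  readonly fallbackChain: { readonly steps: readonly CapModelRef[] };
}

// Hypothetical lookups supplied by the caller; the source does not name them.
interface CapabilityLookups {
  promptVersionStatus(id: string): CapStatus | undefined;
  costClassOf(ref: CapModelRef): CapCostClass | undefined;
}

// Returns a list of violated invariants; an empty list means the capability is valid.
function capabilityViolations(c: CapabilityView, l: CapabilityLookups): string[] {
  const out: string[] = [];
  if (c.status === 'active') {
    if (c.defaultPromptVersionId === null) {
      out.push('active capability has no default prompt version');
    } else if (l.promptVersionStatus(c.defaultPromptVersionId) !== 'active') {
      out.push('default prompt version is not active');
    }
  }
  const last = c.fallbackChain.steps[c.fallbackChain.steps.length - 1];
  if (last === undefined) {
    out.push('fallback chain is empty');
  } else if (l.costClassOf(last) !== 'free') {
    out.push('fallback chain lacks a free terminal step');
  }
  return out;
}
```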

4. Aggregate — Prompt and PromptVersion

export interface Prompt {
readonly id: PromptId;
readonly domain: PromptDomain['value'];
readonly ordinal: number;
readonly displayName: string;
readonly ownerUserId: string; // 'usr_...'
readonly capabilityKey: string; // 'pricing.suggest'
readonly activeVersionId: PromptVersionId | null;
readonly createdAt: string;
}

export interface PromptVersion {
readonly id: PromptVersionId;
readonly promptId: PromptId;
readonly version: number; // increments per draft
readonly canonicalCode: string; // 'PRMP_PRICING_001_v3' — derived
readonly status: PromptVersionStatus;
readonly systemPrompt: string; // assembled centrally; never composed from user input
readonly userTemplate: string; // mustache-style placeholders
readonly outputSchemaJson: unknown; // JSON Schema
readonly defaultModel: ModelRef;
readonly evalSuiteId: EvalSuiteId;
readonly notes?: string;
readonly createdAt: string;
readonly activatedAt?: string;
readonly deprecatedAt?: string; // must be ≥ 14 days before retirement
readonly retiredAt?: string;
}

Lifecycle rules:

  • A new version is created in draft and cannot be served to production traffic.
  • Promotion draft → active requires a green EvalRun against the currently active version and ≥ 7 days of A/B traffic at 5%.
  • The previously active row flips to deprecated. After ≥ 14 days, it can be retired. retired rows are kept for audit but never served.
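
The two time gates in the lifecycle rules can be expressed as simple predicates. The sketch below is illustrative only. `PromotionEvidence` is a hypothetical summary shape, "green" is reduced to a boolean, and "at 5%" is read as "at least 5%".

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical summary of the evidence gathered for a draft version.
interface PromotionEvidence {
  readonly evalRunGreen: boolean; // assumed boolean summary of the EvalRun result
  readonly abTrafficPct: number;  // share of traffic routed to the draft
  readonly abStartedAt: string;   // ISO-8601
}

// draft -> active: green eval plus at least 7 days of A/B traffic.
function canPromote(e: PromotionEvidence, now: Date): boolean {
  const days = (now.getTime() - Date.parse(e.abStartedAt)) / DAY_MS;
  return e.evalRunGreen && e.abTrafficPct >= 5 && days >= 7;
}

// deprecated -> retired: at least 14 days after deprecation.
function canRetire(deprecatedAt: string, now: Date): boolean {
  return (now.getTime() - Date.parse(deprecatedAt)) / DAY_MS >= 14;
}
```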

5. Aggregate — Model and ModelDeployment

export interface Model {
readonly id: ModelId;
readonly ref: ModelRef;
readonly modality: Modality;
readonly contextWindowTokens: number | null; // null for non-LLM
readonly costClass: CostClass;
readonly latencyClass: LatencyClass;
readonly status: ModelStatus;
readonly perTokenCostMicrosIn?: number; // billing factor (cloud only)
readonly perTokenCostMicrosOut?: number;
readonly perCallCostMicros?: number; // for vision / OCR / STT pricing
readonly notes?: string;
readonly addedAt: string;
}

export interface ModelDeployment {
readonly id: ModelDeploymentId; // 'mdp_01H...'
readonly modelId: ModelId;
readonly region: string; // 'me-central1' | 'europe-west4' | ...
readonly trafficSharePct: number; // 0..100
readonly status: 'pending' | 'active' | 'draining' | 'retired';
readonly activatedAt?: string;
readonly drainStartedAt?: string;
readonly notes?: string;
}
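
The aggregates table in §1 requires each traffic share to lie in [0, 100] and the shares per (model, region) to sum to at most 100%. A minimal check might look like the sketch below. It assumes retired deployments no longer count toward the sum, which the source does not state, and `trafficShareViolations` is an illustrative name.

```typescript
interface DeploymentShare {
  readonly modelId: string;
  readonly region: string;
  readonly trafficSharePct: number;
  readonly status: 'pending' | 'active' | 'draining' | 'retired';
}

// Returns one message per violated rule; an empty list means the set is consistent.
function trafficShareViolations(deployments: readonly DeploymentShare[]): string[] {
  const errors: string[] = [];
  const totals = new Map<string, number>();
  for (const d of deployments) {
    if (d.trafficSharePct < 0 || d.trafficSharePct > 100) {
      errors.push(`${d.modelId}/${d.region}: share ${d.trafficSharePct} is outside 0..100`);
    }
    if (d.status === 'retired') continue; // assumption: retired rows are excluded from the sum
    const key = `${d.modelId}/${d.region}`;
    totals.set(key, (totals.get(key) ?? 0) + d.trafficSharePct);
  }
  totals.forEach((total, key) => {
    if (total > 100) errors.push(`${key}: shares sum to ${total}%`);
  });
  return errors;
}
```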

6. Aggregate — Provider

export interface Provider {
readonly id: ProviderId;
readonly name: ProviderName;
readonly health: ProviderHealth;
readonly consecutiveErrors: number; // resets on success
readonly lastErrorAt?: string;
readonly lastSuccessAt?: string;
readonly circuitOpenedAt?: string; // when health → 'unhealthy'
readonly probeIntervalMs: number; // half-open probe cadence
readonly notes?: string;
}

Circuit-breaker rules:

  • consecutiveErrors >= 5 ⇒ health → unhealthy; circuit opened.
  • After probeIntervalMs, the next call is a probe; success → recovering → healthy; failure → reset timer.
  • Health changes emit melmastoon.ai_orchestrator.model.deployment_changed.v1 with { provider, healthBefore, healthAfter }.
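
The circuit-breaker rules above can be sketched as pure state transitions. This is an illustration under several assumptions. Timestamps are epoch milliseconds rather than ISO strings. A failure while recovering reopens the circuit. The 'degraded' state is left untouched because the rules do not say when it applies. `onFailure`, `onSuccess`, and `isProbeDue` are hypothetical names.

```typescript
type ProviderHealth = 'healthy' | 'degraded' | 'unhealthy' | 'recovering';

interface ProviderState {
  readonly health: ProviderHealth;
  readonly consecutiveErrors: number;
  readonly circuitOpenedAtMs: number | null; // epoch ms in this sketch
  readonly probeIntervalMs: number;
}

const ERROR_THRESHOLD = 5;

// True when the circuit is open and the half-open probe window has elapsed.
function isProbeDue(p: ProviderState, nowMs: number): boolean {
  return p.health === 'unhealthy'
    && p.circuitOpenedAtMs !== null
    && nowMs - p.circuitOpenedAtMs >= p.probeIntervalMs;
}

// A failure opens the circuit at the threshold; a failed probe restarts the timer.
function onFailure(p: ProviderState, nowMs: number): ProviderState {
  const consecutiveErrors = p.consecutiveErrors + 1;
  const circuitAlreadyOpen = p.health === 'unhealthy' || p.health === 'recovering';
  if (circuitAlreadyOpen || consecutiveErrors >= ERROR_THRESHOLD) {
    return { ...p, health: 'unhealthy', consecutiveErrors, circuitOpenedAtMs: nowMs };
  }
  return { ...p, consecutiveErrors };
}

// A success resets the error count; unhealthy -> recovering -> healthy takes two successes.
function onSuccess(p: ProviderState): ProviderState {
  const health: ProviderHealth =
    p.health === 'unhealthy' ? 'recovering'
    : p.health === 'recovering' ? 'healthy'
    : p.health;
  return {
    ...p,
    health,
    consecutiveErrors: 0,
    circuitOpenedAtMs: health === 'healthy' ? null : p.circuitOpenedAtMs,
  };
}
```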

7. Aggregate — InferenceRequest, InferenceResult, Provenance

export interface InferenceRequest {
readonly id: InferenceRequestId;
readonly tenantId: string;
readonly capabilityKey: string;
readonly promptVersionId: PromptVersionId | null; // null when no prompt (e.g., embedding, vision)
readonly inputHash: InputHash;
readonly inputBytes: number; // pre-redaction byte size
readonly redactedInputHash: InputHash; // post-redaction
readonly correlation: { traceId: string; requestId: string };
readonly receivedAt: string;
readonly callerService: string; // 'pricing-service'
readonly callerSurface?: 'consumer' | 'tenant-booking' | 'backoffice';
readonly tenantRegionPin?: string;
}

export interface InferenceResult {
readonly id: InferenceResultId;
readonly requestId: InferenceRequestId;
readonly status: 'completed' | 'failed' | 'fallback_deterministic';
readonly provenanceId: ProvenanceId;
readonly outputJson?: unknown; // schema-validated structured output
readonly errorCode?: string; // 'MELMASTOON.AI.PROVIDER_UNAVAILABLE' etc.
readonly hitlGateId?: HitlGateId;
readonly cached: boolean;
readonly latencyMs: number;
readonly completedAt: string;
}

Provenance is defined in §2 (AIProvenance) and persisted in the provenances table (see DATA_MODEL.md).
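
The Provenance invariants listed in §1 (a cache hit implies zero tokens and zero cost; local implies provider onnx-edge) lend themselves to a small validation function. The sketch below uses a trimmed-down local view of AIProvenance; `provenanceViolations` is an illustrative name. It also checks that cost is an integer number of micros, following the "never floats" comment on CostUsd.

```typescript
type ProvProviderName = 'vertex' | 'anthropic' | 'openai' | 'onnx-edge';

// Only the fields needed for the invariants; the full shape is AIProvenance in §2.
interface ProvenanceView {
  readonly model: { readonly provider: ProvProviderName };
  readonly tokens: { readonly input: number; readonly output: number };
  readonly cost: { readonly micros: number };
  readonly local: boolean;
  readonly cacheHit: boolean;
}

// Returns one message per violated invariant; an empty list means the record is consistent.
function provenanceViolations(p: ProvenanceView): string[] {
  const out: string[] = [];
  if (!Number.isInteger(p.cost.micros)) {
    out.push('cost.micros must be an integer');
  }
  if (p.cacheHit && (p.tokens.input !== 0 || p.tokens.output !== 0 || p.cost.micros !== 0)) {
    out.push('cache hit must report zero tokens and zero cost');
  }
  if (p.local && p.model.provider !== 'onnx-edge') {
    out.push('local inference must use provider onnx-edge');
  }
  return out;
}
```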

8. Aggregate — BudgetCounter

export interface BudgetCounter {
readonly id: BudgetCounterId;
readonly tenantId: string;
readonly scope: // hierarchical
| { kind: 'tenant_total' }
| { kind: 'capability'; capabilityKey: string }
| { kind: 'feature'; featureKey: string };
readonly periodKey: string; // 'YYYY-MM' for monthly; 'YYYY-MM-DD' for daily
readonly tokensUsed: number;
readonly costMicrosUsed: number;
readonly tokensCap: number;
readonly costMicrosCap: number;
readonly softCapPct: number; // default 80
readonly hardCapPct: number; // default 100
readonly softCapWarnedAt?: string; // emitted exactly once per period
readonly hardCapTrippedAt?: string;
readonly resetsAt: string;
}

Increment rules:

  • Atomic UPDATE returning the new used totals; if the increment would cross the soft cap, emit budget.warning.v1; if it would exceed the hard cap, the call routes to deterministic fallback and budget.exceeded.v1 is emitted.
  • Cache hits do not increment tokens or cost.
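
The increment rules can be illustrated with an in-memory pure function. This is a sketch, not the atomic UPDATE the rules describe. It tracks tokens only (cost would follow the same pattern), treats "cross the soft cap" as reaching softCapPct of tokensCap, and emits budget.exceeded.v1 on every refused call because the source does not say whether that event is deduplicated. `applyIncrement` and the view types are hypothetical.

```typescript
interface BudgetCounterView {
  readonly tokensUsed: number;
  readonly tokensCap: number;
  readonly softCapPct: number;
  readonly hardCapPct: number;
  readonly softCapWarnedAt: string | null;
  readonly hardCapTrippedAt: string | null;
}

interface IncrementOutcome {
  readonly kind: 'accepted' | 'fallback_deterministic';
  readonly counter: BudgetCounterView;
  readonly events: readonly string[];
}

function applyIncrement(c: BudgetCounterView, tokens: number, nowIso: string): IncrementOutcome {
  const next = c.tokensUsed + tokens;
  const hardLimit = (c.tokensCap * c.hardCapPct) / 100;
  if (next > hardLimit) {
    // Hard cap: usage is not recorded and the call is routed to the deterministic fallback.
    return {
      kind: 'fallback_deterministic',
      counter: { ...c, hardCapTrippedAt: c.hardCapTrippedAt ?? nowIso },
      events: ['budget.exceeded.v1'],
    };
  }
  const softLimit = (c.tokensCap * c.softCapPct) / 100;
  // Soft cap: warn only the first time the threshold is reached in this period.
  const warn = next >= softLimit && c.softCapWarnedAt === null;
  return {
    kind: 'accepted',
    counter: { ...c, tokensUsed: next, softCapWarnedAt: warn ? nowIso : c.softCapWarnedAt },
    events: warn ? ['budget.warning.v1'] : [],
  };
}
```

Cache hits would simply skip the call to `applyIncrement`, in line with the second rule.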

9. Aggregate — HitlGate and HitlDecision

export interface HitlGate {
readonly id: HitlGateId;
readonly tenantId: string;
readonly capabilityKey: string;
readonly artifactRef: { kind: string; id: string }; // e.g., { kind: 'pricing-suggestion', id: 'ifs_...' }
readonly draftJson: unknown; // the AI artifact awaiting decision
readonly status: HitlGateStatus;
readonly openedAt: string;
readonly slaDeadline: string;
readonly reviewerRoles: readonly string[];
readonly notificationsSent: number;
readonly decisionId?: HitlDecisionId;
readonly closedAt?: string;
readonly correlation: { traceId: string; requestId: string };
}

export interface HitlDecision {
readonly id: HitlDecisionId;
readonly gateId: HitlGateId;
readonly outcome: HitlOutcome;
readonly modifiedJson?: unknown; // present when outcome === 'modified'
readonly justification?: string; // required on 'rejected'
readonly reviewerUserId: string; // 'usr_...'
readonly reviewerRole: string;
readonly decidedAt: string;
readonly auto: boolean; // true on timeout-default
}

State machine:

open ──── decision submitted ────▶ decided ────▶ closed
  │
  └── slaDeadline passed ────▶ timed_out ────▶ closed (auto: true, outcome: defaultOnTimeout)
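
The state machine above can be sketched as three guarded transitions. This is a minimal illustration with trimmed-down types. `decideGate`, `timeOutGate`, and `closeGate` are hypothetical names. The source does not say how the "rejection requires justification" rule applies to automatic decisions, so the sketch leaves the justification null on timeout.

```typescript
type GateOutcome = 'accepted' | 'modified' | 'rejected';
type GateStatus = 'open' | 'decided' | 'timed_out' | 'closed';

interface GateView { readonly status: GateStatus; readonly slaDeadline: string }
interface DecisionView {
  readonly outcome: GateOutcome;
  readonly justification: string | null;
  readonly auto: boolean;
  readonly decidedAt: string;
}
interface GateStep { readonly gate: GateView; readonly decision: DecisionView }

// open -> decided: a reviewer submits a decision before the SLA deadline.
function decideGate(gate: GateView, outcome: GateOutcome, justification: string | null, now: Date): GateStep {
  if (gate.status !== 'open') throw new Error(`gate is ${gate.status}, not open`);
  if (now.getTime() > Date.parse(gate.slaDeadline)) throw new Error('SLA deadline passed');
  if (outcome === 'rejected' && (justification === null || justification.trim() === '')) {
    throw new Error('rejection requires justification');
  }
  return {
    gate: { ...gate, status: 'decided' },
    decision: { outcome, justification, auto: false, decidedAt: now.toISOString() },
  };
}

// open -> timed_out: the deadline passed, so the configured default outcome is applied.
function timeOutGate(gate: GateView, defaultOnTimeout: GateOutcome, now: Date): GateStep {
  if (gate.status !== 'open') throw new Error(`gate is ${gate.status}, not open`);
  if (now.getTime() <= Date.parse(gate.slaDeadline)) throw new Error('SLA deadline not reached');
  return {
    gate: { ...gate, status: 'timed_out' },
    decision: { outcome: defaultOnTimeout, justification: null, auto: true, decidedAt: now.toISOString() },
  };
}

// decided | timed_out -> closed.
function closeGate(gate: GateView): GateView {
  if (gate.status !== 'decided' && gate.status !== 'timed_out') {
    throw new Error(`cannot close a gate that is ${gate.status}`);
  }
  return { ...gate, status: 'closed' };
}
```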

10. Aggregate — RAGCorpus and Embedding

export interface RAGCorpus {
readonly id: RAGCorpusId;
readonly tenantId: string;
readonly namespace: string; // 'policies' | 'faq' | 'sop' | 'amenity'
readonly chunkStrategy: { method: 'fixed' | 'semantic'; targetTokens: number; overlap: number };
readonly embeddingModel: ModelRef; // text-embedding-004 (cloud) or all-MiniLM-L6-v2 (edge)
readonly status: 'provisioning' | 'active' | 'archived';
readonly chunkCount: number;
readonly lastReindexAt?: string;
readonly createdAt: string;
}

export interface Embedding {
readonly corpusId: RAGCorpusId;
readonly chunkId: string; // ULID
readonly tenantId: string;
readonly vector: Float32Array; // 768 (cloud) or 384 (edge)
readonly chunkText: string; // PII-redacted; original kept in source bucket
readonly metadata: Record<string, string | number | boolean>;
readonly sourceUri?: string;
readonly createdAt: string;
}
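
Two Embedding invariants from §1 (vector dimension and tenant match) can be asserted in the domain layer before a row is written. The sketch below assumes that "edge" means the corpus embedding model has provider 'onnx-edge' and that every other provider is "cloud"; the source does not spell out that mapping. `embeddingViolations` is an illustrative name.

```typescript
interface CorpusView {
  readonly id: string;
  readonly tenantId: string;
  readonly embeddingProvider: string; // provider of RAGCorpus.embeddingModel
}

interface EmbeddingView {
  readonly corpusId: string;
  readonly tenantId: string;
  readonly vector: Float32Array;
}

// Assumption: onnx-edge models produce 384-dim vectors, all others 768-dim.
function expectedDimensions(corpus: CorpusView): number {
  return corpus.embeddingProvider === 'onnx-edge' ? 384 : 768;
}

// Returns one message per violated invariant; an empty list means the row is acceptable.
function embeddingViolations(corpus: CorpusView, e: EmbeddingView): string[] {
  const out: string[] = [];
  if (e.corpusId !== corpus.id) out.push('embedding references a different corpus');
  if (e.tenantId !== corpus.tenantId) out.push('embedding tenantId does not match corpus tenantId');
  const dim = expectedDimensions(corpus);
  if (e.vector.length !== dim) out.push(`expected ${dim} dimensions, got ${e.vector.length}`);
  return out;
}
```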

11. Aggregate — EdgeModelManifest

export interface EdgeModelEntry {
readonly modelKey: string; // 'phi-3-mini-4k-instruct'
readonly fileName: string; // 'phi-3-mini-int4.onnx'
readonly sha256: string;
readonly bytes: number;
readonly minRamMb: number;
readonly idleUnloadMinutes: number;
readonly capabilities: readonly string[]; // capability keys this model serves
}

export interface EdgeModelManifest {
readonly id: EdgeManifestId;
readonly version: string; // semver, e.g. '2.4.1'
readonly status: 'draft' | 'published' | 'superseded';
readonly models: readonly EdgeModelEntry[];
readonly signature: { kmsKeyId: string; algorithm: 'RSASSA_PSS_SHA_256'; valueB64: string };
readonly publishedAt?: string;
readonly supersededAt?: string;
readonly supersedesId?: EdgeManifestId;
}

Verification rule (enforced both at Electron load time and inside the gateway when re-publishing): verifyKmsSignature(canonicalize(models, version), signature) MUST succeed.
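
The source does not define what canonicalize produces. Purely as an illustration, the sketch below shows one deterministic serialization: models sorted by modelKey, capabilities sorted, and object keys emitted in a fixed order, so that the same manifest content always yields the same bytes to sign. The field order and sorting choices are assumptions, and signature verification itself is out of scope here.

```typescript
interface EdgeModelEntrySketch {
  readonly modelKey: string;
  readonly fileName: string;
  readonly sha256: string;
  readonly bytes: number;
  readonly minRamMb: number;
  readonly idleUnloadMinutes: number;
  readonly capabilities: readonly string[];
}

function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Deterministic JSON: input ordering of models and capabilities does not affect the output.
function canonicalize(models: readonly EdgeModelEntrySketch[], version: string): string {
  const sorted = [...models]
    .sort((a, b) => compareStrings(a.modelKey, b.modelKey))
    .map((m) => ({
      // Keys are written in a fixed order; JSON.stringify preserves insertion order.
      modelKey: m.modelKey,
      fileName: m.fileName,
      sha256: m.sha256,
      bytes: m.bytes,
      minRamMb: m.minRamMb,
      idleUnloadMinutes: m.idleUnloadMinutes,
      capabilities: [...m.capabilities].sort(compareStrings),
    }));
  return JSON.stringify({ version, models: sorted });
}
```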

12. Domain Events (catalog summary; full schemas in EVENT_SCHEMAS.md)

| Event | Aggregate | Trigger |
|---|---|---|
| melmastoon.ai_orchestrator.inference.requested.v1 | InferenceRequest | accepted call |
| melmastoon.ai_orchestrator.inference.completed.v1 | InferenceResult | structured output returned |
| melmastoon.ai_orchestrator.inference.failed.v1 | InferenceResult | fallback chain exhausted |
| melmastoon.ai_orchestrator.inference.cached_hit.v1 | InferenceResult | cache hit |
| melmastoon.ai_orchestrator.suggestion.dynamic_pricing.v1 | capability-specific | pricing.suggest produced |
| melmastoon.ai_orchestrator.suggestion.demand_forecast.v1 | capability-specific | pricing.demand_forecast produced |
| melmastoon.ai_orchestrator.suggestion.housekeeping_routing.v1 | capability-specific | housekeeping.route produced |
| melmastoon.ai_orchestrator.suggestion.shift_optimization.v1 | capability-specific | staff.shift_optimize produced |
| melmastoon.ai_orchestrator.anomaly.detected.v1 | capability-specific | anomaly.detect flagged |
| melmastoon.ai_orchestrator.upsell.recommended.v1 | capability-specific | upsell.recommend produced |
| melmastoon.ai_orchestrator.message.drafted.v1 | capability-specific + HITL | always opens HITL gate |
| melmastoon.ai_orchestrator.review.summarized.v1 | capability-specific | review.summarize produced |
| melmastoon.ai_orchestrator.ocr.completed.v1 | capability-specific + HITL | always opens HITL gate |
| melmastoon.ai_orchestrator.transcription.completed.v1 | capability-specific | stt.transcribe produced |
| melmastoon.ai_orchestrator.description.drafted.v1 | capability-specific + HITL | tenant accepts before publish |
| melmastoon.ai_orchestrator.translation.drafted.v1 | capability-specific + HITL | tenant per-locale review |
| melmastoon.ai_orchestrator.hitl.gate_opened.v1 | HitlGate | gate created |
| melmastoon.ai_orchestrator.hitl.gate_decided.v1 | HitlDecision | decision recorded |
| melmastoon.ai_orchestrator.budget.warning.v1 | BudgetCounter | soft cap crossed |
| melmastoon.ai_orchestrator.budget.exceeded.v1 | BudgetCounter | hard cap crossed |
| melmastoon.ai_orchestrator.eval.run_completed.v1 | EvalRun | eval finished |
| melmastoon.ai_orchestrator.prompt.version_published.v1 | PromptVersion | promoted to active |
| melmastoon.ai_orchestrator.model.deployment_changed.v1 | ModelDeployment / Provider | traffic shift / health change |
| melmastoon.ai_orchestrator.edge_model.manifest_updated.v1 | EdgeModelManifest | published |
| melmastoon.ai_orchestrator.moderation.flagged.v1 | InferenceResult | pre or post moderation block |

13. Domain Errors

| Error | Surface as |
|---|---|
| CrossTenantReferenceError | MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE (any aggregate constructed from a different tenant's primitive) |
| BudgetExceededError | MELMASTOON.AI.REFUSED_BUDGET |
| SafetyRefusedError | MELMASTOON.AI.REFUSED_SAFETY |
| OutputSchemaInvalidError | MELMASTOON.AI.OUTPUT_INVALID (after one repair attempt) |
| ProviderUnavailableError | MELMASTOON.AI.PROVIDER_UNAVAILABLE (after fallback chain exhausted) |
| HitlRequiredError | MELMASTOON.AI.HITL_REQUIRED (caller attempted to bypass gate) |
| ProvenanceMissingError | MELMASTOON.AI.PROVENANCE_MISSING (defensive; should never reach a sibling service) |
| EdgeManifestSignatureInvalidError | desktop-side; refuses load |
| PromptInjectionDetectedError | absorbed into MELMASTOON.AI.REFUSED_SAFETY with detail: 'injection_pattern' |
| RAGCrossTenantError | MELMASTOON.GENERAL.CROSS_TENANT_REFERENCE |
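
One way to keep the error-to-code mapping in the domain layer is a small base class that carries the surfaced code. The sketch below covers three of the errors listed above. The `DomainError` base class, the `detail` field, the messages, and `toErrorBody` are assumptions made for illustration; only the error names and codes come from the table.

```typescript
// Hypothetical base class: each domain error carries the code it surfaces as.
class DomainError extends Error {
  readonly code: string;
  readonly detail: string | null;

  constructor(code: string, message: string, detail: string | null = null) {
    super(message);
    this.code = code;
    this.detail = detail;
  }
}

class BudgetExceededError extends DomainError {
  constructor() {
    super('MELMASTOON.AI.REFUSED_BUDGET', 'budget hard cap reached');
  }
}

class ProviderUnavailableError extends DomainError {
  constructor() {
    super('MELMASTOON.AI.PROVIDER_UNAVAILABLE', 'fallback chain exhausted');
  }
}

// Absorbed into the safety refusal code, with a detail marker.
class PromptInjectionDetectedError extends DomainError {
  constructor() {
    super('MELMASTOON.AI.REFUSED_SAFETY', 'prompt injection detected', 'injection_pattern');
  }
}

// Shape of the surfaced error body is an assumption, not taken from the source.
function toErrorBody(e: DomainError): { code: string; detail: string | null } {
  return { code: e.code, detail: e.detail };
}
```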

14. Cross-Aggregate Invariants

  1. Every InferenceResult has a Provenance. There is no path to persist a result without one.
  2. Every HitlDecision references exactly one HitlGate. The reverse is enforced by FK + uniqueness.
  3. Every active Capability has at least one active PromptVersion.
  4. Every Embedding row carries tenantId matching its parent RAGCorpus.tenantId. Both are RLS-enforced; the domain layer also asserts the match.
  5. Every EvalRun is bound to one (EvalSuite, PromptVersion, Model) tuple.
  6. BudgetCounter increments are idempotent on (tenantId, scope, periodKey, requestId).
  7. EdgeModelManifest.signature MUST verify against the KMS public key embedded in the Electron binary at load time.
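
Invariant 6 (idempotent BudgetCounter increments) can be illustrated with an in-memory ledger keyed on (tenantId, scope, periodKey, requestId). This is a sketch only. A real implementation would presumably rely on a database uniqueness constraint, and `InMemoryBudgetLedger` and `scopeKey` are hypothetical names.

```typescript
type BudgetScope =
  | { kind: 'tenant_total' }
  | { kind: 'capability'; capabilityKey: string }
  | { kind: 'feature'; featureKey: string };

// Flattens the hierarchical scope into a stable string.
function scopeKey(s: BudgetScope): string {
  switch (s.kind) {
    case 'tenant_total': return 'tenant_total';
    case 'capability': return `capability:${s.capabilityKey}`;
    case 'feature': return `feature:${s.featureKey}`;
  }
}

class InMemoryBudgetLedger {
  private readonly applied = new Set<string>();
  private readonly totals = new Map<string, number>();

  // Returns the counter total after the call; replaying a requestId changes nothing.
  increment(tenantId: string, scope: BudgetScope, periodKey: string, requestId: string, tokens: number): number {
    const counterKey = [tenantId, scopeKey(scope), periodKey].join('|');
    const requestKey = `${counterKey}|${requestId}`;
    const current = this.totals.get(counterKey) ?? 0;
    if (this.applied.has(requestKey)) return current; // replay: no double counting
    this.applied.add(requestKey);
    const next = current + tokens;
    this.totals.set(counterKey, next);
    return next;
  }
}
```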