Domain Model

:::info Source
Sourced from services/search-service/DOMAIN_MODEL.md in the documentation repo.
:::

Companion: SERVICE_OVERVIEW.md · DATA_MODEL.md · APPLICATION_LOGIC.md

Search-service is a read-model context. It has no native business invariants beyond "every document is a faithful projection of exactly one authoritative source." This doc captures the aggregates, value objects, invariants, and the domain language used throughout the code.

1. Ubiquitous Language

| Term | Meaning |
| --- | --- |
| SearchableDocument | A denormalized, index-ready record built by projecting one or more domain events. |
| Index | A logical tenant-scoped bucket of SearchableDocuments in OpenSearch. |
| Alias | An OpenSearch alias that maps logical index names to physical indices (allows zero-downtime rebuild). |
| Projection | The process of turning a domain event into a SearchableDocument upsert. |
| Embedding | A fixed-dim float vector produced by an embedding model (via ai-gateway) representing semantic content. |
| Hybrid score | alpha * lex + (1 - alpha) * sem, typically alpha = 0.5. |
| Recommendation | A ranked list of items for a specific user derived from embeddings + collaborative filtering + rules. |
| Reindex | Full rebuild of a tenant index from the domain event log + snapshot API. |
| Backfill | One-shot ingest from a source service's snapshot endpoint (used on cold start). |
| Visibility | Enum governing who can see a document: private, org, marketplace, public. |
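
To make the hybrid score concrete, here is a minimal sketch of the blend. It assumes `lex` and `sem` are the lexical and semantic relevance scores, both already normalized to 0..1; the function name `hybridScore` and the range check on `alpha` are illustrative and not taken from the service code.

```typescript
// Hypothetical helper: blends a lexical and a semantic relevance score.
// Assumption: both inputs are already normalized to the 0..1 range.
function hybridScore(lex: number, sem: number, alpha: number = 0.5): number {
  if (alpha < 0 || alpha > 1) {
    throw new RangeError(`alpha must be within [0, 1], got ${alpha}`);
  }
  // alpha weights the lexical side; (1 - alpha) weights the semantic side.
  return alpha * lex + (1 - alpha) * sem;
}

console.log(hybridScore(1, 0.5));       // 0.75 (default alpha = 0.5)
console.log(hybridScore(1, 0.5, 0.25)); // 0.625 (semantic side weighted more)
```

With `alpha = 1` the result is purely lexical, and with `alpha = 0` it is purely semantic.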

2. Aggregates

2.1 SearchableDocument (aggregate root)

type DocumentType =
  | 'course'
  | 'lesson'
  | 'block'
  | 'listing'
  | 'user'
  | 'assignment'
  | 'certificate';

interface SearchableDocument {
  // Identity
  id: string; // globally unique "svc:type:aggregateId" form
  tenantId: TenantId;

  // Typing
  type: DocumentType;
  source: { service: string; aggregateId: string; aggregateVersion: number };

  // Content
  title: I18nString; // { en: 'Intro to Algebra', ar: 'مقدمة في الجبر' }
  body: I18nMarkup; // sanitized markdown per locale
  summary?: I18nString;

  // Categorization
  tags: string[]; // free-form
  taxonomy: string[]; // catalog taxonomy node IDs (dot-path)
  facets: Record<string, JSONValue>; // e.g. { level: 'beginner', durationMin: 45 }

  // Access
  visibility: 'private' | 'org' | 'marketplace' | 'public';
  audiences?: string[]; // role/cohort IDs the doc is meaningful to

  // Semantic layer
  embedding?: number[]; // 768 or 1024 dim depending on model
  embeddingModelId?: string; // e.g. 'text-embed-3-small@2025-04-01'
  embeddingHash?: string; // sha256 over content used to build embedding

  // Lifecycle
  publishedAt?: ISODate;
  updatedAt: ISODate;
  deletedAt?: ISODate; // soft delete (tombstone)

  // Metadata
  locale: BCP47; // primary locale for tie-breaking
  region: 'us' | 'eu' | 'me' | 'ap';
  quality?: { ratingAvg?: number; enrollmentCount?: number; completionRate?: number };
}

Invariants

| # | Invariant | Enforced by |
| --- | --- | --- |
| D1 | tenantId is required and immutable | Projector validation + OpenSearch mapping |
| D2 | id form is {service}:{type}:{aggregateId}; uniqueness per tenant | Indexer |
| D3 | aggregateVersion monotonically non-decreasing per id | Indexer with if_seq_no / version compare |
| D4 | If embedding present, embeddingModelId and embeddingHash must also be present | Indexer validation |
| D5 | visibility = marketplace forbidden unless source is marketplace-service | Projector policy |
| D6 | deletedAt set → no fulltext match; retained for tombstone replay | Query builder + TTL |
| D7 | No PII in body for documents with visibility ∈ {marketplace, public} | Sanitizer (see AI_INTEGRATION §6) |
| D8 | region equals tenant data-residency region | Projector |
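
Several of these invariants can be checked with plain field comparisons. The sketch below covers D2, D3, D4, and D5 only. `DocumentLike`, `checkInvariants`, and `isStaleUpdate` are hypothetical names, and comparing `source.service` against the literal `'marketplace-service'` is an assumption about how the source is identified; the real projector and indexer may enforce these rules differently.

```typescript
// Minimal subset of SearchableDocument needed for these checks (hypothetical type).
interface DocumentLike {
  id: string;
  type: string;
  source: { service: string; aggregateId: string; aggregateVersion: number };
  embedding?: number[];
  embeddingModelId?: string;
  embeddingHash?: string;
  visibility: 'private' | 'org' | 'marketplace' | 'public';
}

// Returns the IDs of the invariants the document violates (empty when valid).
function checkInvariants(doc: DocumentLike): string[] {
  const violations: string[] = [];

  // D2: id must have the form {service}:{type}:{aggregateId}.
  const expectedId = `${doc.source.service}:${doc.type}:${doc.source.aggregateId}`;
  if (doc.id !== expectedId) violations.push('D2');

  // D4: an embedding must carry its model ID and content hash.
  if (doc.embedding !== undefined && (!doc.embeddingModelId || !doc.embeddingHash)) {
    violations.push('D4');
  }

  // D5: only marketplace-service may emit marketplace-visible documents.
  // Assumption: the service is identified by this exact string.
  if (doc.visibility === 'marketplace' && doc.source.service !== 'marketplace-service') {
    violations.push('D5');
  }

  return violations;
}

// D3: an incoming write is stale when its version is lower than the indexed one.
// Equal versions are accepted because the invariant is "non-decreasing".
function isStaleUpdate(indexedVersion: number | undefined, incomingVersion: number): boolean {
  return indexedVersion !== undefined && incomingVersion < indexedVersion;
}
```

Returning invariant IDs rather than throwing lets a caller log every violation for a rejected event in one pass.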

2.2 Recommendation (aggregate root)

interface Recommendation {
  id: ULID; // generation id
  tenantId: TenantId;
  userId: UserId;
  generatedAt: ISODate;
  modelVersion: string; // 'rec-l2r@2025-03-14'
  items: RecommendedItem[];
  expiresAt: ISODate;
  reasonSummary?: string;
}

interface RecommendedItem {
  itemId: string;
  itemType: DocumentType;
  score: number; // normalized 0..1
  reason: 'similar_to_enrolled' | 'completed_next_step' | 'cohort_peer' | 'trending' | 'ai_editor_pick';
  explanation?: string; // surfaced in UI ("because you completed X")
  features?: Record<string, number>; // sparse top-k feature contribs (for debugging)
}

Invariants:

  • R1: items[].itemType ∈ {course, lesson, listing} only.
  • R2: expiresAt - generatedAt ≤ 24h (freshness cap).
  • R3: Items must pass authorization filter at serve-time — the list is not pre-authorized.
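
The three rules above could be checked roughly as follows. This is a sketch under stated assumptions: `RecommendationLike`, `validateRecommendation`, and `authorizeAtServeTime` are hypothetical names, timestamps are assumed to be ISO 8601 strings that `Date.parse` accepts, and `canView` stands in for whatever authorization check the service applies at serve time.

```typescript
const RECOMMENDABLE_TYPES = new Set(['course', 'lesson', 'listing']);
const MAX_TTL_MS = 24 * 60 * 60 * 1000; // R2 freshness cap: 24h

// Minimal subset of Recommendation needed for these checks (hypothetical type).
interface RecommendationLike {
  generatedAt: string; // ISO 8601
  expiresAt: string;   // ISO 8601
  items: { itemId: string; itemType: string; score: number }[];
}

// Returns the IDs of the violated invariants (empty when valid).
function validateRecommendation(rec: RecommendationLike): string[] {
  const violations: string[] = [];

  // R1: only course, lesson, and listing items may be recommended.
  if (rec.items.some((item) => !RECOMMENDABLE_TYPES.has(item.itemType))) {
    violations.push('R1');
  }

  // R2: expiresAt - generatedAt must not exceed 24h.
  // An unparseable timestamp is treated as a violation (assumption).
  const ttlMs = Date.parse(rec.expiresAt) - Date.parse(rec.generatedAt);
  if (Number.isNaN(ttlMs) || ttlMs > MAX_TTL_MS) violations.push('R2');

  return violations;
}

// R3: the stored list is not pre-authorized, so it is filtered when served.
function authorizeAtServeTime<T extends { itemId: string }>(
  items: T[],
  canView: (itemId: string) => boolean,
): T[] {
  return items.filter((item) => canView(item.itemId));
}
```

Keeping R3 as a separate serve-time step means a permission change takes effect without regenerating the list.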

2.3 IndexPolicy (aggregate root)

Governs per-tenant index behavior (shard count, embedding model, retention).

interface IndexPolicy {
  tenantId: TenantId;
  alias: string; // 'tenant_01H...'
  physicalIndex: string; // 'tenant_01H..._2025-04-15'
  primaryShards: number; // default 1; bumped for largest tenants
  replicas: number;
  embeddingModelId: string;
  reindexVersion: number; // bump to trigger rebuild
  createdAt: ISODate;
  lastReindexAt?: ISODate;
}
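
The split between `alias` and `physicalIndex` is what allows a rebuild without downtime: a new physical index is populated, then the alias is repointed. One way to sketch the cutover is a request body for the OpenSearch `_aliases` endpoint, which applies all listed actions atomically. `buildAliasSwap` and the index names in the usage note are hypothetical; the service's actual reindex flow is not described in this document.

```typescript
// Shape of one entry in the `actions` array of an OpenSearch `_aliases` request.
interface AliasAction {
  add?: { index: string; alias: string };
  remove?: { index: string; alias: string };
}

// Hypothetical helper: builds the body for `POST /_aliases` that moves an
// alias from the old physical index to the freshly rebuilt one.
function buildAliasSwap(
  alias: string,
  oldIndex: string,
  newIndex: string,
): { actions: AliasAction[] } {
  return {
    actions: [
      { remove: { index: oldIndex, alias } },
      { add: { index: newIndex, alias } },
    ],
  };
}
```

For example, `buildAliasSwap('tenant_example', 'tenant_example_v1', 'tenant_example_v2')` yields a body with one `remove` and one `add` action. Because both actions travel in a single request, queries against the alias never observe a moment when it points at no index.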

2.4 SuggestEntry (value object)

interface SuggestEntry {
  text: string;
  weight: number; // from CTR + recency
  contexts: { tenantId: TenantId; type?: DocumentType; locale?: BCP47 };
}

3. Bounded Context Boundaries

Search-service does not model:

  • Course structure → catalog-service.
  • Enrollment/purchase state → enrollment-service.
  • Authoring draft state → authoring-service.
  • User identity → identity-service.

When a search result is clicked, the current state is fetched from the authoritative service, because the search index may lag its sources by up to 2 s (SLO).

4. Document Type Matrix

| Type | Sourced from | Key fields | Visibility rules |
| --- | --- | --- | --- |
| course | catalog (course_version.published) | title, summary, taxonomy, level, durationMin | org within tenant; marketplace if listed; public if opened |
| lesson | catalog + authoring | title, body, blocks summary, parent course | inherits from course |
| block | authoring (block.updated) | block body slice, type, parent lesson | org only (drafts not searchable across org) |
| listing | marketplace (listing.approved) | title, description, price, rating | always marketplace (cross-tenant for buyers) |
| user | tenant (membership_activated) | display name, roles, skills | org only, never marketplace |
| assignment | assignment-service | title, description, due date | org only |
| certificate | certification (certificate.issued) | title, recipient, issuedAt | private + optional share link (public URL but not indexed public) |

5. State Transitions

5.1 Document lifecycle

5.2 Recommendation generation

6. Policies

| Policy | Rule |
| --- | --- |
| Deletion policy | Tombstone retained 30d; then hard-delete |
| Visibility demotion | Any event that demotes visibility triggers immediate reindex |
| Locale fallback | If query locale missing in document, fallback = tenant default locale |
| Embedding drift | Any change in embeddingModelId forces re-embedding over ≤ 14d rolling window |
| Tenant deletion | Alias + physical index hard-deleted within 24h (GDPR erasure) |
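
Two of these policies reduce to small pure functions. The sketch below illustrates the locale fallback and the 30-day tombstone retention. It assumes `I18nString` is a plain locale-to-text map, as the `title` example in §2.1 suggests; `resolveLocalized` and `isHardDeleteDue` are hypothetical names, and treating a tombstone as due at exactly 30 days is an assumption.

```typescript
// Assumption: I18nString maps a locale code to the localized text.
type I18nString = Record<string, string>;

const TOMBSTONE_RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30d

// Locale fallback: prefer the query locale, else the tenant default locale.
// Returns undefined when the document has neither.
function resolveLocalized(
  value: I18nString,
  queryLocale: string,
  tenantDefaultLocale: string,
): string | undefined {
  return value[queryLocale] ?? value[tenantDefaultLocale];
}

// Deletion policy: a tombstone becomes eligible for hard-delete once it has
// been retained for 30 days.
function isHardDeleteDue(deletedAt: string, now: Date = new Date()): boolean {
  return now.getTime() - Date.parse(deletedAt) >= TOMBSTONE_RETENTION_MS;
}

const title: I18nString = { en: 'Intro to Algebra', ar: 'مقدمة في الجبر' };
console.log(resolveLocalized(title, 'fr', 'en')); // falls back to the 'en' text
```

Passing `now` as a parameter keeps the retention check deterministic in tests.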

7. Domain Events (internal)

Search-service emits two internal events for its own state changes:

  • search.document.indexed.v1 — bookkeeping / analytics.
  • search.reindex.completed.v1 — signals tenants can rely on freshness.

See EVENT_SCHEMAS.md for full schemas.