Skip to main content

ai-orchestrator-service — Data Model

Companion to: DOMAIN_MODEL.md · Standards: 06 Data Models · NAMING · 07 Security & Tenancy

Storage:

  • Cloud SQL Postgres 16 (HA, regional) with extensions: pgcrypto, pg_trgm, vector (pgvector ≥ 0.7), pg_partman (for high-volume audit tables).
  • Memorystore Redis (HA): prompt+input hash cache, hot capability config, in-flight HITL SLA timers, per-(tenant, capability) token-bucket rate limiter.
  • GCS gs://melmastoon-ai-artifacts-<env> for eval datasets, signed ONNX models, prompt fixtures, RAG raw documents pre-chunking.
  • BigQuery melmastoon_analytics_prod.ai_calls_fact for analytics + cost dashboard.

Money is stored in bigint micros (_micros suffix). All multi-tenant tables carry tenant_id uuid NOT NULL + RLS policy <table>_tenant_isolation + index on (tenant_id, …). Session GUC app.tenant_id is set on every connection from the application.

1. ID prefixes (declared here per NAMING.md)

PrefixAggregateOwning service
cap_Capabilityai-orchestrator-service
prm_Promptai-orchestrator-service
pmv_PromptVersionai-orchestrator-service
mdl_Modelai-orchestrator-service
mdp_ModelDeploymentai-orchestrator-service
prv_Providerai-orchestrator-service
ifr_InferenceRequestai-orchestrator-service
ifs_InferenceResultai-orchestrator-service
prv_p_Provenanceai-orchestrator-service (extends prv_ namespace; the trailing _p_ disambiguates)
eva_EvalSuiteai-orchestrator-service
evr_EvalRunai-orchestrator-service
rag_RAGCorpusai-orchestrator-service
bdg_BudgetCounterai-orchestrator-service
hgt_HitlGateai-orchestrator-service
dec_HitlDecisionai-orchestrator-service
emm_EdgeModelManifestai-orchestrator-service

2. Schema overview (ER summary)

capabilities ──┐ eval_suites ◀── eval_runs
├──▶ prompts ──▶ prompt_versions ──┐ ▲
│ └──▶ models ─┐ │
└──────────────────────────────────────────────▶│ │
│ │
inference_requests ──▶ inference_results ──▶ provenances ──────┘ │

budget_counters │

hitl_gates ──▶ hitl_decisions │

rag_corpora ──▶ rag_chunks (1) ──▶ embeddings_cloud / embeddings_edge │

edge_model_manifests │

providers (singleton-per-name; carries circuit state) │
model_deployments │
moderation_audit │
prompt_cache (Redis primary; Postgres backstop for audit) │
ab_assignments (sticky-by-tenant for prompt experiments) ─────────────┘

3. Capability catalog

CREATE TABLE capabilities (
id text PRIMARY KEY, -- 'cap_<ULID>'
key text NOT NULL UNIQUE, -- 'pricing.suggest'
display_name text NOT NULL,
status text NOT NULL CHECK (status IN ('draft','active','deprecated','retired')),
default_prompt_version_id text REFERENCES prompt_versions(id),
default_model_provider text NOT NULL CHECK (default_model_provider IN ('vertex','anthropic','openai','onnx-edge')),
default_model_name text NOT NULL,
fallback_chain jsonb NOT NULL, -- ordered FallbackChain
latency_class text NOT NULL CHECK (latency_class IN ('very_low','low','medium','high')),
cost_class text NOT NULL CHECK (cost_class IN ('free','very_low','low','medium','high')),
output_schema_json jsonb NOT NULL,
hitl_config jsonb NOT NULL, -- HitlConfig
eval_suite_id text NOT NULL REFERENCES eval_suites(id),
cache_ttl_seconds int, -- null disables cache
tenant_opt_out_allowed boolean NOT NULL DEFAULT true,
version int NOT NULL DEFAULT 1,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX ix_capabilities_status ON capabilities (status) WHERE status = 'active';

Capability rows are platform-global, not per-tenant. Per-tenant overrides (e.g., disabling translation drafts for a specific tenant) live in tenant_capability_overrides:

CREATE TABLE tenant_capability_overrides (
tenant_id uuid NOT NULL,
capability_key text NOT NULL,
enabled boolean NOT NULL,
budget_overrides jsonb, -- per-tenant cap overrides
notes text,
updated_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (tenant_id, capability_key)
);
ALTER TABLE tenant_capability_overrides ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_capability_overrides_tenant_isolation ON tenant_capability_overrides
USING (tenant_id::text = current_setting('app.tenant_id', true));

4. Prompt registry

CREATE TABLE prompts (
id text PRIMARY KEY, -- 'prm_<ULID>'
domain text NOT NULL CHECK (domain IN (
'PRICING','HK','STAFF','ANOMALY','UPSELL',
'MSG','REVIEW','BOOKING','TUTOR','DESC',
'TRANSLATE','OCR','STT','VISION')),
ordinal int NOT NULL,
display_name text NOT NULL,
capability_key text NOT NULL REFERENCES capabilities(key),
owner_user_id uuid NOT NULL,
active_version_id text REFERENCES prompt_versions(id),
created_at timestamptz NOT NULL DEFAULT now(),
UNIQUE (domain, ordinal)
);

CREATE TABLE prompt_versions (
id text PRIMARY KEY, -- 'pmv_<ULID>'
prompt_id text NOT NULL REFERENCES prompts(id) ON DELETE RESTRICT,
version int NOT NULL,
canonical_code text NOT NULL UNIQUE, -- 'PRMP_PRICING_001_v3'
status text NOT NULL CHECK (status IN ('draft','active','deprecated','retired')),
system_prompt text NOT NULL,
user_template text NOT NULL,
output_schema_json jsonb NOT NULL,
default_model_provider text NOT NULL,
default_model_name text NOT NULL,
eval_suite_id text NOT NULL REFERENCES eval_suites(id),
notes text,
created_at timestamptz NOT NULL DEFAULT now(),
activated_at timestamptz,
deprecated_at timestamptz,
retired_at timestamptz,
CONSTRAINT prompt_versions_one_active UNIQUE (prompt_id, status) DEFERRABLE INITIALLY DEFERRED
-- Enforced via partial unique index, see below; the column-list form is not how Postgres uniques work.
);
DROP INDEX IF EXISTS prompt_versions_one_active;
CREATE UNIQUE INDEX prompt_versions_one_active
ON prompt_versions (prompt_id) WHERE status = 'active';

CREATE INDEX ix_prompt_versions_capability_status ON prompt_versions (default_model_name, status);

A/B sticky assignments:

CREATE TABLE ab_assignments (
tenant_id uuid NOT NULL,
capability_key text NOT NULL,
ab_bucket int NOT NULL CHECK (ab_bucket BETWEEN 0 AND 99),
assigned_to_version_id text NOT NULL REFERENCES prompt_versions(id),
assigned_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (tenant_id, capability_key)
);
ALTER TABLE ab_assignments ENABLE ROW LEVEL SECURITY;
CREATE POLICY ab_assignments_tenant_isolation ON ab_assignments
USING (tenant_id::text = current_setting('app.tenant_id', true));

5. Model catalog

CREATE TABLE models (
id text PRIMARY KEY, -- 'mdl_<ULID>'
provider text NOT NULL CHECK (provider IN ('vertex','anthropic','openai','onnx-edge')),
name text NOT NULL,
modality text NOT NULL CHECK (modality IN ('llm','embedding','vision','speech','classifier')),
context_window_tokens int,
cost_class text NOT NULL,
latency_class text NOT NULL,
status text NOT NULL CHECK (status IN ('available','deprecated','retired')),
per_token_cost_micros_in bigint,
per_token_cost_micros_out bigint,
per_call_cost_micros bigint,
notes text,
added_at timestamptz NOT NULL DEFAULT now(),
UNIQUE (provider, name)
);

CREATE TABLE model_deployments (
id text PRIMARY KEY, -- 'mdp_<ULID>'
model_id text NOT NULL REFERENCES models(id),
region text NOT NULL,
traffic_share_pct int NOT NULL CHECK (traffic_share_pct BETWEEN 0 AND 100),
status text NOT NULL CHECK (status IN ('pending','active','draining','retired')),
activated_at timestamptz,
drain_started_at timestamptz,
notes text
);
CREATE INDEX ix_model_deployments_model_status ON model_deployments (model_id, status);

CREATE TABLE providers (
id text PRIMARY KEY, -- 'prv_<ULID>'
name text NOT NULL UNIQUE CHECK (name IN ('vertex','anthropic','openai','onnx-edge')),
health text NOT NULL CHECK (health IN ('healthy','degraded','unhealthy','recovering')),
consecutive_errors int NOT NULL DEFAULT 0,
last_error_at timestamptz,
last_success_at timestamptz,
circuit_opened_at timestamptz,
probe_interval_ms int NOT NULL DEFAULT 30000,
notes text
);

6. Inference + result + provenance (audit)

These tables grow fastest. Partitioned by month with pg_partman. Older partitions move to BigQuery archive after 90 days; Postgres keeps 90 days hot.

CREATE TABLE inference_requests (
id text NOT NULL, -- 'ifr_<ULID>'
tenant_id uuid NOT NULL,
capability_key text NOT NULL,
prompt_version_id text REFERENCES prompt_versions(id),
input_hash text NOT NULL,
redacted_input_hash text NOT NULL,
input_bytes int NOT NULL,
caller_service text NOT NULL,
caller_surface text,
trace_id text NOT NULL,
request_id text NOT NULL,
region_pin text,
received_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (id, received_at)
) PARTITION BY RANGE (received_at);

CREATE TABLE inference_results (
id text NOT NULL, -- 'ifs_<ULID>'
request_id text NOT NULL,
tenant_id uuid NOT NULL,
status text NOT NULL CHECK (status IN ('completed','failed','fallback_deterministic')),
provenance_id text NOT NULL REFERENCES provenances(id),
output_json jsonb,
error_code text,
hitl_gate_id text REFERENCES hitl_gates(id),
cached boolean NOT NULL DEFAULT false,
latency_ms int NOT NULL,
completed_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (id, completed_at)
) PARTITION BY RANGE (completed_at);

CREATE TABLE provenances (
id text PRIMARY KEY, -- 'prv_p_<ULID>'
tenant_id uuid NOT NULL,
prompt_version_id text REFERENCES prompt_versions(id),
prompt_version_no int,
model_provider text NOT NULL,
model_name text NOT NULL,
model_version text,
trace_id text NOT NULL,
occurred_at timestamptz NOT NULL,
tokens_in int NOT NULL DEFAULT 0,
tokens_out int NOT NULL DEFAULT 0,
cost_micros bigint NOT NULL DEFAULT 0,
local boolean NOT NULL DEFAULT false,
cache_hit boolean NOT NULL DEFAULT false,
safety_input text NOT NULL CHECK (safety_input IN ('pass','flag_low','flag_high','block')),
safety_output text NOT NULL CHECK (safety_output IN ('pass','flag_low','flag_high','block')),
reviewed_by uuid,
reviewed_at timestamptz,
decision text CHECK (decision IN ('accepted','modified','rejected'))
);

CREATE INDEX ix_provenances_tenant_occurred ON provenances (tenant_id, occurred_at DESC);
CREATE INDEX ix_inference_requests_tenant_capability ON inference_requests (tenant_id, capability_key, received_at DESC);
CREATE INDEX ix_inference_results_tenant_provenance ON inference_results (tenant_id, provenance_id);

ALTER TABLE inference_requests ENABLE ROW LEVEL SECURITY;
ALTER TABLE inference_results ENABLE ROW LEVEL SECURITY;
ALTER TABLE provenances ENABLE ROW LEVEL SECURITY;

CREATE POLICY inference_requests_tenant_isolation ON inference_requests
USING (tenant_id::text = current_setting('app.tenant_id', true));
CREATE POLICY inference_results_tenant_isolation ON inference_results
USING (tenant_id::text = current_setting('app.tenant_id', true));
CREATE POLICY provenances_tenant_isolation ON provenances
USING (tenant_id::text = current_setting('app.tenant_id', true));

7. Budget counters

CREATE TABLE budget_counters (
id text PRIMARY KEY, -- 'bdg_<ULID>'
tenant_id uuid NOT NULL,
scope_kind text NOT NULL CHECK (scope_kind IN ('tenant_total','capability','feature')),
scope_key text, -- capability key or feature key
period_key text NOT NULL, -- 'YYYY-MM' or 'YYYY-MM-DD'
tokens_used bigint NOT NULL DEFAULT 0,
cost_micros_used bigint NOT NULL DEFAULT 0,
tokens_cap bigint NOT NULL,
cost_micros_cap bigint NOT NULL,
soft_cap_pct int NOT NULL DEFAULT 80,
hard_cap_pct int NOT NULL DEFAULT 100,
soft_cap_warned_at timestamptz,
hard_cap_tripped_at timestamptz,
resets_at timestamptz NOT NULL,
UNIQUE (tenant_id, scope_kind, scope_key, period_key)
);
ALTER TABLE budget_counters ENABLE ROW LEVEL SECURITY;
CREATE POLICY budget_counters_tenant_isolation ON budget_counters
USING (tenant_id::text = current_setting('app.tenant_id', true));
CREATE INDEX ix_budget_counters_tenant_period ON budget_counters (tenant_id, period_key);

8. HITL gates + decisions

CREATE TABLE hitl_gates (
id text PRIMARY KEY, -- 'hgt_<ULID>'
tenant_id uuid NOT NULL,
capability_key text NOT NULL,
artifact_kind text NOT NULL,
artifact_id text NOT NULL,
draft_json jsonb NOT NULL,
status text NOT NULL CHECK (status IN ('open','decided','timed_out','closed')),
opened_at timestamptz NOT NULL DEFAULT now(),
sla_deadline timestamptz NOT NULL,
reviewer_roles text[] NOT NULL,
notifications_sent int NOT NULL DEFAULT 0,
decision_id text REFERENCES hitl_decisions(id),
closed_at timestamptz,
trace_id text NOT NULL,
request_id text NOT NULL,
version int NOT NULL DEFAULT 1
);
CREATE INDEX ix_hitl_gates_tenant_status ON hitl_gates (tenant_id, status, sla_deadline);
ALTER TABLE hitl_gates ENABLE ROW LEVEL SECURITY;
CREATE POLICY hitl_gates_tenant_isolation ON hitl_gates
USING (tenant_id::text = current_setting('app.tenant_id', true));

CREATE TABLE hitl_decisions (
id text PRIMARY KEY, -- 'dec_<ULID>'
gate_id text NOT NULL UNIQUE REFERENCES hitl_gates(id),
tenant_id uuid NOT NULL,
outcome text NOT NULL CHECK (outcome IN ('accepted','modified','rejected')),
modified_json jsonb,
justification text,
reviewer_user_id uuid NOT NULL,
reviewer_role text NOT NULL,
decided_at timestamptz NOT NULL DEFAULT now(),
auto boolean NOT NULL DEFAULT false
);
ALTER TABLE hitl_decisions ENABLE ROW LEVEL SECURITY;
CREATE POLICY hitl_decisions_tenant_isolation ON hitl_decisions
USING (tenant_id::text = current_setting('app.tenant_id', true));

9. Eval suites + runs

CREATE TABLE eval_suites (
id text PRIMARY KEY, -- 'eva_<ULID>'
capability_key text NOT NULL,
name text NOT NULL,
rubric_json jsonb NOT NULL, -- declared scoring metrics
fixture_uri text NOT NULL, -- 'gs://melmastoon-ai-artifacts/eval/PRICING/001.jsonl'
example_count int NOT NULL,
version int NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE eval_runs (
id text PRIMARY KEY, -- 'evr_<ULID>'
suite_id text NOT NULL REFERENCES eval_suites(id),
prompt_version_id text NOT NULL REFERENCES prompt_versions(id),
model_provider text NOT NULL,
model_name text NOT NULL,
status text NOT NULL CHECK (status IN ('queued','running','completed','failed')),
scores_json jsonb,
comparison_json jsonb,
verdict text CHECK (verdict IN ('green','red','inconclusive')),
duration_ms int,
started_at timestamptz,
completed_at timestamptz,
triggered_by uuid,
notes text
);
CREATE INDEX ix_eval_runs_prompt_status ON eval_runs (prompt_version_id, status, completed_at DESC);

10. RAG corpora + embeddings (pgvector)

Two embedding tables — cloud (768-dim) and edge (384-dim). Both per-tenant with HNSW indexes.

CREATE TABLE rag_corpora (
id text PRIMARY KEY, -- 'rag_<ULID>'
tenant_id uuid NOT NULL,
namespace text NOT NULL CHECK (namespace IN ('policies','faq','sop','amenity','custom')),
chunk_strategy_json jsonb NOT NULL,
embedding_provider text NOT NULL CHECK (embedding_provider IN ('vertex','onnx-edge')),
embedding_model_name text NOT NULL,
embedding_dim int NOT NULL CHECK (embedding_dim IN (384, 768)),
status text NOT NULL CHECK (status IN ('provisioning','active','archived')),
chunk_count int NOT NULL DEFAULT 0,
last_reindex_at timestamptz,
created_at timestamptz NOT NULL DEFAULT now(),
UNIQUE (tenant_id, namespace)
);
ALTER TABLE rag_corpora ENABLE ROW LEVEL SECURITY;
CREATE POLICY rag_corpora_tenant_isolation ON rag_corpora
USING (tenant_id::text = current_setting('app.tenant_id', true));

CREATE TABLE rag_chunks (
corpus_id text NOT NULL REFERENCES rag_corpora(id) ON DELETE CASCADE,
chunk_id text NOT NULL, -- ULID; deterministic via sha256(content)[:26]
tenant_id uuid NOT NULL,
chunk_text text NOT NULL, -- PII-redacted; original kept in source bucket
metadata jsonb NOT NULL DEFAULT '{}'::jsonb,
source_uri text,
created_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (corpus_id, chunk_id)
);
ALTER TABLE rag_chunks ENABLE ROW LEVEL SECURITY;
CREATE POLICY rag_chunks_tenant_isolation ON rag_chunks
USING (tenant_id::text = current_setting('app.tenant_id', true));

CREATE TABLE embeddings_cloud ( -- 768-dim for text-embedding-004
corpus_id text NOT NULL,
chunk_id text NOT NULL,
tenant_id uuid NOT NULL,
vector vector(768) NOT NULL,
PRIMARY KEY (corpus_id, chunk_id),
FOREIGN KEY (corpus_id, chunk_id) REFERENCES rag_chunks(corpus_id, chunk_id) ON DELETE CASCADE
);
ALTER TABLE embeddings_cloud ENABLE ROW LEVEL SECURITY;
CREATE POLICY embeddings_cloud_tenant_isolation ON embeddings_cloud
USING (tenant_id::text = current_setting('app.tenant_id', true));

CREATE INDEX ix_embeddings_cloud_hnsw
ON embeddings_cloud USING hnsw (vector vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

CREATE INDEX ix_embeddings_cloud_tenant ON embeddings_cloud (tenant_id);

CREATE TABLE embeddings_edge ( -- 384-dim for all-MiniLM-L6-v2 (offline RAG)
corpus_id text NOT NULL,
chunk_id text NOT NULL,
tenant_id uuid NOT NULL,
vector vector(384) NOT NULL,
PRIMARY KEY (corpus_id, chunk_id),
FOREIGN KEY (corpus_id, chunk_id) REFERENCES rag_chunks(corpus_id, chunk_id) ON DELETE CASCADE
);
ALTER TABLE embeddings_edge ENABLE ROW LEVEL SECURITY;
CREATE POLICY embeddings_edge_tenant_isolation ON embeddings_edge
USING (tenant_id::text = current_setting('app.tenant_id', true));

CREATE INDEX ix_embeddings_edge_hnsw
ON embeddings_edge USING hnsw (vector vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Per-call query pattern:

SET LOCAL app.tenant_id = 'tnt_01H...';
SET LOCAL hnsw.ef_search = 40;
SELECT chunk_id, 1 - (vector <=> $1::vector) AS score
FROM embeddings_cloud
WHERE tenant_id = $2 AND corpus_id = $3
ORDER BY vector <=> $1::vector
LIMIT $4;

tenant_id = $2 is redundant given RLS, but the application layer asserts it as defense in depth.

11. Edge model manifest

CREATE TABLE edge_model_manifests (
id text PRIMARY KEY, -- 'emm_<ULID>'
version text NOT NULL UNIQUE, -- '2.4.1'
status text NOT NULL CHECK (status IN ('draft','published','superseded')),
models jsonb NOT NULL, -- EdgeModelEntry[]
signature jsonb NOT NULL, -- { kmsKeyId, algorithm, valueB64 }
supersedes_id text REFERENCES edge_model_manifests(id),
published_at timestamptz,
superseded_at timestamptz,
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX edge_model_manifests_one_published
ON edge_model_manifests (status) WHERE status = 'published';

12. Moderation audit + cache backstop

CREATE TABLE moderation_audit (
id bigserial PRIMARY KEY,
tenant_id uuid NOT NULL,
request_id text NOT NULL,
capability_key text NOT NULL,
side text NOT NULL CHECK (side IN ('input','output')),
verdict text NOT NULL,
axes jsonb NOT NULL,
action text NOT NULL CHECK (action IN ('allowed_with_warning','blocked')),
occurred_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX ix_moderation_audit_tenant_time ON moderation_audit (tenant_id, occurred_at DESC);
ALTER TABLE moderation_audit ENABLE ROW LEVEL SECURITY;
CREATE POLICY moderation_audit_tenant_isolation ON moderation_audit
USING (tenant_id::text = current_setting('app.tenant_id', true));

-- Cache lives in Redis: key = sha256:<hash>; value = compact JSON of CachedResult
-- Postgres backstop is informational only:
CREATE TABLE prompt_cache_audit (
input_hash text PRIMARY KEY,
tenant_id uuid NOT NULL,
capability_key text NOT NULL,
cached_at timestamptz NOT NULL,
ttl_seconds int NOT NULL
);
ALTER TABLE prompt_cache_audit ENABLE ROW LEVEL SECURITY;
CREATE POLICY prompt_cache_audit_tenant_isolation ON prompt_cache_audit
USING (tenant_id::text = current_setting('app.tenant_id', true));

13. Outbox + Inbox (mandatory pattern)

CREATE TABLE outbox (
id text PRIMARY KEY,
aggregate_type text NOT NULL,
aggregate_id text NOT NULL,
event_type text NOT NULL,
event_payload jsonb NOT NULL,
tenant_id uuid NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
published_at timestamptz,
attempts int NOT NULL DEFAULT 0
);
CREATE INDEX ix_outbox_unpublished ON outbox (created_at) WHERE published_at IS NULL;

CREATE TABLE inbox (
message_id text PRIMARY KEY,
topic text NOT NULL,
received_at timestamptz NOT NULL DEFAULT now(),
processed_at timestamptz,
result text
);

14. Indexes summary (top hot paths)

IndexPurpose
ix_capabilities_statusHot read of active capabilities at boot + admin list
ix_prompt_versions_capability_statusLookup current active per capability
ix_inference_requests_tenant_capabilityPer-tenant per-capability audit + dashboards
ix_inference_results_tenant_provenanceProvenance lookup
ix_provenances_tenant_occurredTenant audit timeline
ix_budget_counters_tenant_periodBudget snapshot
ix_hitl_gates_tenant_statusOpen-gates list
ix_eval_runs_prompt_statusEval history per prompt
ix_embeddings_cloud_hnsw (HNSW)k-NN search (cloud embeddings)
ix_embeddings_edge_hnsw (HNSW)k-NN search (edge embeddings, replicated subset)
ix_moderation_audit_tenant_timePer-tenant moderation forensic
ix_outbox_unpublishedOutbox relay

15. RLS posture

Every multi-tenant table has RLS enabled, the <table>_tenant_isolation policy, and an application-side assertion that tenant_id matches the session GUC before any write. CI scans new migrations and fails the build if either is missing.

16. BigQuery streaming

Every inference.completed.v1 event is mirrored to melmastoon_analytics_prod.ai_calls_fact via Pub/Sub → Dataflow → BigQuery. Schema:

CREATE TABLE melmastoon_analytics_prod.ai_calls_fact (
request_id STRING NOT NULL,
tenant_id STRING NOT NULL,
capability_key STRING NOT NULL,
model_provider STRING NOT NULL,
model_name STRING NOT NULL,
prompt_version_id STRING,
cache_hit BOOL NOT NULL,
fallback_applied BOOL NOT NULL,
fallback_reason STRING,
tokens_in INT64 NOT NULL,
tokens_out INT64 NOT NULL,
cost_micros INT64 NOT NULL,
latency_ms INT64 NOT NULL,
occurred_at TIMESTAMP NOT NULL,
region STRING,
caller_service STRING,
surface STRING
)
PARTITION BY DATE(occurred_at)
CLUSTER BY tenant_id, capability_key;

Looker Studio dashboards over this table give the per-(tenant, capability, model) cost and acceptance views (see OBSERVABILITY.md).

17. Migrations summary

Migrations live in services/ai-orchestrator-service/migrations/. Every migration must be replayable on prod, additive-only, and accompanied by a backfill script if it changes the meaning of an existing column. Initial set:

0001_extensions.sql -- pgcrypto, pg_trgm, vector
0002_create_models_providers.sql
0003_create_prompts_versions.sql
0004_create_capabilities.sql
0005_create_inference_audit.sql
0006_create_provenances.sql
0007_create_budget_counters.sql
0008_create_hitl.sql
0009_create_eval_suites_runs.sql
0010_create_rag_corpora_chunks.sql
0011_create_embeddings_cloud_hnsw.sql
0012_create_embeddings_edge_hnsw.sql
0013_create_edge_model_manifests.sql
0014_create_moderation_audit.sql
0015_create_outbox_inbox.sql
0016_create_ab_assignments.sql
0017_create_tenant_capability_overrides.sql
0018_partition_inference_audit_by_month.sql

CI checks every new migration declares tenant_id + RLS where applicable.