Data Model
:::info Source
Sourced from services/assessment-service/DATA_MODEL.md in the documentation repo.
:::
Database: Postgres 16 Schema:
assessmentTenancy: RLS on every table viaapp.tenant_idsession var Companion: DOMAIN_MODEL.md · docs/12-data-models.md
1. Schema overview
┌──────────────────────┐
│ quiz_banks │
└─────────┬────────────┘
│ 1..*
┌─────────▼────────────┐ ┌──────────────────┐
│ questions │ │ question_options │
│ (discriminated) │◀───1..*│ (for mcq/ms) │
└──────────────────────┘ └──────────────────┘
┌──────────────────────┐
│ branching_scenarios │
└─────────┬────────────┘
│ 1..*
┌─────────▼────────────┐
│ scenario_nodes │
│ (+ choices JSONB) │
└──────────────────────┘
┌─────────────────────────┐ ┌───────────────────────┐
│ attempt_results │───1..*│ responses │
└─────────────────────────┘ └───────────────────────┘
│
▼
┌─────────────────────────┐
│ response_buffer │ (pre-scoring drafts)
└─────────────────────────┘
┌─────────────────────────┐
│ ai_grading_jobs │
└─────────────────────────┘
┌─────────────────────────┐ ┌───────────────────────┐
│ outbox_events │ │ inbox_events │
└─────────────────────────┘ └───────────────────────┘
┌─────────────────────────┐
│ block_quizbank_links │ (authoring ↔ assessment linkage)
└─────────────────────────┘
2. Tenancy & RLS
Every table includes tenant_id uuid NOT NULL. RLS is ON globally:
ALTER TABLE assessment.quiz_banks ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON assessment.quiz_banks
USING (tenant_id = current_setting('app.tenant_id')::uuid);
Pool wrapper issues SET LOCAL app.tenant_id = … per request; platform-admin context uses a separately audited app.tenant_id_bypass = true only on explicitly elevated sessions with four-eyes approval.
3. Table definitions
3.1 quiz_banks
CREATE TABLE assessment.quiz_banks (
id text PRIMARY KEY, -- 'qbk_' + ULID
tenant_id uuid NOT NULL,
title jsonb NOT NULL, -- I18nString
description jsonb,
grading_rule jsonb NOT NULL, -- GradingRule
pool_config jsonb,
time_limit_sec int,
state text NOT NULL
CHECK (state IN ('draft','published','archived')),
version int NOT NULL DEFAULT 1,
ai_provenance jsonb, -- AIProvenance or NULL
created_by text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now(),
published_at timestamptz,
archived_at timestamptz,
course_version_id text,
CONSTRAINT qb_version_positive CHECK (version > 0)
);
CREATE INDEX qb_tenant_state_idx ON assessment.quiz_banks (tenant_id, state);
CREATE INDEX qb_course_version_idx ON assessment.quiz_banks (course_version_id) WHERE course_version_id IS NOT NULL;
3.2 questions (polymorphic)
CREATE TABLE assessment.questions (
id text PRIMARY KEY, -- 'qst_' + ULID
quiz_bank_id text NOT NULL REFERENCES assessment.quiz_banks(id) ON DELETE CASCADE,
tenant_id uuid NOT NULL,
kind text NOT NULL
CHECK (kind IN (
'mcq','multi_select','true_false','short_answer','numeric',
'ordering','matching','hotspot','drag_drop_classify','likert'
)),
prompt jsonb NOT NULL,
explanation jsonb,
media jsonb DEFAULT '[]'::jsonb,
tags text[] DEFAULT '{}'::text[],
weight numeric(6,3) NOT NULL CHECK (weight > 0),
difficulty text CHECK (difficulty IN ('easy','medium','hard')),
active boolean NOT NULL DEFAULT true,
-- Polymorphic body, enforced per-kind via CHECKs + JSON Schema on application side
body jsonb NOT NULL,
-- Answer keys live here, SEPARATE and encrypted:
answer_key_ciphertext bytea, -- AES-256-GCM
answer_key_kek_version int, -- which tenant key version
answer_key_nonce bytea,
ai_provenance jsonb,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX questions_bank_active_idx
ON assessment.questions (quiz_bank_id, active);
CREATE INDEX questions_tags_gin ON assessment.questions USING gin (tags);
Critical security note: body contains the presentation fields (prompt, options labels, etc.). answer_key_ciphertext contains the correctness flags / accepted answers / rubric. Read replicas may expose body freely but must never decrypt answer_key_ciphertext — the per-tenant KEK only exists in the primary cluster's KMS-bound instance role.
3.3 branching_scenarios
CREATE TABLE assessment.branching_scenarios (
id text PRIMARY KEY, -- 'scn_' + ULID
tenant_id uuid NOT NULL,
title jsonb NOT NULL,
description jsonb,
root_node_id text NOT NULL,
scoring jsonb NOT NULL,
state text NOT NULL
CHECK (state IN ('draft','published','archived')),
version int NOT NULL DEFAULT 1,
ai_provenance jsonb,
created_by text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now(),
published_at timestamptz,
archived_at timestamptz,
course_version_id text,
CONSTRAINT scn_version_positive CHECK (version > 0)
);
CREATE INDEX scn_tenant_state_idx ON assessment.branching_scenarios (tenant_id, state);
3.4 scenario_nodes
CREATE TABLE assessment.scenario_nodes (
scenario_id text NOT NULL REFERENCES assessment.branching_scenarios(id) ON DELETE CASCADE,
id text NOT NULL,
tenant_id uuid NOT NULL,
prompt jsonb NOT NULL,
media jsonb DEFAULT '[]'::jsonb,
choices jsonb NOT NULL, -- ScenarioChoice[]
is_terminal boolean NOT NULL,
terminal_outcome jsonb,
time_limit_sec int,
PRIMARY KEY (scenario_id, id)
);
3.5 attempt_results
CREATE TABLE assessment.attempt_results (
attempt_id text PRIMARY KEY, -- 'att_' + ULID (supplied by delivery-service)
tenant_id uuid NOT NULL,
user_id text NOT NULL,
quiz_bank_id text,
quiz_bank_version int,
scenario_id text,
course_version_id text,
raw_score numeric(10,4) NOT NULL,
max_score numeric(10,4) NOT NULL CHECK (max_score >= 0),
scaled_score numeric(6,4) NOT NULL
CHECK (scaled_score >= 0 AND scaled_score <= 1),
passed boolean NOT NULL,
duration_seconds int NOT NULL,
started_at timestamptz NOT NULL,
scored_at timestamptz NOT NULL,
scoring_mode text NOT NULL
CHECK (scoring_mode IN ('deterministic','ai_graded','mixed')),
offline_scored boolean NOT NULL DEFAULT false,
score_reconciliation jsonb,
regrade_of text REFERENCES assessment.attempt_results(attempt_id),
state text NOT NULL
CHECK (state IN ('final','pending_human_review','superseded')),
integrity_flags text[],
ai_provenance jsonb,
-- Short-answer texts are stored in 'responses' table; not denormalized here to minimize PII surface
CONSTRAINT exactly_one_target CHECK ((quiz_bank_id IS NOT NULL)::int + (scenario_id IS NOT NULL)::int = 1)
);
CREATE INDEX ar_tenant_user_idx ON assessment.attempt_results (tenant_id, user_id, scored_at DESC);
CREATE INDEX ar_tenant_quiz_idx ON assessment.attempt_results (tenant_id, quiz_bank_id);
CREATE INDEX ar_state_idx ON assessment.attempt_results (state) WHERE state = 'pending_human_review';
Immutability enforcement:
CREATE FUNCTION assessment.prevent_final_mutation()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
IF OLD.state = 'final' AND NEW.state != 'superseded' THEN
RAISE EXCEPTION 'attempt_result % is final and immutable', OLD.attempt_id;
END IF;
RETURN NEW;
END;
$$;
CREATE TRIGGER ar_immutable
BEFORE UPDATE ON assessment.attempt_results
FOR EACH ROW EXECUTE FUNCTION assessment.prevent_final_mutation();
3.6 responses
CREATE TABLE assessment.responses (
attempt_id text NOT NULL REFERENCES assessment.attempt_results(attempt_id) ON DELETE CASCADE,
question_id text, -- NULL for scenario path responses
tenant_id uuid NOT NULL,
seq int NOT NULL, -- order within attempt
kind text NOT NULL,
payload jsonb NOT NULL, -- full response including text/choice IDs
points_earned numeric(10,4) NOT NULL,
points_possible numeric(10,4) NOT NULL,
correct text NOT NULL
CHECK (correct IN ('true','false','partial','pending')),
duration_ms int NOT NULL,
answered_at timestamptz NOT NULL,
graded_by text NOT NULL
CHECK (graded_by IN ('deterministic','ai','human')),
ai_confidence numeric(4,3),
rubric_breakdown jsonb,
rationale text,
human_reviewer_id text,
human_reviewed_at timestamptz,
PRIMARY KEY (attempt_id, seq)
);
CREATE INDEX resp_question_idx ON assessment.responses (question_id, answered_at);
3.7 response_buffer
Pre-score draft responses (from POST /submit-response). Last-write-wins per (attempt_id, question_id).
CREATE TABLE assessment.response_buffer (
attempt_id text NOT NULL,
question_id text NOT NULL,
tenant_id uuid NOT NULL,
payload jsonb NOT NULL,
submitted_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (attempt_id, question_id)
);
TTL policy: rows older than 7 days and no linked attempt_results row → cron-purged.
3.8 ai_grading_jobs
CREATE TABLE assessment.ai_grading_jobs (
id text PRIMARY KEY, -- 'agj_' + ULID
tenant_id uuid NOT NULL,
attempt_id text NOT NULL,
question_id text NOT NULL,
state text NOT NULL
CHECK (state IN ('queued','in_flight','completed','failed','dlq')),
prompt_id text,
prompt_version text,
model text,
confidence numeric(4,3),
attempts int NOT NULL DEFAULT 0,
last_error text,
scheduled_at timestamptz NOT NULL DEFAULT now(),
completed_at timestamptz,
response_payload jsonb,
UNIQUE (attempt_id, question_id)
);
CREATE INDEX agj_state_idx ON assessment.ai_grading_jobs (state, scheduled_at)
WHERE state IN ('queued','in_flight');
3.9 outbox_events
CREATE TABLE assessment.outbox_events (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id uuid NOT NULL,
occurred_at timestamptz NOT NULL DEFAULT now(),
topic text NOT NULL,
partition_key text NOT NULL,
envelope jsonb NOT NULL,
published_at timestamptz,
attempts int NOT NULL DEFAULT 0,
last_error text
);
CREATE INDEX outbox_unpublished_idx
ON assessment.outbox_events (occurred_at)
WHERE published_at IS NULL;
3.10 inbox_events
CREATE TABLE assessment.inbox_events (
event_id text PRIMARY KEY,
topic text NOT NULL,
tenant_id uuid NOT NULL,
consumed_at timestamptz NOT NULL DEFAULT now(),
handler text NOT NULL,
outcome text NOT NULL
CHECK (outcome IN ('processed','skipped','errored'))
);
CREATE INDEX inbox_recent_idx ON assessment.inbox_events (consumed_at DESC);
3.11 block_quizbank_links
Projection of authoring references to quiz banks. Maintained by UC-20.
CREATE TABLE assessment.block_quizbank_links (
block_ref text NOT NULL, -- e.g., 'blk_…'
quiz_bank_id text NOT NULL REFERENCES assessment.quiz_banks(id) ON DELETE CASCADE,
tenant_id uuid NOT NULL,
course_version_id text,
linked_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (block_ref, quiz_bank_id)
);
3.12 idempotency_keys
CREATE TABLE assessment.idempotency_keys (
key text PRIMARY KEY,
tenant_id uuid NOT NULL,
actor_id text NOT NULL,
route text NOT NULL,
request_hash text NOT NULL, -- SHA-256 of body
response jsonb,
created_at timestamptz NOT NULL DEFAULT now(),
expires_at timestamptz NOT NULL -- 24h
);
CREATE INDEX idem_expires_idx ON assessment.idempotency_keys (expires_at);
4. Storage sizing (estimates)
| Table | Rows (yr 2, 10k tenants) | Avg row | Total |
|---|---|---|---|
quiz_banks | 250 k | 2 kB | ~500 MB |
questions | 5 M | 3 kB (incl. ciphertext) | ~15 GB |
branching_scenarios | 50 k | 1 kB | ~50 MB |
scenario_nodes | 1 M | 2 kB | ~2 GB |
attempt_results | 250 M | 1 kB | ~250 GB |
responses | 2 B | 600 B | ~1.2 TB |
outbox_events (hot) | 30 d × 5 M/d = 150 M | 4 kB | ~600 GB (with purge to cold archive) |
Cold archive to S3 (Parquet) after outbox publication + 30 days retention.
5. Indexing strategy
- Hot read path (serve presentation): covered by
qb_tenant_state_idx+ per-question fetch by PK. - Attempt lookup by learner:
ar_tenant_user_idx (tenant_id, user_id, scored_at DESC). - Pending human review queue: partial index
ar_state_idx WHERE state = 'pending_human_review'. - AI grading job poller: partial index
agj_state_idx. - Tag-based question search:
questions_tags_ginfor author queries.
6. Partitioning
attempt_resultsandresponsespartitioned byscored_atmonthly; partitions older than 24 months detach and move to cold archive.outbox_eventspartitioned byoccurred_atweekly; old partitions dropped after 30 days (post-archival).
7. Backup & PITR
- Automated base backup: daily, encrypted at rest with per-region KMS.
- WAL archive: every 60s to geo-redundant object store; enables PITR at 60s granularity.
- Retention: 30 days PITR; 7 years for monthly archives of
attempt_results/responses(regulated class). - Restore drill: quarterly, scored via G8 gate in SERVICE_READINESS.md.
8. Encryption
| Data | At rest | In transit | Key |
|---|---|---|---|
body (question presentation) | Postgres TDE | TLS 1.3 | Instance key |
answer_key_ciphertext | AES-256-GCM envelope | TLS 1.3 | Per-tenant KEK (KMS) |
responses.payload (short-answer text) | Postgres TDE | TLS 1.3 | Instance key |
| Backups | AES-256 (KMS) | TLS 1.3 | Region KMS |
| Outbox | Postgres TDE | TLS 1.3 | Instance key |
Rotation: tenant KEKs rotate on the standard doc 13 key-rotation cadence (annual, or on incident).
9. Migrations
- Tool: Sqitch (favored over Flyway for revert clarity). Each migration has
deploy,verify,revert. - Rule: additive only in production; destructive changes require a 2-phase migration (expand, contract) over a release cycle.
- CI: migration linter (e.g.,
squawk) blocks long locks, missingCONCURRENTLYon CREATE INDEX, etc.
10. Read replica strategy
- One read replica per region for analytics-service and for expensive author read queries.
answer_key_ciphertextcannot be decrypted on replicas (KEK access role blocked on replica IAM).- No writes to replicas; no outbox consumption from replicas.
11. GDPR subject-request semantics
| Request type | Action |
|---|---|
| Erasure | UPDATE responses SET payload = jsonb_build_object('kind', kind, 'erased', true), points_earned=0… ; UPDATE attempt_results SET user_id='anon_…' ; re-emit attempt_result.scored.v1 with anonymized userId |
| Export | Serialize attempts + responses for this user, emit via sync-service export channel |
| Rectification | Not applicable to learner responses |
Retention override: legal-hold tag bypasses erasure until hold lifted; hold recorded in audit log.
12. References
- Platform data-model primitives: docs/12-data-models.md
- Tenancy rules: docs/13-security-compliance-tenancy.md §4
- Domain aggregates: DOMAIN_MODEL.md