Skip to main content

Data Model

:::info Source Sourced from services/assessment-service/DATA_MODEL.md in the documentation repo. :::

Database: Postgres 16 Schema: assessment Tenancy: RLS on every table via app.tenant_id session var Companion: DOMAIN_MODEL.md · docs/12-data-models.md


1. Schema overview

┌──────────────────────┐
│ quiz_banks │
└─────────┬────────────┘
│ 1..*
┌─────────▼────────────┐ ┌──────────────────┐
│ questions │ │ question_options │
│ (discriminated) │◀───1..*│ (for mcq/ms) │
└──────────────────────┘ └──────────────────┘
┌──────────────────────┐
│ branching_scenarios │
└─────────┬────────────┘
│ 1..*
┌─────────▼────────────┐
│ scenario_nodes │
│ (+ choices JSONB) │
└──────────────────────┘

┌─────────────────────────┐ ┌───────────────────────┐
│ attempt_results │───1..*│ responses │
└─────────────────────────┘ └───────────────────────┘


┌─────────────────────────┐
│ response_buffer │ (pre-scoring drafts)
└─────────────────────────┘

┌─────────────────────────┐
│ ai_grading_jobs │
└─────────────────────────┘

┌─────────────────────────┐ ┌───────────────────────┐
│ outbox_events │ │ inbox_events │
└─────────────────────────┘ └───────────────────────┘

┌─────────────────────────┐
│ block_quizbank_links │ (authoring ↔ assessment linkage)
└─────────────────────────┘

2. Tenancy & RLS

Every table includes tenant_id uuid NOT NULL. RLS is ON globally:

ALTER TABLE assessment.quiz_banks ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON assessment.quiz_banks
USING (tenant_id = current_setting('app.tenant_id')::uuid);

Pool wrapper issues SET LOCAL app.tenant_id = … per request; platform-admin context uses a separately audited app.tenant_id_bypass = true only on explicitly elevated sessions with four-eyes approval.


3. Table definitions

3.1 quiz_banks

CREATE TABLE assessment.quiz_banks (
id text PRIMARY KEY, -- 'qbk_' + ULID
tenant_id uuid NOT NULL,
title jsonb NOT NULL, -- I18nString
description jsonb,
grading_rule jsonb NOT NULL, -- GradingRule
pool_config jsonb,
time_limit_sec int,
state text NOT NULL
CHECK (state IN ('draft','published','archived')),
version int NOT NULL DEFAULT 1,
ai_provenance jsonb, -- AIProvenance or NULL
created_by text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now(),
published_at timestamptz,
archived_at timestamptz,
course_version_id text,
CONSTRAINT qb_version_positive CHECK (version > 0)
);

CREATE INDEX qb_tenant_state_idx ON assessment.quiz_banks (tenant_id, state);
CREATE INDEX qb_course_version_idx ON assessment.quiz_banks (course_version_id) WHERE course_version_id IS NOT NULL;

3.2 questions (polymorphic)

CREATE TABLE assessment.questions (
id text PRIMARY KEY, -- 'qst_' + ULID
quiz_bank_id text NOT NULL REFERENCES assessment.quiz_banks(id) ON DELETE CASCADE,
tenant_id uuid NOT NULL,
kind text NOT NULL
CHECK (kind IN (
'mcq','multi_select','true_false','short_answer','numeric',
'ordering','matching','hotspot','drag_drop_classify','likert'
)),
prompt jsonb NOT NULL,
explanation jsonb,
media jsonb DEFAULT '[]'::jsonb,
tags text[] DEFAULT '{}'::text[],
weight numeric(6,3) NOT NULL CHECK (weight > 0),
difficulty text CHECK (difficulty IN ('easy','medium','hard')),
active boolean NOT NULL DEFAULT true,
-- Polymorphic body, enforced per-kind via CHECKs + JSON Schema on application side
body jsonb NOT NULL,
-- Answer keys live here, SEPARATE and encrypted:
answer_key_ciphertext bytea, -- AES-256-GCM
answer_key_kek_version int, -- which tenant key version
answer_key_nonce bytea,
ai_provenance jsonb,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX questions_bank_active_idx
ON assessment.questions (quiz_bank_id, active);
CREATE INDEX questions_tags_gin ON assessment.questions USING gin (tags);

Critical security note: body contains the presentation fields (prompt, options labels, etc.). answer_key_ciphertext contains the correctness flags / accepted answers / rubric. Read replicas may expose body freely but must never decrypt answer_key_ciphertext — the per-tenant KEK only exists in the primary cluster's KMS-bound instance role.

3.3 branching_scenarios

CREATE TABLE assessment.branching_scenarios (
id text PRIMARY KEY, -- 'scn_' + ULID
tenant_id uuid NOT NULL,
title jsonb NOT NULL,
description jsonb,
root_node_id text NOT NULL,
scoring jsonb NOT NULL,
state text NOT NULL
CHECK (state IN ('draft','published','archived')),
version int NOT NULL DEFAULT 1,
ai_provenance jsonb,
created_by text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
updated_at timestamptz NOT NULL DEFAULT now(),
published_at timestamptz,
archived_at timestamptz,
course_version_id text,
CONSTRAINT scn_version_positive CHECK (version > 0)
);
CREATE INDEX scn_tenant_state_idx ON assessment.branching_scenarios (tenant_id, state);

3.4 scenario_nodes

CREATE TABLE assessment.scenario_nodes (
scenario_id text NOT NULL REFERENCES assessment.branching_scenarios(id) ON DELETE CASCADE,
id text NOT NULL,
tenant_id uuid NOT NULL,
prompt jsonb NOT NULL,
media jsonb DEFAULT '[]'::jsonb,
choices jsonb NOT NULL, -- ScenarioChoice[]
is_terminal boolean NOT NULL,
terminal_outcome jsonb,
time_limit_sec int,
PRIMARY KEY (scenario_id, id)
);

3.5 attempt_results

CREATE TABLE assessment.attempt_results (
attempt_id text PRIMARY KEY, -- 'att_' + ULID (supplied by delivery-service)
tenant_id uuid NOT NULL,
user_id text NOT NULL,
quiz_bank_id text,
quiz_bank_version int,
scenario_id text,
course_version_id text,
raw_score numeric(10,4) NOT NULL,
max_score numeric(10,4) NOT NULL CHECK (max_score >= 0),
scaled_score numeric(6,4) NOT NULL
CHECK (scaled_score >= 0 AND scaled_score <= 1),
passed boolean NOT NULL,
duration_seconds int NOT NULL,
started_at timestamptz NOT NULL,
scored_at timestamptz NOT NULL,
scoring_mode text NOT NULL
CHECK (scoring_mode IN ('deterministic','ai_graded','mixed')),
offline_scored boolean NOT NULL DEFAULT false,
score_reconciliation jsonb,
regrade_of text REFERENCES assessment.attempt_results(attempt_id),
state text NOT NULL
CHECK (state IN ('final','pending_human_review','superseded')),
integrity_flags text[],
ai_provenance jsonb,
-- Short-answer texts are stored in 'responses' table; not denormalized here to minimize PII surface
CONSTRAINT exactly_one_target CHECK ((quiz_bank_id IS NOT NULL)::int + (scenario_id IS NOT NULL)::int = 1)
);

CREATE INDEX ar_tenant_user_idx ON assessment.attempt_results (tenant_id, user_id, scored_at DESC);
CREATE INDEX ar_tenant_quiz_idx ON assessment.attempt_results (tenant_id, quiz_bank_id);
CREATE INDEX ar_state_idx ON assessment.attempt_results (state) WHERE state = 'pending_human_review';

Immutability enforcement:

CREATE FUNCTION assessment.prevent_final_mutation()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
IF OLD.state = 'final' AND NEW.state != 'superseded' THEN
RAISE EXCEPTION 'attempt_result % is final and immutable', OLD.attempt_id;
END IF;
RETURN NEW;
END;
$$;

CREATE TRIGGER ar_immutable
BEFORE UPDATE ON assessment.attempt_results
FOR EACH ROW EXECUTE FUNCTION assessment.prevent_final_mutation();

3.6 responses

CREATE TABLE assessment.responses (
attempt_id text NOT NULL REFERENCES assessment.attempt_results(attempt_id) ON DELETE CASCADE,
question_id text, -- NULL for scenario path responses
tenant_id uuid NOT NULL,
seq int NOT NULL, -- order within attempt
kind text NOT NULL,
payload jsonb NOT NULL, -- full response including text/choice IDs
points_earned numeric(10,4) NOT NULL,
points_possible numeric(10,4) NOT NULL,
correct text NOT NULL
CHECK (correct IN ('true','false','partial','pending')),
duration_ms int NOT NULL,
answered_at timestamptz NOT NULL,
graded_by text NOT NULL
CHECK (graded_by IN ('deterministic','ai','human')),
ai_confidence numeric(4,3),
rubric_breakdown jsonb,
rationale text,
human_reviewer_id text,
human_reviewed_at timestamptz,
PRIMARY KEY (attempt_id, seq)
);

CREATE INDEX resp_question_idx ON assessment.responses (question_id, answered_at);

3.7 response_buffer

Pre-score draft responses (from POST /submit-response). Last-write-wins per (attempt_id, question_id).

CREATE TABLE assessment.response_buffer (
attempt_id text NOT NULL,
question_id text NOT NULL,
tenant_id uuid NOT NULL,
payload jsonb NOT NULL,
submitted_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (attempt_id, question_id)
);

TTL policy: rows older than 7 days and no linked attempt_results row → cron-purged.

3.8 ai_grading_jobs

CREATE TABLE assessment.ai_grading_jobs (
id text PRIMARY KEY, -- 'agj_' + ULID
tenant_id uuid NOT NULL,
attempt_id text NOT NULL,
question_id text NOT NULL,
state text NOT NULL
CHECK (state IN ('queued','in_flight','completed','failed','dlq')),
prompt_id text,
prompt_version text,
model text,
confidence numeric(4,3),
attempts int NOT NULL DEFAULT 0,
last_error text,
scheduled_at timestamptz NOT NULL DEFAULT now(),
completed_at timestamptz,
response_payload jsonb,
UNIQUE (attempt_id, question_id)
);

CREATE INDEX agj_state_idx ON assessment.ai_grading_jobs (state, scheduled_at)
WHERE state IN ('queued','in_flight');

3.9 outbox_events

CREATE TABLE assessment.outbox_events (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id uuid NOT NULL,
occurred_at timestamptz NOT NULL DEFAULT now(),
topic text NOT NULL,
partition_key text NOT NULL,
envelope jsonb NOT NULL,
published_at timestamptz,
attempts int NOT NULL DEFAULT 0,
last_error text
);
CREATE INDEX outbox_unpublished_idx
ON assessment.outbox_events (occurred_at)
WHERE published_at IS NULL;

3.10 inbox_events

CREATE TABLE assessment.inbox_events (
event_id text PRIMARY KEY,
topic text NOT NULL,
tenant_id uuid NOT NULL,
consumed_at timestamptz NOT NULL DEFAULT now(),
handler text NOT NULL,
outcome text NOT NULL
CHECK (outcome IN ('processed','skipped','errored'))
);
CREATE INDEX inbox_recent_idx ON assessment.inbox_events (consumed_at DESC);

Projection of authoring references to quiz banks. Maintained by UC-20.

CREATE TABLE assessment.block_quizbank_links (
block_ref text NOT NULL, -- e.g., 'blk_…'
quiz_bank_id text NOT NULL REFERENCES assessment.quiz_banks(id) ON DELETE CASCADE,
tenant_id uuid NOT NULL,
course_version_id text,
linked_at timestamptz NOT NULL DEFAULT now(),
PRIMARY KEY (block_ref, quiz_bank_id)
);

3.12 idempotency_keys

CREATE TABLE assessment.idempotency_keys (
key text PRIMARY KEY,
tenant_id uuid NOT NULL,
actor_id text NOT NULL,
route text NOT NULL,
request_hash text NOT NULL, -- SHA-256 of body
response jsonb,
created_at timestamptz NOT NULL DEFAULT now(),
expires_at timestamptz NOT NULL -- 24h
);
CREATE INDEX idem_expires_idx ON assessment.idempotency_keys (expires_at);

4. Storage sizing (estimates)

TableRows (yr 2, 10k tenants)Avg rowTotal
quiz_banks250 k2 kB~500 MB
questions5 M3 kB (incl. ciphertext)~15 GB
branching_scenarios50 k1 kB~50 MB
scenario_nodes1 M2 kB~2 GB
attempt_results250 M1 kB~250 GB
responses2 B600 B~1.2 TB
outbox_events (hot)30 d × 5 M/d = 150 M4 kB~600 GB (with purge to cold archive)

Cold archive to S3 (Parquet) after outbox publication + 30 days retention.


5. Indexing strategy

  • Hot read path (serve presentation): covered by qb_tenant_state_idx + per-question fetch by PK.
  • Attempt lookup by learner: ar_tenant_user_idx (tenant_id, user_id, scored_at DESC).
  • Pending human review queue: partial index ar_state_idx WHERE state = 'pending_human_review'.
  • AI grading job poller: partial index agj_state_idx.
  • Tag-based question search: questions_tags_gin for author queries.

6. Partitioning

  • attempt_results and responses partitioned by scored_at monthly; partitions older than 24 months detach and move to cold archive.
  • outbox_events partitioned by occurred_at weekly; old partitions dropped after 30 days (post-archival).

7. Backup & PITR

  • Automated base backup: daily, encrypted at rest with per-region KMS.
  • WAL archive: every 60s to geo-redundant object store; enables PITR at 60s granularity.
  • Retention: 30 days PITR; 7 years for monthly archives of attempt_results / responses (regulated class).
  • Restore drill: quarterly, scored via G8 gate in SERVICE_READINESS.md.

8. Encryption

DataAt restIn transitKey
body (question presentation)Postgres TDETLS 1.3Instance key
answer_key_ciphertextAES-256-GCM envelopeTLS 1.3Per-tenant KEK (KMS)
responses.payload (short-answer text)Postgres TDETLS 1.3Instance key
BackupsAES-256 (KMS)TLS 1.3Region KMS
OutboxPostgres TDETLS 1.3Instance key

Rotation: tenant KEKs rotate on the standard doc 13 key-rotation cadence (annual, or on incident).


9. Migrations

  • Tool: Sqitch (favored over Flyway for revert clarity). Each migration has deploy, verify, revert.
  • Rule: additive only in production; destructive changes require a 2-phase migration (expand, contract) over a release cycle.
  • CI: migration linter (e.g., squawk) blocks long locks, missing CONCURRENTLY on CREATE INDEX, etc.

10. Read replica strategy

  • One read replica per region for analytics-service and for expensive author read queries.
  • answer_key_ciphertext cannot be decrypted on replicas (KEK access role blocked on replica IAM).
  • No writes to replicas; no outbox consumption from replicas.

11. GDPR subject-request semantics

Request typeAction
ErasureUPDATE responses SET payload = jsonb_build_object('kind', kind, 'erased', true), points_earned=0… ; UPDATE attempt_results SET user_id='anon_…' ; re-emit attempt_result.scored.v1 with anonymized userId
ExportSerialize attempts + responses for this user, emit via sync-service export channel
RectificationNot applicable to learner responses

Retention override: legal-hold tag bypasses erasure until hold lifted; hold recorded in audit log.


12. References