Audit Service — User Stories
Service: audit-service Story prefix: AUDIT-US Last updated: 2026-04-18
Stories
AUDIT-US-001 — Ingest platform events via NATS wildcard consumer
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Wildcard NATS consumer ingests all platform events into audit_entries |
| Epic link | AUDIT-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 8 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service |
| FR references | FR-AUDIT-001 |
| Legacy FR refs | PLAT-AUDIT-001 |
| Dependencies | — |
User story: As the platform compliance system, when any service emits a domain event to NATS, I want it automatically ingested into the tamper-evident audit store so that no compliance-relevant action is missing from the audit trail.
Acceptance criteria (Gherkin):
- Given a well-formed CloudEvent on any subscribed subject, when the consumer receives it, then an `AuditEntry` row is inserted within 200 ms P95 and the message is ACK'd.
- Given the DB is unavailable, when a message arrives, then the consumer NAK's the message; it is redelivered by NATS JetStream on DB recovery.
- Given `source_event_id` already exists, when the same event is redelivered, then the INSERT is idempotently skipped and the message is ACK'd (no duplicate row).
Technical notes:
- Wildcard consumer subjects: `com.ghasi-ehr.>`, `patient_chart.>`, `ai_gateway.>`, `identity.>`, `tenant.>`.
- NATS `AckPolicy=Explicit`; NAK on DB failure; re-delivery via JetStream.
- Dedup: `INSERT ... ON CONFLICT (source_event_id) DO NOTHING`.
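
A minimal sketch of the ACK/NAK rule described in the notes above, assuming a synchronous in-memory store in place of Postgres (a real handler would be async). `AuditStore`, `InMemoryStore`, and `handleEvent` are hypothetical names used only for illustration; they are not taken from the service code or from the NATS client API.

```typescript
// Hypothetical event shape; only `id` matters for deduplication here.
type CloudEvent = { id: string; type: string; source: string; time: string };

interface AuditStore {
  // Mirrors INSERT ... ON CONFLICT (source_event_id) DO NOTHING:
  // returns true when a row was inserted, false when the id already existed.
  // Throws when the database is unavailable.
  insertIfAbsent(sourceEventId: string, event: CloudEvent): boolean;
}

type AckDecision = "ack" | "nak";

function handleEvent(store: AuditStore, event: CloudEvent): AckDecision {
  try {
    store.insertIfAbsent(event.id, event);
    // A fresh insert and a skipped duplicate are both acknowledged.
    return "ack";
  } catch {
    // DB failure: NAK so that JetStream redelivers the message later.
    return "nak";
  }
}

// In-memory stand-in for the audit_entries table, for illustration only.
class InMemoryStore implements AuditStore {
  readonly rows = new Map<string, CloudEvent>();
  available = true;

  insertIfAbsent(sourceEventId: string, event: CloudEvent): boolean {
    if (!this.available) throw new Error("database unavailable");
    if (this.rows.has(sourceEventId)) return false;
    this.rows.set(sourceEventId, event);
    return true;
  }
}
```

Returning the decision instead of calling the message's ack/nak methods directly keeps the rule testable without a running NATS server.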
Definition of Done:
- `ingestion.spec.ts` integration test passes.
- `dedup.spec.ts` passing.
- `audit_events_ingested_total` metric publishing.
AUDIT-US-002 — Compute chain hash on every audit entry
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Every AuditEntry linked to previous by SHA-256 chain hash |
| Epic link | AUDIT-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service |
| FR references | FR-AUDIT-002 |
| Legacy FR refs | PLAT-AUDIT-002 |
| Dependencies | AUDIT-US-001 |
User story: As a compliance officer, when I need to verify the audit trail has not been tampered with, I want each entry to contain a hash of the previous entry so that any modification to a stored record invalidates the chain.
Acceptance criteria (Gherkin):
- Given a new audit entry, when it is inserted, then `chain_hash = SHA-256(prev_id:sourceEventId:tenantId:occurredAt:resourceId)`.
- Given a stored entry's `occurred_at` is modified externally, when the verification job runs, then the chain-hash mismatch is detected and `audit_chain_integrity_failures_total` increments.
- Given the first entry in the store, when it is inserted, then `prev_id` is `"GENESIS"` for the initial hash computation.
Technical notes:
- `ChainService` is a pure domain function; fully unit-testable.
- Chain computed within the ingestion transaction before INSERT.
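
A sketch of the hash computation as a pure function, using Node's built-in `crypto` module. The field order and the `"GENESIS"` sentinel follow the acceptance criteria; the `ChainInput` shape, the hex output encoding, and the function name `computeChainHash` are assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape; field names follow the acceptance criteria.
interface ChainInput {
  prevId: string | null; // null for the first entry in the store
  sourceEventId: string;
  tenantId: string;
  occurredAt: string; // assumed to be hashed exactly as stored (ISO-8601 text)
  resourceId: string;
}

const GENESIS = "GENESIS";

function computeChainHash(input: ChainInput): string {
  // Colon-joined preimage: prev_id:sourceEventId:tenantId:occurredAt:resourceId
  const preimage = [
    input.prevId ?? GENESIS,
    input.sourceEventId,
    input.tenantId,
    input.occurredAt,
    input.resourceId,
  ].join(":");
  return createHash("sha256").update(preimage, "utf8").digest("hex");
}
```

Because the function depends only on its arguments, changing any hashed field (for example `occurredAt`) yields a different digest, which is the property the tamper-detection test relies on.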
Definition of Done:
- `chain-integrity.spec.ts` must pass.
- Tamper detection unit test with manually modified hash.
AUDIT-US-003 — Enforce INSERT-only DB role for audit_entries
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | audit_app role cannot UPDATE or DELETE audit_entries |
| Epic link | AUDIT-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 2 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service, PostgreSQL |
| FR references | FR-AUDIT-003 |
| Legacy FR refs | PLAT-AUDIT-003 |
| Dependencies | AUDIT-US-001 |
User story: As the DBA, when I enforce immutability at the database engine level, I want the audit_app role to have INSERT-only permission on audit_entries so that no application code path — intentional or compromised — can modify or delete an audit record.
Acceptance criteria (Gherkin):
- Given the `audit_app` role connected to Postgres, when an UPDATE is attempted on `audit_entries`, then the DB returns `ERROR: permission denied`.
- Given the `audit_app` role, when a DELETE is attempted, then `ERROR: permission denied`.
- Given a migration that attempts to add UPDATE permission to `audit_app`, when CI runs the migration, then the CI security scan flags the migration for review.
Technical notes:
- `docker/init.sql` in dev: `REVOKE UPDATE, DELETE ON audit_entries FROM audit_app`.
- Production: same SQL applied via Drizzle migration.
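
One way the CI security scan from the third acceptance criterion could work is a text check over migration SQL. The sketch below is an assumption about that scan, not its actual implementation: `grantsForbiddenPrivilege` is a hypothetical helper, it does simple pattern matching instead of SQL parsing (comments and dollar-quoted bodies are not handled), and treating `TRUNCATE` and `ALL` as forbidden goes beyond what the story states.

```typescript
// Privileges that would break the INSERT-only guarantee (ALL implies UPDATE and DELETE).
const FORBIDDEN_PRIVILEGES = /\b(UPDATE|DELETE|TRUNCATE|ALL)\b/;

// Returns true when any statement grants a forbidden privilege
// on audit_entries to audit_app.
function grantsForbiddenPrivilege(sql: string): boolean {
  return sql.split(";").some((statement) => {
    const normalised = statement.replace(/\s+/g, " ").trim().toUpperCase();
    const match = /^GRANT (.+?) ON (?:TABLE )?(.+?) TO (.+)$/.exec(normalised);
    if (match === null) return false;

    const privileges = match[1] ?? "";
    const tables = (match[2] ?? "")
      .split(",")
      .map((t) => t.trim().replace(/"/g, ""));
    // Keep only the role name, dropping trailing clauses such as WITH GRANT OPTION.
    const roles = (match[3] ?? "")
      .split(",")
      .map((r) => r.trim().replace(/"/g, "").split(" ")[0]);

    const hitsTable = tables.some(
      (t) => t === "AUDIT_ENTRIES" || t.endsWith(".AUDIT_ENTRIES"),
    );
    return hitsTable && roles.includes("AUDIT_APP") && FORBIDDEN_PRIVILEGES.test(privileges);
  });
}
```

A check like this only flags a migration for review; the database-level `REVOKE` remains the actual enforcement.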
Definition of Done:
- `db-role-permissions.spec.ts` must pass (hard block in CI).
- DB init SQL reviewed by DBA.
AUDIT-US-004 — Normalise and store AI gateway events
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | ai_gateway.* events normalised to AuditEntry with AI event taxonomy |
| Epic link | AUDIT-EPIC-01 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service, ai-gateway-service |
| FR references | FR-AUDIT-001 |
| Legacy FR refs | — |
| Dependencies | AUDIT-US-001, cross-service: AIGW-US-018 |
User story: As a compliance officer, when investigating AI-assisted clinical activity, I want all AI gateway events (assist requested, completed, decision accepted, moderation flagged) to appear in the audit trail so that I can produce a full accounting of AI usage for any regulatory inquiry.
Acceptance criteria (Gherkin):
- Given `ai_gateway.assist.completed.v1` is published, when ingested, then `AuditEntry { eventType: AI_ASSIST_COMPLETED, resourceId: decisionId }` is stored with `provenanceId` in `metadata`.
- Given `ai_gateway.decision.accepted.v1`, when ingested, then `AuditEntry { eventType: AI_DECISION_ACCEPTED }` stored.
- Given any `ai_gateway.*` event, when stored, then `metadata` does NOT contain raw prompt text.
Technical notes:
- Normalisation mapping: `ai_gateway.*` → `AuditEventType.AI_*` taxonomy entries.
- `metadata` field vetted against PHI checklist during normalisation.
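
A sketch of the normalisation step, assuming metadata is filtered through an allow-list so that raw prompt text can never be copied through. Only the two subject mappings named in the acceptance criteria are included. The allow-list contents (apart from `provenanceId`), the `AiGatewayEvent` shape, and the use of `decisionId` as `resourceId` for every AI event are assumptions.

```typescript
type AuditEventType = "AI_ASSIST_COMPLETED" | "AI_DECISION_ACCEPTED";

// Subject-to-taxonomy mapping; further ai_gateway.* subjects would be added here.
const AI_EVENT_TYPES = new Map<string, AuditEventType>([
  ["ai_gateway.assist.completed.v1", "AI_ASSIST_COMPLETED"],
  ["ai_gateway.decision.accepted.v1", "AI_DECISION_ACCEPTED"],
]);

// Hypothetical allow-list standing in for the PHI checklist.
const ALLOWED_METADATA_KEYS = new Set(["provenanceId", "modelId", "latencyMs"]);

interface AiGatewayEvent {
  subject: string;
  decisionId: string;
  data: Record<string, unknown>;
}

interface NormalisedEntry {
  eventType: AuditEventType;
  resourceId: string;
  metadata: Record<string, unknown>;
}

function normaliseAiEvent(event: AiGatewayEvent): NormalisedEntry {
  const eventType = AI_EVENT_TYPES.get(event.subject);
  if (eventType === undefined) {
    throw new Error(`Unmapped ai_gateway subject: ${event.subject}`);
  }
  // Copy only allow-listed keys; anything else (e.g. prompt text) is dropped.
  const metadata: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(event.data)) {
    if (ALLOWED_METADATA_KEYS.has(key)) metadata[key] = value;
  }
  return { eventType, resourceId: event.decisionId, metadata };
}
```

An allow-list fails closed: a new field added by the gateway stays out of the audit store until someone reviews it against the PHI checklist.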
Definition of Done:
- Schema conformance test for all 5 `ai_gateway.*` event types.
- PHI absence test in metadata.
AUDIT-US-005 — Query audit entries with filters
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Compliance officer queries entries by actor, type, resource, date range |
| Epic link | AUDIT-EPIC-02 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:audit-service, type:api, slice:S0 |
| Components | audit-service |
| FR references | FR-AUDIT-004 |
| Legacy FR refs | PLAT-AUDIT-004 |
| Dependencies | AUDIT-US-001, cross-service: IDENT-US-001 |
User story: As a compliance officer, when I investigate a potential data breach or access violation, I want to query the audit trail by actor, event type, resource, and date range so that I can identify all relevant access events efficiently.
Acceptance criteria (Gherkin):
- Given I am a Tenant Admin, when I query `GET /api/v1/audit/entries?actorId=usr_X`, then only entries from my tenant are returned.
- Given a date range > 90 days, when I submit the query, then `400 AUD_DATE_RANGE_TOO_WIDE` with guidance to use async export.
- Given I am a Super Admin, when I query with `tenantId` param, then entries from that specific tenant are returned.
Technical notes:
- RLS ensures Tenant Admin can only see own tenant regardless of query params.
- Cursor pagination on `recorded_at DESC`.
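
A sketch of the two query-side rules above: the 90-day range limit and an opaque cursor for `recorded_at DESC` ordering. The function names, the error body layout, and the `id` tie-breaker inside the cursor are assumptions; only the status code, error code, and ordering column come from this story.

```typescript
import { Buffer } from "node:buffer";

const MAX_RANGE_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type RangeCheck =
  | { ok: true }
  | { ok: false; status: 400; code: "AUD_DATE_RANGE_TOO_WIDE"; message: string };

function checkDateRange(dateFrom: Date, dateTo: Date): RangeCheck {
  const days = (dateTo.getTime() - dateFrom.getTime()) / MS_PER_DAY;
  if (days > MAX_RANGE_DAYS) {
    return {
      ok: false,
      status: 400,
      code: "AUD_DATE_RANGE_TOO_WIDE",
      message: "Date range exceeds 90 days; use the async export instead.",
    };
  }
  return { ok: true };
}

// Cursor points at the last row of the previous page. The id is a
// hypothetical tie-breaker for rows sharing the same recorded_at.
interface Cursor {
  recordedAt: string;
  id: string;
}

function encodeCursor(cursor: Cursor): string {
  return Buffer.from(JSON.stringify(cursor), "utf8").toString("base64url");
}

function decodeCursor(raw: string): Cursor {
  return JSON.parse(Buffer.from(raw, "base64url").toString("utf8")) as Cursor;
}
```

The next page would then select rows strictly older than the cursor position, so inserts arriving between requests do not shift page boundaries.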
Definition of Done:
- `query-api.spec.ts` integration test; tenant-isolation verified.
AUDIT-US-006 — Get single audit entry by ID
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Retrieve a single audit entry by ID for investigation |
| Epic link | AUDIT-EPIC-02 |
| Status | To Do |
| Priority | Must |
| Story points | 1 |
| Labels | service:audit-service, type:api, slice:S0 |
| Components | audit-service |
| FR references | FR-AUDIT-004 |
| Legacy FR refs | PLAT-AUDIT-004 |
| Dependencies | AUDIT-US-005 |
User story: As a compliance officer, when I have a specific audit entry ID from an incident report, I want to retrieve the full entry details so that I can review the exact metadata, chain hash, and context of that event.
Acceptance criteria (Gherkin):
- Given a valid entry ID belonging to my tenant, when I call `GET /api/v1/audit/entries/:id`, then the full entry is returned including `chainHash` and `metadata`.
- Given an entry ID from a different tenant, when I call the endpoint, then `404` is returned (no cross-tenant leak).
Definition of Done:
- Cross-tenant returns 404 (not 403 — do not leak existence of foreign entries).
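
The 404-not-403 rule can be sketched as a lookup that treats a foreign-tenant entry exactly like a missing one. `StoredEntry`, `LookupResult`, and `getEntry` are hypothetical names, and the in-memory map stands in for the database query.

```typescript
interface StoredEntry {
  id: string;
  tenantId: string;
  chainHash: string;
  metadata: Record<string, unknown>;
}

type LookupResult = { status: 200; body: StoredEntry } | { status: 404 };

function getEntry(
  entries: ReadonlyMap<string, StoredEntry>,
  callerTenantId: string,
  id: string,
): LookupResult {
  const entry = entries.get(id);
  // Missing and foreign-tenant entries are indistinguishable to the caller.
  if (entry === undefined || entry.tenantId !== callerTenantId) {
    return { status: 404 };
  }
  return { status: 200, body: entry };
}
```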
AUDIT-US-007 — Accounting of disclosures for patient
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Patient-facing endpoint returning who accessed their record and when |
| Epic link | AUDIT-EPIC-02 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:audit-service, type:api, slice:S1 |
| Components | audit-service, patient-portal-service |
| FR references | FR-AUDIT-005 |
| Legacy FR refs | PLAT-AUDIT-005 |
| Dependencies | AUDIT-US-001, cross-service: PORTAL-US-001 |
User story: As a patient, when I want to know who has accessed my health record, I want to view the accounting-of-disclosures so that I can exercise my rights under the HIPAA analogue and GDPR transparency obligation.
Acceptance criteria (Gherkin):
- Given I am authenticated as a patient, when I call `GET /api/v1/audit/disclosures?patientId=pat_X`, then I receive entries where `resourceId=pat_X` and `action=READ`.
- Given I request disclosures for a different patient, when the request is made, then `403 FORBIDDEN`.
- Given no access events exist for the patient, when disclosure is queried, then `{ data: [], total: 0 }`.
Definition of Done:
- Patient scope enforced by JWT `sub` claim. Integration test: patient sees only own disclosures.
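
A sketch of the patient-scope check, assuming the JWT `sub` claim carries the patient ID directly (the claim-to-patient mapping is not specified here). `JwtClaims`, `AccessEntry`, and `listDisclosures` are hypothetical names; token verification is assumed to have happened earlier.

```typescript
interface JwtClaims {
  sub: string; // assumed to equal the patient id, e.g. "pat_X"
}

interface AccessEntry {
  resourceId: string;
  action: string;
  actorId: string;
  occurredAt: string;
}

type DisclosureResponse =
  | { status: 200; body: { data: AccessEntry[]; total: number } }
  | { status: 403; body: { error: "FORBIDDEN" } };

function listDisclosures(
  claims: JwtClaims,
  patientId: string,
  entries: readonly AccessEntry[],
): DisclosureResponse {
  // A patient may only request their own accounting of disclosures.
  if (claims.sub !== patientId) {
    return { status: 403, body: { error: "FORBIDDEN" } };
  }
  const data = entries.filter(
    (e) => e.resourceId === patientId && e.action === "READ",
  );
  return { status: 200, body: { data, total: data.length } };
}
```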
AUDIT-US-008 — Request async audit export
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Super Admin requests async NDJSON or CSV export of audit entries |
| Epic link | AUDIT-EPIC-03 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:audit-service, type:backend, slice:S1 |
| Components | audit-service, object-storage |
| FR references | FR-AUDIT-006 |
| Legacy FR refs | PLAT-AUDIT-006 |
| Dependencies | AUDIT-US-005 |
User story: As a Super Admin responding to an MoPH regulatory request, when I need to export a large date range of audit entries, I want to request an async export so that I receive a downloadable file without timing out or impacting live query performance.
Acceptance criteria (Gherkin):
- Given I am a Super Admin, when I POST `/api/v1/audit/exports` with format and filters, then `AuditExport` is created with `status=queued` and `202 Accepted` returned.
- Given the export job completes, when I poll `GET /api/v1/audit/exports/:id`, then `status=completed` and `fileUrl` (signed URL) is present.
- Given the export itself, when submitted, then an `AuditEntry { eventType: BULK_EXPORT }` is created (meta-audit).
Technical notes:
- `AuditExport` state machine: `queued → processing → completed / failed`.
- `fileUrl` is a signed object-storage URL with 1-hour TTL.
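
The state machine above can be sketched as a transition table. The `AuditExport` fields shown and the `transition` helper are assumptions; the rule that only a completed export carries a `fileUrl` is inferred from AUDIT-US-009, which expects `fileUrl=null` on failure.

```typescript
type ExportStatus = "queued" | "processing" | "completed" | "failed";

// queued → processing → completed / failed; terminal states have no exits.
const ALLOWED_TRANSITIONS: Record<ExportStatus, readonly ExportStatus[]> = {
  queued: ["processing"],
  processing: ["completed", "failed"],
  completed: [],
  failed: [],
};

interface AuditExport {
  id: string;
  status: ExportStatus;
  fileUrl: string | null;
}

function transition(
  exp: AuditExport,
  next: ExportStatus,
  fileUrl: string | null = null,
): AuditExport {
  if (!ALLOWED_TRANSITIONS[exp.status].includes(next)) {
    throw new Error(`Illegal transition ${exp.status} -> ${next}`);
  }
  if (next === "completed" && fileUrl === null) {
    throw new Error("A completed export requires a fileUrl");
  }
  // Only a completed export exposes a signed URL.
  return { ...exp, status: next, fileUrl: next === "completed" ? fileUrl : null };
}
```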
Definition of Done:
- `export.spec.ts` integration test with MinIO. Meta-audit entry verified.
AUDIT-US-009 — Download completed export
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Super Admin downloads export file via signed URL |
| Epic link | AUDIT-EPIC-03 |
| Status | To Do |
| Priority | Must |
| Story points | 2 |
| Labels | service:audit-service, type:api, slice:S1 |
| Components | audit-service |
| FR references | FR-AUDIT-007 |
| Legacy FR refs | PLAT-AUDIT-006 |
| Dependencies | AUDIT-US-008 |
User story: As a Super Admin, when my export job is complete, I want to download the file via the signed URL so that I can transfer it to MoPH or an offline analysis tool securely.
Acceptance criteria (Gherkin):
- Given a completed export, when I access the `fileUrl`, then the NDJSON or CSV file downloads successfully.
- Given the signed URL has expired (> 1 hour), when I access it, then `403 Forbidden` from object storage.
- Given the export failed, when I query the status, then `status=failed` and `fileUrl=null`.
Definition of Done:
- Signed URL TTL tested. Failed export status verified.
AUDIT-US-010 — Chain integrity verification job
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Scheduled job verifies SHA-256 chain hash on all audit entries |
| Epic link | AUDIT-EPIC-04 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:audit-service, type:backend, slice:S1 |
| Components | audit-service |
| FR references | FR-AUDIT-008 |
| Legacy FR refs | PLAT-AUDIT-007 |
| Dependencies | AUDIT-US-002 |
User story: As a compliance officer, when the daily chain integrity check runs, I want any tampering with the audit store to be detected and alerted within 24 hours so that the platform's tamper-evident guarantee is continuously verified.
Acceptance criteria (Gherkin):
- Given all entries are intact, when the verification job runs, then `audit_chain_integrity_failures_total` remains 0 and job completes within 5 min.
- Given an entry's `occurred_at` is modified externally, when the job runs, then the mismatch is detected and `audit_chain_integrity_failures_total` increments by 1.
- Given the job runs, when it completes, then `audit_chain_last_verified_at` gauge is updated.
Technical notes:
- Job scheduled via NestJS `@Cron`; configurable via `CHAIN_INTEGRITY_JOB_CRON` env var.
- Partition-pruned to configurable window (default: last 7 days; full scan configurable).
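
A sketch of the core of the verification pass, without the scheduling and database access: recompute each entry's hash from its stored fields and compare it with the stored `chainHash`. The preimage layout follows AUDIT-US-002; `ChainedEntry`, `VerificationReport`, and `verifyWindow` are hypothetical names, and the entries are assumed to be preloaded for the configured window.

```typescript
import { createHash } from "node:crypto";

interface ChainedEntry {
  id: string;
  prevId: string | null;
  sourceEventId: string;
  tenantId: string;
  occurredAt: string;
  resourceId: string;
  chainHash: string;
}

// Recomputes SHA-256(prev_id:sourceEventId:tenantId:occurredAt:resourceId).
function expectedHash(e: Omit<ChainedEntry, "chainHash">): string {
  const preimage = [
    e.prevId ?? "GENESIS",
    e.sourceEventId,
    e.tenantId,
    e.occurredAt,
    e.resourceId,
  ].join(":");
  return createHash("sha256").update(preimage, "utf8").digest("hex");
}

interface VerificationReport {
  entriesChecked: number;
  failedIds: string[];
  verifiedAt: Date;
}

function verifyWindow(
  entries: readonly ChainedEntry[],
  now: Date = new Date(),
): VerificationReport {
  // An entry fails when its stored hash no longer matches its stored fields.
  const failedIds = entries
    .filter((e) => expectedHash(e) !== e.chainHash)
    .map((e) => e.id);
  return { entriesChecked: entries.length, failedIds, verifiedAt: now };
}
```

Under these assumptions the job would add `failedIds.length` to `audit_chain_integrity_failures_total` and set `audit_chain_last_verified_at` from `verifiedAt`.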
Definition of Done:
- `chain-integrity.spec.ts` passes.
- `AuditChainIntegrityFailed` alert tested in staging.
AUDIT-US-011 — On-demand chain verification endpoint (admin)
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Super Admin triggers manual chain verification for a date range |
| Epic link | AUDIT-EPIC-04 |
| Status | To Do |
| Priority | Could |
| Story points | 3 |
| Labels | service:audit-service, type:api, slice:S2 |
| Components | audit-service |
| FR references | FR-AUDIT-008 |
| Legacy FR refs | — |
| Dependencies | AUDIT-US-010 |
User story: As a Super Admin after a security incident, when I need to immediately verify the integrity of a specific date range rather than waiting for the nightly job, I want to trigger a manual chain verification so that I can provide immediate assurance to regulators.
Acceptance criteria (Gherkin):
- Given I am a Super Admin, when I POST `/api/v1/audit/verify-chain?dateFrom=X&dateTo=Y`, then verification runs and returns `{ verified: true, entriesChecked: N }` or `{ verified: false, firstFailureId: 'aud_...' }`.
- Given a large range, when verification is requested, then it runs asynchronously and a job ID is returned.
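
A sketch of the two response paths above. The result shapes come from the acceptance criteria; the `mode` discriminator, the `SYNC_ENTRY_LIMIT` cut-off for what counts as a large range, and the `isIntact` and `enqueueJob` callbacks are assumptions, since the story does not define them.

```typescript
type VerifyResult =
  | { verified: true; entriesChecked: number }
  | { verified: false; firstFailureId: string };

type VerifyResponse =
  | { mode: "sync"; result: VerifyResult }
  | { mode: "async"; jobId: string };

// Hypothetical cut-off above which verification is handed to a background job.
const SYNC_ENTRY_LIMIT = 10_000;

function verifyEntries<T extends { id: string }>(
  entries: readonly T[],
  isIntact: (entry: T) => boolean,
): VerifyResult {
  // Stop at the first entry whose integrity check fails.
  for (const entry of entries) {
    if (!isIntact(entry)) return { verified: false, firstFailureId: entry.id };
  }
  return { verified: true, entriesChecked: entries.length };
}

function handleVerifyRequest<T extends { id: string }>(
  entries: readonly T[],
  isIntact: (entry: T) => boolean,
  enqueueJob: () => string,
  syncLimit: number = SYNC_ENTRY_LIMIT,
): VerifyResponse {
  if (entries.length > syncLimit) {
    return { mode: "async", jobId: enqueueJob() };
  }
  return { mode: "sync", result: verifyEntries(entries, isIntact) };
}
```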
Definition of Done:
- Endpoint documented in API_CONTRACTS.md.
AUDIT-US-012 — DLQ event retry and isolation
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Dead-lettered events retried and isolated on exhaustion |
| Epic link | AUDIT-EPIC-05 |
| Status | To Do |
| Priority | Should |
| Story points | 5 |
| Labels | service:audit-service, type:backend, slice:S1 |
| Components | audit-service |
| FR references | FR-AUDIT-009 |
| Legacy FR refs | PLAT-AUDIT-008 |
| Dependencies | AUDIT-US-001 |
User story: As an SRE, when a source service emits a malformed event that fails ingestion after multiple retries, I want the event isolated in a DLQ table so that the audit service continues processing subsequent events without being blocked.
Acceptance criteria (Gherkin):
- Given a malformed event that fails normalisation, when the DLQ consumer receives it after 3 retry attempts, then the raw payload is stored in `audit_dlq_entries` with `normalisation_error=true`.
- Given a DLQ event is stored, when `audit.dlq.alert.v1` is emitted, then platform-admin-service receives it within 30 s.
- Given DLQ events are accumulating, when `audit_dlq_pending_messages` > 0, then `AuditDLQGrowing` alert fires within 2 min.
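
A sketch of the retry-then-isolate rule from the first criterion. It assumes the caller tracks how many retries have already been attempted and that an event is dead-lettered once 3 retries are exhausted; the story does not say whether the first delivery counts toward that number. `DlqEntry`, `Outcome`, and `processWithDlq` are hypothetical names, and the array stands in for the `audit_dlq_entries` table.

```typescript
const MAX_RETRIES = 3;

interface DlqEntry {
  rawPayload: string;
  errorMessage: string;
  normalisationError: boolean;
}

type Outcome =
  | { kind: "processed" }
  | { kind: "retry" }
  | { kind: "dead_lettered"; entry: DlqEntry };

function processWithDlq(
  rawPayload: string,
  retriesSoFar: number,
  normalise: (raw: string) => unknown,
  dlq: DlqEntry[],
): Outcome {
  try {
    normalise(rawPayload);
    return { kind: "processed" };
  } catch (err) {
    // Retry budget remaining: ask for redelivery.
    if (retriesSoFar < MAX_RETRIES) return { kind: "retry" };
    // Budget exhausted: isolate the raw payload so later events are not blocked.
    const entry: DlqEntry = {
      rawPayload,
      errorMessage: err instanceof Error ? err.message : String(err),
      normalisationError: true,
    };
    dlq.push(entry);
    return { kind: "dead_lettered", entry };
  }
}
```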
Definition of Done:
- DLQ table created via migration. Alert tested in staging.
AUDIT-US-013 — DLQ entries inspection via admin API
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Super Admin views and reprocesses DLQ entries |
| Epic link | AUDIT-EPIC-05 |
| Status | To Do |
| Priority | Could |
| Story points | 3 |
| Labels | service:audit-service, type:api, slice:S2 |
| Components | audit-service |
| FR references | FR-AUDIT-009 |
| Legacy FR refs | — |
| Dependencies | AUDIT-US-012 |
User story: As an SRE, when I am investigating a DLQ alert, I want to view the raw payloads of dead-lettered events so that I can identify the source service schema violation and decide whether manual reprocessing is needed.
Acceptance criteria (Gherkin):
- Given DLQ entries exist, when I call `GET /api/v1/audit/dlq`, then I receive a list of DLQ entries with raw payload and error message.
- Given I patch a DLQ entry with the correct payload, when I trigger reprocessing, then the entry is re-ingested and removed from the DLQ.
Definition of Done:
- Admin API documented. Reprocessing integration test.
AUDIT-US-014 — Observability instrumentation
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | OTEL spans and Prometheus metrics for all ingestion and query operations |
| Epic link | AUDIT-EPIC-06 |
| Status | To Do |
| Priority | Must |
| Story points | 5 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service, observability |
| FR references | FR-AUDIT-010 |
| Legacy FR refs | — |
| Dependencies | AUDIT-US-001 |
User story: As an SRE, when I monitor the audit service, I want full distributed traces and Prometheus metrics for every ingestion and query operation so that I can detect performance degradation and ingestion gaps in real time.
Acceptance criteria (Gherkin):
- Given an event is ingested, when I query Grafana Tempo, then I see a trace with spans: `audit.dedup_check`, `audit.normalise_event`, `audit.compute_chain_hash`, `audit.insert_entry`.
- Given ingestion stops, when no events are received for > 5 min, then `AuditIngestionStopped` alert fires.
- Given a query is made, when it completes, then `audit_query_duration_ms` histogram has a new observation.
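
The `AuditIngestionStopped` condition can be illustrated with a small in-process watchdog. This is only a sketch of the 5-minute rule: in practice the alert would more likely be evaluated by the alerting stack against the ingestion metric, and `IngestionWatchdog` is a hypothetical class. Timestamps are passed in explicitly so the logic can be tested without waiting.

```typescript
const STALL_THRESHOLD_MS = 5 * 60 * 1000;

class IngestionWatchdog {
  private lastEventAt: number;

  constructor(startedAtMs: number) {
    this.lastEventAt = startedAtMs;
  }

  // Call on every successfully ingested event.
  recordEvent(nowMs: number): void {
    this.lastEventAt = nowMs;
  }

  // True when more than 5 minutes have passed since the last event.
  isStalled(nowMs: number): boolean {
    return nowMs - this.lastEventAt > STALL_THRESHOLD_MS;
  }
}
```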
Definition of Done:
- Traces visible in Tempo staging. All metrics in OBSERVABILITY.md §2 publishing.
- `AuditIngestionStopped` alert tested.
AUDIT-US-015 — Grafana dashboards and SLO burn-rate alerts
| Field | Value |
|---|---|
| Issue type | Story |
| Summary | Audit service dashboards and SLO burn-rate alerts deployed |
| Epic link | AUDIT-EPIC-06 |
| Status | To Do |
| Priority | Must |
| Story points | 3 |
| Labels | service:audit-service, type:backend, slice:S0 |
| Components | audit-service, observability |
| FR references | FR-AUDIT-010 |
| Legacy FR refs | — |
| Dependencies | AUDIT-US-014 |
User story: As an SRE, when I am on call, I want pre-built Grafana dashboards and SLO burn-rate alerts for the audit service so that I can immediately understand service health and respond to incidents with runbook guidance.
Acceptance criteria (Gherkin):
- Given the staging environment, when dashboards are deployed, then "Audit Service — Ingestion", "Chain Integrity", and "SLO Burn" dashboards are visible in Grafana.
- Given ingestion availability drops below 99.9 %, when the burn-rate alert fires, then the alert includes a link to `/runbooks/audit/ingestion-stopped`.
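
For reference, burn rate is the observed error ratio divided by the error budget, where the budget is `1 - SLO target` (0.001 for a 99.9 % target). The sketch below shows that arithmetic and attaches the runbook link from the criterion above. The alert name `AuditIngestionSloBurn`, the function names, and the threshold passed by the caller are assumptions; real burn-rate alerts are normally defined as alerting rules rather than in application code.

```typescript
const SLO_TARGET = 0.999;
const RUNBOOK_URL = "/runbooks/audit/ingestion-stopped";

// Burn rate 1 means the error budget is consumed exactly at the allowed pace.
function burnRate(failed: number, total: number, sloTarget: number = SLO_TARGET): number {
  if (total === 0) return 0;
  const errorRatio = failed / total;
  const errorBudget = 1 - sloTarget;
  return errorRatio / errorBudget;
}

interface BurnAlert {
  name: string;
  burnRate: number;
  runbookUrl: string;
}

// Returns an alert when the window's burn rate exceeds the given threshold.
function evaluateBurn(failed: number, total: number, threshold: number): BurnAlert | null {
  const rate = burnRate(failed, total);
  if (rate <= threshold) return null;
  return { name: "AuditIngestionSloBurn", burnRate: rate, runbookUrl: RUNBOOK_URL };
}
```

For example, 20 failed ingestions out of 1,000 in a window is a 2 % error ratio, roughly 20 times the budgeted rate for a 99.9 % target.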
Definition of Done:
- Dashboard JSON exported to `infra/grafana/dashboards/audit-service/`. All 6 alerts in OBSERVABILITY.md configured in Alertmanager. Runbooks linked.