Audit Service — Epics

Service: audit-service
Epic prefix: AUDIT-EPIC
Last updated: 2026-04-18

Epics


AUDIT-EPIC-01 — Tamper-Evident Event Ingestion

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Ingest all platform events into immutable, chain-hashed audit store |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, NATS JetStream, PostgreSQL |
| Fix version | M0 |
| FR references | FR-AUDIT-001, FR-AUDIT-002, FR-AUDIT-003 |
| Legacy FR refs | PLAT-AUDIT-001..003 (from _sources/audit/) |
| Dependencies | cross-service: IDENT-EPIC-01, TENANT-EPIC-01 |
| Rollup status | Not started |

Business outcome: Every compliance-relevant platform event is permanently recorded in a tamper-evident store, satisfying HIPAA-analogue, GDPR audit, and Afghanistan MoPH audit obligations from the day the platform goes live.

Description: Implement the wildcard NATS JetStream consumer that subscribes to all platform event streams. Each received event is normalised into an AuditEntry, deduplicated on source_event_id, and stored with a SHA-256 chain hash linking it to the previous entry. The audit_app Postgres role enforces INSERT-only access at the DB engine level, so no application path can UPDATE or DELETE an entry. Monthly partitioning manages table growth. Success criteria: events from all source services appear in audit_entries within 200 ms (P95) of emission.
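
The deduplicate-then-chain step described above can be sketched as follows. This is a minimal illustration, not the service's implementation: the all-zero genesis seed, the canonical-JSON hash input, and the names InMemoryAuditStore and compute_chain_hash are assumptions made for the example, and an in-memory list stands in for the audit_entries table.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

GENESIS_HASH = "0" * 64  # assumed seed for the first entry in the chain


@dataclass(frozen=True)
class AuditEntry:
    source_event_id: str
    payload: dict
    prev_hash: str
    chain_hash: str


def compute_chain_hash(prev_hash: str, source_event_id: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    canonical = json.dumps(
        {"source_event_id": source_event_id, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


class InMemoryAuditStore:
    """Append-only stand-in for the audit_entries table (illustration only)."""

    def __init__(self) -> None:
        self._entries = []   # ordered AuditEntry records
        self._seen = set()   # source_event_id values already stored

    def append(self, source_event_id: str, payload: dict) -> Optional[AuditEntry]:
        # Deduplicate on source_event_id: a redelivered event is ignored.
        if source_event_id in self._seen:
            return None
        prev_hash = self._entries[-1].chain_hash if self._entries else GENESIS_HASH
        entry = AuditEntry(
            source_event_id=source_event_id,
            payload=payload,
            prev_hash=prev_hash,
            chain_hash=compute_chain_hash(prev_hash, source_event_id, payload),
        )
        self._entries.append(entry)
        self._seen.add(source_event_id)
        return entry

    def entries(self) -> tuple:
        return tuple(self._entries)
```

The store exposes no update or delete method, mirroring the INSERT-only intent; in the real service that guarantee comes from the audit_app role's database privileges rather than from application code.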

Stories: AUDIT-US-001, AUDIT-US-002, AUDIT-US-003, AUDIT-US-004


AUDIT-EPIC-02 — Compliance Query and Disclosure API

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Role-scoped audit query API for compliance officers and tenant admins |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, Kong |
| Fix version | M0 |
| FR references | FR-AUDIT-004, FR-AUDIT-005 |
| Legacy FR refs | PLAT-AUDIT-004..005 |
| Dependencies | AUDIT-EPIC-01, cross-service: IDENT-EPIC-01 |
| Rollup status | Not started |

Business outcome: Compliance officers and tenant admins can query the audit trail in real time for investigations, access reviews, and regulatory spot-checks, with tenant isolation enforced automatically.

Description: Implement GET /api/v1/audit/entries with multi-field filtering (actor, event type, resource, date range) and cursor pagination; GET /api/v1/audit/entries/:id for single-record retrieval; and GET /api/v1/audit/disclosures, scoped to a specific patient, for HIPAA accounting of disclosures. Live queries are limited to 90-day windows. Tenant Admins are automatically scoped to their tenant by JWT claim; Super Admins see all tenants. Success criteria: a query returns correct tenant-scoped results; a cross-tenant query returns an empty result or 403.
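
The tenant scoping and 90-day window rules above might be checked along these lines. This is a sketch under assumptions: the claim names (role, tenant_id), the role strings, and the choice of 403 rather than an empty result for cross-tenant requests are all hypothetical, and only the two roles named in the description are modelled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MAX_WINDOW = timedelta(days=90)  # live-query limit stated in the description


class QueryRejected(Exception):
    """Raised when a request must be refused; carries an HTTP-style status."""

    def __init__(self, status: int, reason: str) -> None:
        super().__init__(reason)
        self.status = status


@dataclass(frozen=True)
class Claims:
    role: str                 # hypothetical values: "super_admin", "tenant_admin"
    tenant_id: Optional[str]  # hypothetical claim name


def resolve_tenant_scope(claims: Claims, requested_tenant: Optional[str]) -> Optional[str]:
    """Return the tenant filter to apply; None means all tenants."""
    if claims.role == "super_admin":
        return requested_tenant
    if claims.role != "tenant_admin" or claims.tenant_id is None:
        raise QueryRejected(403, "role not permitted to query audit entries")
    if requested_tenant is not None and requested_tenant != claims.tenant_id:
        # Assumption: reject outright; the spec also allows an empty result.
        raise QueryRejected(403, "cross-tenant query")
    return claims.tenant_id


def validate_window(start: datetime, end: datetime) -> None:
    """Reject date ranges that are inverted or wider than 90 days."""
    if end < start:
        raise QueryRejected(400, "end precedes start")
    if end - start > MAX_WINDOW:
        raise QueryRejected(400, "date range exceeds 90 days")
```

Resolving the scope before building the query keeps the tenant filter out of caller control: a Tenant Admin who omits the tenant parameter still receives only their own tenant's rows.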

Stories: AUDIT-US-005, AUDIT-US-006, AUDIT-US-007


AUDIT-EPIC-03 — Async Export (NDJSON / CSV)

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Async export of audit entries with signed download URL |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service, object-storage |
| Fix version | M1 |
| FR references | FR-AUDIT-006, FR-AUDIT-007 |
| Legacy FR refs | PLAT-AUDIT-006 |
| Dependencies | AUDIT-EPIC-01, AUDIT-EPIC-02 |
| Rollup status | Not started |

Business outcome: Regulators (MoPH) and compliance teams can export large audit datasets for offline analysis, forensic investigation, and national health information system reporting without impacting live query performance.

Description: Implement POST /api/v1/audit/exports to queue an AuditExport job (Super Admin only). A background worker streams matching rows to an NDJSON or CSV file in object storage. On completion, a signed download URL (1-hour TTL) is set on the export record. The export request itself creates an AuditEntry (meta-audit). GET /api/v1/audit/exports/:id returns the job status for polling. Success criteria: a 1 M-row export completes in < 10 min (P95); the file is accessible via the signed URL.
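
The worker's streaming step could look roughly like the sketch below, which writes rows one at a time so memory use stays flat regardless of export size. The function names and the dict-per-row shape are assumptions for illustration; uploading to object storage and generating the signed URL are left out because they depend on the storage provider.

```python
import csv
import io
import json
from typing import Iterable, List, TextIO


def write_ndjson(rows: Iterable[dict], out: TextIO) -> int:
    """Write one JSON object per line; return the number of rows written."""
    count = 0
    for row in rows:
        out.write(json.dumps(row, sort_keys=True) + "\n")
        count += 1
    return count


def write_csv(rows: Iterable[dict], out: TextIO, fieldnames: List[str]) -> int:
    """Write a header plus one CSV record per row; return the row count."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    sample = [
        {"id": 1, "event_type": "login"},
        {"id": 2, "event_type": "logout"},
    ]
    buf = io.StringIO()
    write_ndjson(iter(sample), buf)
    print(buf.getvalue(), end="")
```

Both writers accept any iterable, so a database cursor that yields rows lazily can be passed in directly without materialising the full result set.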

Stories: AUDIT-US-008, AUDIT-US-009


AUDIT-EPIC-04 — Chain Integrity Verification

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Scheduled chain-hash verification job detects any tampering |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service |
| Fix version | M1 |
| FR references | FR-AUDIT-008 |
| Legacy FR refs | PLAT-AUDIT-007 |
| Dependencies | AUDIT-EPIC-01 |
| Rollup status | Not started |

Business outcome: Any tampering with the audit store — whether by an insider or a compromised process — is detected within 24 hours, satisfying the tamper-evident guarantee that underpins the platform's compliance posture.

Description: Implement a scheduled job (cron: 0 2 * * *) that reads audit_entries in recorded_at order and recomputes the SHA-256 chain hash for each row. On any mismatch, the audit_chain_integrity_failures_total counter is incremented and the AuditChainIntegrityFailed alert fires. The job must complete within 5 minutes for 1 M rows (partition-pruned to a configurable window). Success criteria: a tampered row is detected in < 24 hours; the verification job produces a Prometheus metric on every run.
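
The recompute-and-compare loop described above can be sketched as follows. The hash formula, the all-zero genesis seed, and the row field names are assumptions for the example; the real job would use whatever formula the ingestion path uses, and for a partition-pruned window the seed would be the chain hash of the last entry before the window.

```python
import hashlib
import json
from typing import Iterable, List

GENESIS_HASH = "0" * 64  # assumed seed for the first entry in the chain


def compute_chain_hash(prev_hash: str, source_event_id: str, payload: dict) -> str:
    """Assumed formula: SHA-256 over previous hash plus canonical JSON."""
    canonical = json.dumps(
        {"source_event_id": source_event_id, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def verify_chain(rows: Iterable[dict], seed: str = GENESIS_HASH) -> List[int]:
    """Return the positions of rows whose stored hash does not match.

    Rows must arrive in recorded_at order. After a mismatch the walk
    continues from the stored hash, so one altered row is reported once
    instead of flagging every later row.
    """
    mismatches = []
    prev = seed
    for index, row in enumerate(rows):
        expected = compute_chain_hash(prev, row["source_event_id"], row["payload"])
        if row["chain_hash"] != expected:
            mismatches.append(index)
        prev = row["chain_hash"]
    return mismatches
```

The length of the returned list is the amount by which a counter such as audit_chain_integrity_failures_total would be incremented for the run; an empty list means the window verified cleanly.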

Stories: AUDIT-US-010, AUDIT-US-011


AUDIT-EPIC-05 — DLQ Handler and Event Quality

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Dead-letter queue handling and source event quality alerting |
| Status | To Do |
| Priority | Should |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service, platform-admin-service |
| Fix version | M1 |
| FR references | FR-AUDIT-009 |
| Legacy FR refs | PLAT-AUDIT-008 |
| Dependencies | AUDIT-EPIC-01, cross-service: PLTADM-EPIC-01 |
| Rollup status | Not started |

Business outcome: Events that cannot be ingested (malformed schema, max retries exceeded) are isolated and surfaced to the platform operations team rather than silently dropped, ensuring the audit trail is complete or a known gap is documented.

Description: Implement the DLQ NATS consumer for the audit.dlq subject. On message arrival, attempt schema coercion 3 times. On exhaustion, store the raw payload with a normalisation_error=true flag in an audit_dlq_entries table and emit audit.dlq.alert.v1 to platform-admin-service. The audit_dlq_pending_messages metric and the AuditDLQGrowing alert notify SRE. DLQ entries are inspectable via an admin API. Success criteria: a malformed event reaches the DLQ table within 30 s; the alert fires within 2 min.
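
The retry-then-park flow above might be structured as in this sketch. It is illustrative only: DlqOutcome, handle_dlq_message, the coerce callback, and the dlq_row field names other than normalisation_error are hypothetical, and the NATS subscription, table insert, and alert publish are left to the caller.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

MAX_COERCION_ATTEMPTS = 3            # attempt count stated in the description
ALERT_SUBJECT = "audit.dlq.alert.v1"


@dataclass
class DlqOutcome:
    entry: Optional[dict]          # normalised event if coercion succeeded
    dlq_row: Optional[dict]        # row for audit_dlq_entries if attempts ran out
    alert_subject: Optional[str]   # subject to publish an alert on, if any


def handle_dlq_message(raw: bytes, coerce: Callable[[bytes], dict]) -> DlqOutcome:
    """Try schema coercion up to three times, then park the raw payload."""
    last_error = None
    for _ in range(MAX_COERCION_ATTEMPTS):
        try:
            return DlqOutcome(entry=coerce(raw), dlq_row=None, alert_subject=None)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            last_error = exc
    return DlqOutcome(
        entry=None,
        dlq_row={
            "raw_payload": raw,
            "normalisation_error": True,
            "error": str(last_error),
        },
        alert_subject=ALERT_SUBJECT,
    )


if __name__ == "__main__":
    outcome = handle_dlq_message(b"{not json", json.loads)
    print(outcome.dlq_row["normalisation_error"], outcome.alert_subject)
```

Returning an outcome object instead of performing the side effects inline keeps the retry logic testable without a broker or database.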

Stories: AUDIT-US-012, AUDIT-US-013


AUDIT-EPIC-06 — Audit Service Observability and SLOs

| Field | Value |
| --- | --- |
| Issue type | Epic |
| Summary | Full OTEL instrumentation, Prometheus metrics, and SLO dashboards |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, observability |
| Fix version | M0 |
| FR references | FR-AUDIT-010 |
| Legacy FR refs | |
| Dependencies | AUDIT-EPIC-01 |
| Rollup status | Not started |

Business outcome: SRE can detect ingestion failures, query performance degradation, and chain-integrity issues within minutes, enabling rapid response before compliance obligations are breached.

Description: Instrument every ingestion, query, and export operation with OTEL spans. Publish audit_events_ingested_total, audit_ingestion_duration_ms, audit_chain_integrity_failures_total, and all other metrics in OBSERVABILITY.md. Deploy Grafana dashboards "Audit Service — Ingestion", "Chain Integrity", and "SLO Burn". Configure all 6 alerts (OBSERVABILITY.md §6) in Alertmanager with linked runbooks. Success criteria: all dashboards live in staging; AuditIngestionStopped alert fires within 5 min of simulated NATS failure.

Stories: AUDIT-US-014, AUDIT-US-015