Audit Service — Epics
Service: audit-service · Epic prefix: AUDIT-EPIC · Last updated: 2026-04-18
Epics
AUDIT-EPIC-01 — Tamper-Evident Event Ingestion
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Ingest all platform events into immutable, chain-hashed audit store |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, NATS JetStream, PostgreSQL |
| Fix version | M0 |
| FR references | FR-AUDIT-001, FR-AUDIT-002, FR-AUDIT-003 |
| Legacy FR refs | PLAT-AUDIT-001..003 (from _sources/audit/) |
| Dependencies | cross-service: IDENT-EPIC-01, TENANT-EPIC-01 |
| Rollup status | Not started |
Business outcome: Every compliance-relevant platform event is permanently recorded in a tamper-evident store, satisfying HIPAA-analogous, GDPR, and Afghanistan MoPH audit obligations from the day the platform goes live.
Description:
Implement the wildcard NATS JetStream consumer that subscribes to all platform event streams. Each received event is normalised into an AuditEntry, deduplicated on source_event_id, and stored with a SHA-256 chain hash linking it to the previous entry. The audit_app Postgres role enforces INSERT-only access at the DB engine level, so no application path can UPDATE or DELETE an entry. Monthly partitioning manages table growth. Success criteria: events from every source service appear in audit_entries within 200 ms (P95) of emission.
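The dedup-then-chain step described above can be sketched as follows. This is a minimal in-memory illustration, not the service's implementation: the genesis seed, the canonical JSON serialisation, and the names `InMemoryAuditStore`, `compute_chain_hash`, and `GENESIS_HASH` are assumptions made for the example, and a list stands in for the audit_entries table.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Set, Tuple

# Assumed seed for the first entry in the chain; the document does not specify one.
GENESIS_HASH = "0" * 64


@dataclass(frozen=True)
class AuditEntry:
    source_event_id: str
    event_type: str
    payload: dict
    prev_hash: str
    chain_hash: str


def compute_chain_hash(prev_hash: str, source_event_id: str,
                       event_type: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the entry."""
    canonical = json.dumps(
        {"source_event_id": source_event_id,
         "event_type": event_type,
         "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


class InMemoryAuditStore:
    """Append-only stand-in for audit_entries: no update or delete method exists."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []
        self._seen_ids: Set[str] = set()

    def append(self, source_event_id: str, event_type: str, payload: dict) -> bool:
        """Store the event unless its source_event_id was already ingested."""
        if source_event_id in self._seen_ids:
            return False  # duplicate delivery, nothing is written
        prev_hash = self._entries[-1].chain_hash if self._entries else GENESIS_HASH
        chain_hash = compute_chain_hash(prev_hash, source_event_id, event_type, payload)
        self._entries.append(
            AuditEntry(source_event_id, event_type, payload, prev_hash, chain_hash)
        )
        self._seen_ids.add(source_event_id)
        return True

    def entries(self) -> Tuple[AuditEntry, ...]:
        return tuple(self._entries)
```

In a real deployment the dedup check would more plausibly be a unique constraint on source_event_id, and the INSERT-only guarantee would come from the audit_app role's grants rather than from the shape of a Python class.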
Stories: AUDIT-US-001, AUDIT-US-002, AUDIT-US-003, AUDIT-US-004
AUDIT-EPIC-02 — Compliance Query and Disclosure API
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Role-scoped audit query API for compliance officers and tenant admins |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, Kong |
| Fix version | M0 |
| FR references | FR-AUDIT-004, FR-AUDIT-005 |
| Legacy FR refs | PLAT-AUDIT-004..005 |
| Dependencies | AUDIT-EPIC-01, cross-service: IDENT-EPIC-01 |
| Rollup status | Not started |
Business outcome: Compliance officers and tenant admins can query the audit trail in real time for investigations, access reviews, and regulatory spot-checks, with tenant isolation enforced automatically.
Description:
Implement GET /api/v1/audit/entries with multi-field filtering (actor, event type, resource, date range) and cursor pagination, and GET /api/v1/audit/entries/:id for single-record retrieval. GET /api/v1/audit/disclosures is scoped to a specific patient for HIPAA accounting-of-disclosures. Live queries are limited to 90-day windows. Tenant Admins are auto-scoped to their tenant by JWT claim; Super Admins see all tenants. Success criteria: queries return correct tenant-scoped results; a cross-tenant query returns an empty result or 403.
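The scoping and window rules above could be enforced before any SQL is built, roughly as in the sketch below. The claim names (`role`, `tenant_id`), the role values, the `QueryRejected` exception, and the choice of 403 for cross-tenant requests and 400 for oversized windows are all assumptions for illustration; the document leaves the empty-result-versus-403 choice open.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_WINDOW = timedelta(days=90)  # live-query limit stated in the epic


class QueryRejected(Exception):
    """Hypothetical error carrying the HTTP status the API would return."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status


def resolve_tenant_scope(claims: dict, requested_tenant: Optional[str]) -> Optional[str]:
    """Return the tenant id to filter on, or None for an unrestricted query."""
    role = claims.get("role")
    if role == "super_admin":
        # Super Admins may query all tenants or narrow to one.
        return requested_tenant
    if role == "tenant_admin":
        own_tenant = claims.get("tenant_id")
        if own_tenant is None:
            raise QueryRejected(403, "token has no tenant claim")
        if requested_tenant is not None and requested_tenant != own_tenant:
            raise QueryRejected(403, "cross-tenant query")
        # The filter is applied even when the caller did not ask for it.
        return own_tenant
    raise QueryRejected(403, "role not permitted")


def validate_window(start: datetime, end: datetime) -> None:
    """Reject date ranges that are inverted or longer than 90 days."""
    if end < start:
        raise QueryRejected(400, "end precedes start")
    if end - start > MAX_WINDOW:
        raise QueryRejected(400, "window exceeds 90 days")
```

Resolving the scope from the token rather than from a query parameter means a Tenant Admin cannot widen the query by omitting or forging the tenant filter.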
Stories: AUDIT-US-005, AUDIT-US-006, AUDIT-US-007
AUDIT-EPIC-03 — Async Export (NDJSON / CSV)
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Async export of audit entries with signed download URL |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service, object-storage |
| Fix version | M1 |
| FR references | FR-AUDIT-006, FR-AUDIT-007 |
| Legacy FR refs | PLAT-AUDIT-006 |
| Dependencies | AUDIT-EPIC-01, AUDIT-EPIC-02 |
| Rollup status | Not started |
Business outcome: Regulators (MoPH) and compliance teams can export large audit datasets for offline analysis, forensic investigation, and national health information system reporting without impacting live query performance.
Description:
Implement POST /api/v1/audit/exports to queue an AuditExport job (Super Admin only). A background worker streams matching rows to an NDJSON or CSV file in object storage. On completion, a signed download URL (1-hour TTL) is set on the export record. The export request itself creates an AuditEntry (meta-audit). GET /api/v1/audit/exports/:id returns the job status for polling. Success criteria: a 1 M-row export completes in < 10 min (P95); the file is accessible via the signed URL.
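The streaming step of the export worker might look like the sketch below, which writes rows one at a time so memory use stays flat regardless of export size. The function names and the row shape are assumptions; the object-storage upload and URL signing are omitted because the document does not name a storage client.

```python
import csv
import json
from typing import Iterable, Sequence, TextIO


def write_ndjson(rows: Iterable[dict], out: TextIO) -> int:
    """Write one JSON object per line; returns the number of rows written."""
    count = 0
    for row in rows:
        out.write(json.dumps(row, sort_keys=True) + "\n")
        count += 1
    return count


def write_csv(rows: Iterable[dict], out: TextIO, fieldnames: Sequence[str]) -> int:
    """Write a header plus one CSV line per row; returns the number of rows written."""
    writer = csv.DictWriter(
        out,
        fieldnames=list(fieldnames),
        extrasaction="ignore",  # drop columns the export did not ask for
        lineterminator="\n",
    )
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count
```

Both writers accept any iterable, so a database cursor or generator can be passed directly and rows never need to be materialised as a list.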
Stories: AUDIT-US-008, AUDIT-US-009
AUDIT-EPIC-04 — Chain Integrity Verification
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Scheduled chain-hash verification job detects any tampering |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service |
| Fix version | M1 |
| FR references | FR-AUDIT-008 |
| Legacy FR refs | PLAT-AUDIT-007 |
| Dependencies | AUDIT-EPIC-01 |
| Rollup status | Not started |
Business outcome: Any tampering with the audit store — whether by an insider or a compromised process — is detected within 24 hours, satisfying the tamper-evident guarantee that underpins the platform's compliance posture.
Description:
Implement a scheduled job (cron: 0 2 * * *) that reads audit_entries in recorded_at order and recomputes the SHA-256 chain hash for each row. On any mismatch, the audit_chain_integrity_failures_total counter is incremented and the AuditChainIntegrityFailed alert fires. The job must complete within 5 minutes for 1 M rows (partition-pruned to a configurable window). Success criteria: a tampered row is detected in < 24 hours; the verification job produces a Prometheus metric on every run.
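The recomputation loop at the heart of the job can be sketched as below. The hash input (previous hash concatenated with a serialised payload string), the genesis seed, and the names `chain_hash` and `verify_chain` are assumptions for the example; the real job would hash whatever fields the ingestion path hashes. The `seed` parameter models the configurable window: it would be the stored hash of the last row before the window starts.

```python
import hashlib
from typing import Iterable, List, Tuple

# Assumed seed for a chain verified from its very first row.
GENESIS_HASH = "0" * 64


def chain_hash(prev_hash: str, payload: str) -> str:
    """SHA-256 of the previous hash concatenated with the serialised entry."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def verify_chain(rows: Iterable[Tuple[str, str, str]],
                 seed: str = GENESIS_HASH) -> List[str]:
    """Return ids of rows whose stored hash does not match the recomputed one.

    rows yields (entry_id, payload, stored_hash) in recorded_at order.
    """
    failures: List[str] = []
    prev = seed
    for entry_id, payload, stored_hash in rows:
        if chain_hash(prev, payload) != stored_hash:
            failures.append(entry_id)
        # Continue from the stored hash so a single tampered row is reported
        # once instead of flagging every row after it.
        prev = stored_hash
    return failures
```

The length of the returned list is the value that would be added to audit_chain_integrity_failures_total. Continuing from the stored hash is a design choice of this sketch: if an attacker rewrites both a payload and its stored hash, the mismatch surfaces at the following row instead.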
Stories: AUDIT-US-010, AUDIT-US-011
AUDIT-EPIC-05 — DLQ Handler and Event Quality
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Dead-letter queue handling and source event quality alerting |
| Status | To Do |
| Priority | Should |
| Labels | service:audit-service, domain:audit, slice:S1 |
| Components | audit-service, platform-admin-service |
| Fix version | M1 |
| FR references | FR-AUDIT-009 |
| Legacy FR refs | PLAT-AUDIT-008 |
| Dependencies | AUDIT-EPIC-01, cross-service: PLTADM-EPIC-01 |
| Rollup status | Not started |
Business outcome: Events that cannot be ingested (malformed schema, max retries exceeded) are isolated and surfaced to the platform operations team rather than silently dropped, so the audit trail is either complete or has its gaps documented.
Description:
Implement the DLQ NATS consumer for the audit.dlq subject. On message arrival, attempt schema coercion up to 3 times. On exhaustion, store the raw payload with the normalisation_error=true flag in an audit_dlq_entries table and emit audit.dlq.alert.v1 to platform-admin-service. The audit_dlq_pending_messages metric and the AuditDLQGrowing alert notify SRE. DLQ entries are inspectable via an admin API. Success criteria: a malformed event reaches the DLQ table within 30 s; the alert fires within 2 min.
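The retry-then-park flow above can be sketched as a pure function that returns what should be written and emitted, leaving the NATS and Postgres calls to the caller. The `DlqOutcome` type, the record and alert field names, and the `strict_coerce` example are assumptions for illustration; only the 3-attempt limit, the normalisation_error flag, and the audit.dlq.alert.v1 subject come from the description.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

MAX_COERCION_ATTEMPTS = 3  # attempt limit stated in the epic


@dataclass
class DlqOutcome:
    recovered: Optional[dict]   # coerced event, if any attempt succeeded
    dlq_record: Optional[dict]  # row destined for audit_dlq_entries
    alert: Optional[dict]       # message destined for audit.dlq.alert.v1
    attempts: int


def handle_dlq_message(raw: bytes, coerce: Callable[[bytes], dict]) -> DlqOutcome:
    """Try schema coercion up to three times, then park the raw payload."""
    last_error = ""
    for attempt in range(1, MAX_COERCION_ATTEMPTS + 1):
        try:
            return DlqOutcome(coerce(raw), None, None, attempt)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = str(exc)
    record = {
        "raw_payload": raw.decode("utf-8", errors="replace"),
        "normalisation_error": True,
        "error": last_error,
    }
    alert = {"subject": "audit.dlq.alert.v1", "reason": last_error}
    return DlqOutcome(None, record, alert, MAX_COERCION_ATTEMPTS)


def strict_coerce(raw: bytes) -> dict:
    """Hypothetical coercion step: parse JSON and require source_event_id."""
    doc = json.loads(raw)
    if not isinstance(doc, dict) or "source_event_id" not in doc:
        raise ValueError("missing source_event_id")
    return doc
```

Keeping the handler free of I/O makes the retry and parking logic testable without a broker; the consumer loop would persist `dlq_record` and publish `alert` when they are set.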
Stories: AUDIT-US-012, AUDIT-US-013
AUDIT-EPIC-06 — Audit Service Observability and SLOs
| Field | Value |
|---|---|
| Issue type | Epic |
| Summary | Full OTEL instrumentation, Prometheus metrics, and SLO dashboards |
| Status | To Do |
| Priority | Must |
| Labels | service:audit-service, domain:audit, slice:S0 |
| Components | audit-service, observability |
| Fix version | M0 |
| FR references | FR-AUDIT-010 |
| Legacy FR refs | — |
| Dependencies | AUDIT-EPIC-01 |
| Rollup status | Not started |
Business outcome: SRE can detect ingestion failures, query performance degradation, and chain-integrity issues within minutes, enabling rapid response before compliance obligations are breached.
Description:
Instrument every ingestion, query, and export operation with OTEL spans. Publish audit_events_ingested_total, audit_ingestion_duration_ms, audit_chain_integrity_failures_total, and all other metrics in OBSERVABILITY.md. Deploy Grafana dashboards "Audit Service — Ingestion", "Chain Integrity", and "SLO Burn". Configure all 6 alerts (OBSERVABILITY.md §6) in Alertmanager with linked runbooks. Success criteria: all dashboards live in staging; AuditIngestionStopped alert fires within 5 min of simulated NATS failure.
Stories: AUDIT-US-014, AUDIT-US-015