Notification Service — Jira Epics & User Stories
Status: populated | Owner: Product + Platform Engineering | Last updated: 2026-04-18
Epic EP-NOTIF-01: Event Consumption & Routing
Goal: Consume platform lifecycle events from NATS and route to the appropriate notification handler.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-001 | Consume auth.user.registered.v1 events | Given an auth.user.registered.v1 event on auth.events, when consumed, then SendUserRegisteredNotificationUseCase is invoked and the NATS message is ACKed | 2 |
| US-NOTIF-002 | Consume billing.invoice.generated.v1 events | Given a billing.invoice.generated.v1 event on billing.events, when consumed, then SendInvoiceGeneratedNotificationUseCase is invoked | 2 |
| US-NOTIF-003 | Consume operator.health.status_changed.v1 (DOWN) | Given an operator status change to DOWN, when consumed, then SendOperatorDownNotificationUseCase is invoked | 2 |
| US-NOTIF-004 | Consume operator.health.status_changed.v1 (UP) | Given an operator recovery event (newStatus=UP), when consumed, then SendOperatorRecoveredNotificationUseCase is invoked | 1 |
| US-NOTIF-005 | Consume system.alerts.raised.v1 events | Given a system alert event, when consumed, then SendSystemAlertNotificationUseCase is invoked with severity routing | 2 |
| US-NOTIF-006 | Deduplicate NATS redeliveries | Given a NATS event with a sourceEventId already in notification_log with status=SENT, when redelivered, then the notification is SUPPRESSED and the NATS message is ACKed | 3 |
| US-NOTIF-007 | Per-subject durable consumers | Given 4 NATS subjects, then each has its own independently-manageable durable consumer with a named consumer group | 2 |
| US-NOTIF-008 | Per-consumer feature flag | Given NOTIF_CONSUMERS_ENABLED env vars per subject, then individual consumers can be enabled/disabled without restart | 2 |
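
The routing and deduplication stories in this epic can be illustrated with a minimal sketch. It assumes the event carries its type, a `sourceEventId`, and (for operator health events) a `newStatus` field in the payload. The `Event` class, the `route` function, the `already_sent` set (standing in for a `notification_log` lookup on status=SENT), and the `UNROUTED` outcome are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    event_type: str        # e.g. "auth.user.registered.v1"
    source_event_id: str   # idempotency key used for deduplication
    payload: dict


OPERATOR_HEALTH = "operator.health.status_changed.v1"

# (event type, newStatus discriminator or None) -> use case named in the stories
ROUTES: Dict[Tuple[str, Optional[str]], str] = {
    ("auth.user.registered.v1", None): "SendUserRegisteredNotificationUseCase",
    ("billing.invoice.generated.v1", None): "SendInvoiceGeneratedNotificationUseCase",
    (OPERATOR_HEALTH, "DOWN"): "SendOperatorDownNotificationUseCase",
    (OPERATOR_HEALTH, "UP"): "SendOperatorRecoveredNotificationUseCase",
    ("system.alerts.raised.v1", None): "SendSystemAlertNotificationUseCase",
}


def route(event: Event, already_sent: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (outcome, use_case_name) for one consumed event.

    In every outcome the caller is expected to ACK the NATS message.
    """
    # US-NOTIF-006: a redelivery of an already-sent event is suppressed.
    if event.source_event_id in already_sent:
        return ("SUPPRESSED", None)

    # Only operator health events are split by newStatus (assumed field name).
    discriminator = (
        event.payload.get("newStatus") if event.event_type == OPERATOR_HEALTH else None
    )
    use_case = ROUTES.get((event.event_type, discriminator))
    if use_case is None:
        return ("UNROUTED", None)  # hypothetical outcome for unknown events
    return ("DISPATCH", use_case)
```

Keeping the routing table as data rather than branching logic makes it straightforward to add a subject without touching the deduplication check.
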
Epic EP-NOTIF-02: Preference Management & Suppression
Goal: Respect per-account notification opt-outs while ensuring mandatory notifications are always delivered.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-009 | Preference lookup before dispatch | Given an account with optedOut=true for OPERATOR_ALERT × EMAIL, when an OPERATOR_DOWN notification is dispatched, then status=SUPPRESSED is logged and no email is sent | 2 |
| US-NOTIF-010 | SYSTEM_SECURITY ignores opt-out | Given an account with optedOut=true for SYSTEM_SECURITY × EMAIL, when a CRITICAL system alert fires, then email is delivered regardless (opt-out bypassed) | 3 |
| US-NOTIF-011 | BILLING and ACCOUNT categories non-optional | Given any opt-out for BILLING or ACCOUNT category, then invoice and registration emails are always sent | 2 |
| US-NOTIF-012 | Default opted-in behavior | Given a new account with no notification_preferences rows, then all notifications are dispatched (opted-in by default) | 1 |
| US-NOTIF-013 | Update preference via internal endpoint | Given a PUT /internal/preferences with valid body from admin-dashboard, then the preference is upserted and takes effect on next dispatch | 2 |
| US-NOTIF-014 | List preferences for account | Given GET /internal/preferences?accountId=..., then all category × channel preferences for that account are returned | 1 |
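
The suppression rules above reduce to a small decision function. The sketch below assumes preferences are available as a mapping from (category, channel) to the `optedOut` flag; the function name `should_dispatch` and the in-memory mapping are hypothetical stand-ins for the `notification_preferences` lookup.

```python
from typing import Dict, Tuple

# Categories whose notifications are delivered regardless of opt-out
# (US-NOTIF-010 and US-NOTIF-011).
MANDATORY_CATEGORIES = frozenset({"SYSTEM_SECURITY", "BILLING", "ACCOUNT"})


def should_dispatch(
    category: str,
    channel: str,
    preferences: Dict[Tuple[str, str], bool],
) -> bool:
    """Return True if the notification should be sent, False if SUPPRESSED.

    `preferences` maps (category, channel) -> optedOut for one account.
    """
    if category in MANDATORY_CATEGORIES:
        return True  # opt-out is bypassed for mandatory categories
    # US-NOTIF-012: a missing row means opted-in by default.
    opted_out = preferences.get((category, channel), False)
    return not opted_out
```

For example, `should_dispatch("OPERATOR_ALERT", "EMAIL", {("OPERATOR_ALERT", "EMAIL"): True})` returns `False`, which corresponds to logging status=SUPPRESSED in US-NOTIF-009.
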
Epic EP-NOTIF-03: Template Engine
Goal: Provide a DB-backed, versioned template system with Handlebars + MJML rendering and admin management.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-015 | Render Handlebars + MJML email template | Given a template with MJML body and Handlebars variables, when rendered with a valid variable map, then valid HTML is returned with all variables substituted | 3 |
| US-NOTIF-016 | Plain-text fallback on MJML error | Given a template where MJML compilation fails, when rendering, then bodyText is returned and a NotifTemplateRenderError metric is incremented | 2 |
| US-NOTIF-017 | Variables schema validation at save time | Given a template with variablesSchema defining required fields, when saving with POST /internal/templates, then the schema is validated and 400 returned for invalid JSON Schema | 2 |
| US-NOTIF-018 | Variables schema validation at render time | Given a render call missing a required variable, then TemplateRenderError is thrown, delivery is FAILED, and the error is logged | 2 |
| US-NOTIF-019 | Unique active template per type × channel | Given an existing active template for INVOICE_GENERATED × EMAIL, when creating a second active template for the same type × channel, then 409 conflict is returned | 2 |
| US-NOTIF-020 | Versioned template update | Given a PATCH /internal/templates/{id}, then version increments, is_active remains true, and the old version is readable via the notification_log reference | 2 |
| US-NOTIF-021 | Template preview endpoint | Given POST /internal/templates/{id}/preview with sample variables, then rendered HTML and plain text are returned without triggering a delivery | 2 |
| US-NOTIF-022 | List and get templates | Given GET /internal/templates and GET /internal/templates/{id}, then active templates are returned with full bodyHtml, bodyText, and variablesSchema | 1 |
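
The render-time behaviour in US-NOTIF-016 and US-NOTIF-018 can be sketched as follows. This is not the real Handlebars or MJML pipeline: a regex-based `{{name}}` substitution stands in for Handlebars, and the markup compiler is passed in as a callable so that a failure can be simulated. The `Template` class, `substitute`, `render`, and the module-level `metrics` counter are hypothetical names; only `TemplateRenderError` and the `NotifTemplateRenderError` metric name come from the stories.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class TemplateRenderError(Exception):
    """Raised when a required variable is missing at render time."""


@dataclass
class Template:
    body_markup: str     # markup source containing {{variable}} placeholders
    body_text: str       # plain-text fallback
    required: List[str]  # required variable names (from variablesSchema)


metrics: Counter = Counter()  # stand-in for a real metrics client
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute(source: str, variables: Dict[str, object]) -> str:
    """Replace {{name}} placeholders; unknown names become empty strings."""
    return _PLACEHOLDER.sub(lambda m: str(variables.get(m.group(1), "")), source)


def render(
    template: Template,
    variables: Dict[str, object],
    compile_markup: Callable[[str], str],
) -> Tuple[str, str]:
    """Return ("html", body) on success or ("text", body) on compiler failure."""
    missing = [name for name in template.required if name not in variables]
    if missing:
        # US-NOTIF-018: the caller marks the delivery FAILED and logs the error.
        raise TemplateRenderError("missing required variables: " + ", ".join(missing))
    try:
        return ("html", compile_markup(substitute(template.body_markup, variables)))
    except Exception:
        # US-NOTIF-016: fall back to plain text and count the failure.
        metrics["NotifTemplateRenderError"] += 1
        return ("text", substitute(template.body_text, variables))
```

Injecting the compiler keeps the fallback path testable without a real MJML dependency.
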
Epic EP-NOTIF-04: Delivery Channels
Goal: Deliver notifications via email (SendGrid) and SMS (Ghasi platform) with retry and audit.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-023 | Email delivery via SendGrid | Given a rendered email notification, when delivered, then SendGrid POST /v3/mail/send is called with correct from, to, subject, html, text; providerMessageId stored in log | 3 |
| US-NOTIF-024 | SMS delivery via sms-orchestrator | Given an SMS notification, when delivered, then POST /v1/sms/send is called with metadata.priority=low and notificationId; messageId stored as providerMessageId | 3 |
| US-NOTIF-025 | Retry on transient delivery failure (3 attempts) | Given a transient SendGrid 5xx, when retrying with backoff (5s, 30s, 2min), then up to 3 attempts are made before logging FAILED | 3 |
| US-NOTIF-026 | No NATS NAK on permanent delivery failure | Given all 3 retry attempts fail with a permanent error (e.g. invalid email), then the NATS message is ACKed, status=FAILED logged, and NotifEmailDeliveryFailed alert fires | 2 |
| US-NOTIF-027 | Dual channel for OPERATOR_DOWN | Given an OPERATOR_DOWN event and an opted-in admin, then both an EMAIL and an SMS notification are dispatched; both logged independently | 2 |
| US-NOTIF-028 | CRITICAL alert dual channel | Given a system.alerts.raised.v1 with severity=CRITICAL, then EMAIL + SMS are dispatched to all platform.admin users, ignoring opt-out | 2 |
| US-NOTIF-029 | Platform admin recipient resolution | Given a platform alert, then auth-service is queried for all platform.admin users; result cached 5 min in Redis | 2 |
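
The retry policy in US-NOTIF-025 and US-NOTIF-026 could look roughly like the sketch below. Two points are assumptions rather than statements from the stories: the delays are applied only between attempts (so with 3 attempts the 2 min delay is never reached), and a permanent error stops retrying immediately, although US-NOTIF-026 can also be read as exhausting all attempts first. The exception classes and `deliver_with_retry` are hypothetical names.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class TransientDeliveryError(Exception):
    """For example a provider 5xx response; worth retrying."""


class PermanentDeliveryError(Exception):
    """For example an invalid recipient address; retrying cannot help."""


BACKOFF_SECONDS = (5, 30, 120)  # 5s, 30s, 2min from US-NOTIF-025


def deliver_with_retry(
    send: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    backoff: Sequence[float] = BACKOFF_SECONDS,
    max_attempts: int = 3,
) -> Tuple[str, int, Optional[str]]:
    """Return (status, attempt_count, provider_message_id).

    The NATS message is ACKed by the caller whether the status is SENT or
    FAILED, so a failed delivery never triggers a NATS redelivery.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return ("SENT", attempt, send())
        except PermanentDeliveryError:
            return ("FAILED", attempt, None)  # assumption: no further retries
        except TransientDeliveryError:
            if attempt < max_attempts:
                # Delays run out-of-range safe: reuse the last delay if needed.
                sleep(backoff[min(attempt - 1, len(backoff) - 1)])
    return ("FAILED", max_attempts, None)
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting for them.
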
Epic EP-NOTIF-05: Notification Audit Log
Goal: Every delivery attempt is traceable, including suppressions and failures.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-030 | Log every delivery attempt | Given any dispatch path (SENT, FAILED, or SUPPRESSED), then a notification_log row is created with status, channel, category, sourceEventType, sourceEventId, attemptCount | 2 |
| US-NOTIF-031 | PII masking in log | Given a delivery to user@example.com, then notification_log.recipient_address stores ***@example.com | 2 |
| US-NOTIF-032 | Query notification log via admin endpoint | Given GET /internal/notifications?accountId=...&status=FAILED, then filtered log entries are returned in reverse chronological order | 2 |
| US-NOTIF-033 | 90-day log retention | Given notification_log rows older than 90 days, then they are dropped with the monthly partition; financial data is not affected | 1 |
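
The masking rule in US-NOTIF-031 can be expressed as a small helper. The function name `mask_email` is hypothetical, and returning a bare `***` for input without a usable domain is an assumption, since the story only covers well-formed addresses.

```python
def mask_email(address: str) -> str:
    """Mask the local part of an email address before it is logged.

    "user@example.com" becomes "***@example.com". Input with no "@" or
    no domain is masked entirely (assumed behaviour for malformed input).
    """
    _local, separator, domain = address.rpartition("@")
    if not separator or not domain:
        return "***"
    return "***@" + domain
```

Splitting on the last `@` keeps the domain intact even when the local part itself contains an `@` inside a quoted string.
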
Epic EP-NOTIF-06: Observability & Reliability
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-034 | Prometheus metrics endpoint | Given /metrics, then all notification metric families are exposed in Prometheus format | 2 |
| US-NOTIF-035 | NotifEmailDeliveryFailed alert | Given email FAILED rate > 5% for 5 min, then alert fires | 1 |
| US-NOTIF-036 | NotifSystemAlertFailed alert | Given a CRITICAL system alert with status=FAILED, then alert fires immediately | 2 |
| US-NOTIF-037 | NotifNatsLag alert | Given consumer lag > 5000 on any subject, then alert fires | 1 |
| US-NOTIF-038 | Readiness probe | Given /health/ready, then 200 only when PG, NATS, and SendGrid are reachable | 1 |
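
The readiness rule in US-NOTIF-038 amounts to aggregating per-dependency checks. In the sketch below, the `readiness` function and the shape of the `checks` mapping are hypothetical, and answering 503 when a dependency is unreachable is an assumption; the story only states when 200 is returned.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, bool]]:
    """Run each dependency check and return (http_status, per-check results).

    `checks` is expected to hold one probe each for PG, NATS, and SendGrid.
    A probe that raises is treated the same as a probe that returns False.
    """
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    status = 200 if all(results.values()) else 503  # 503 is an assumption
    return status, results
```

Returning the per-check results alongside the status code makes it easier to see which dependency failed when the probe goes red.
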
Epic EP-NOTIF-07: National Incident Broadcasts to All Platform Stakeholders
Goal: During a national-scale incident (regional outage, MNO-wide outage, regulator advisory), the platform must reach all relevant stakeholders (tenants, NOC, on-call, regulator-portal) over multiple channels within seconds.
| Story ID | Title | Acceptance Criteria | Points |
|---|---|---|---|
| US-NOTIF-039 | Incident broadcast trigger from NOC | POST /v1/internal/notifications/broadcast accepts severity (CRITICAL/HIGH/MEDIUM), audience (all-tenants, by-tier, by-affected-mno, by-region), channels (in-portal, email, SMS, webhook), and pre-approved templateId; mTLS-only, NOC role required | 5 |
| US-NOTIF-040 | Multi-channel fan-out within 60 s | Fan-out engine writes to in-portal, email, SMS via channel-router, and tenant webhooks within 60 s P95; per-channel delivery status tracked in notif.broadcast_deliveries | 5 |
| US-NOTIF-041 | Tenant unsubscribe protections | Tenants cannot unsubscribe from CRITICAL severity (regulator/safety); HIGH allows opt-out only with explicit signed acknowledgement; MEDIUM is fully opt-out-able per channel | 3 |
| US-NOTIF-042 | Broadcast reach reporting and audit | After broadcast: notif.broadcast.report.v1 emitted with reached/failed/opted-out counts per channel; archived 7 y in notif.broadcast_audit | 3 |
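
The unsubscribe protections in US-NOTIF-041 can be captured in one predicate. The function name `opt_out_honoured` and the boolean `signed_ack` flag (standing in for a verified signed acknowledgement) are hypothetical; how the acknowledgement is signed and verified is outside this sketch.

```python
def opt_out_honoured(severity: str, opted_out: bool, signed_ack: bool = False) -> bool:
    """Return True if a tenant's opt-out suppresses a broadcast on one channel.

    CRITICAL: opt-out is never honoured.
    HIGH:     opt-out is honoured only with a signed acknowledgement.
    MEDIUM:   opt-out is always honoured.
    """
    if not opted_out:
        return False  # nothing to honour, the broadcast is delivered
    if severity == "CRITICAL":
        return False
    if severity == "HIGH":
        return signed_ack
    if severity == "MEDIUM":
        return True
    raise ValueError("unknown severity: " + severity)
```

Raising on an unknown severity, rather than defaulting to either outcome, avoids silently suppressing or forcing a broadcast when a new severity level is introduced.
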