
Ghasi SMS Gateway — Product & Engineering Roadmap

Version: 1.1
Status: Normative
Aligned with: architecture_baseline.md, infrastructure_baseline.md, governance.md, all service specs, ADR-0002 (Keycloak multi-IDP), ADR-0003 (Compliance Layer)
Principles: Cloud-first, Multi-tenant SaaS, Clean Architecture + DDD, Event-driven microservices, Security + compliance from day one, Fast time-to-market, Iterative delivery in slices.

Change log

  • v1.1 (2026-04-19) — Rebaselined for two architectural decisions landed in v1.2 of the enterprise architecture: (a) Keycloak as base/default IdP with a pluggable provider abstraction enabling tenant external OIDC/SAML SSO (ADR-0002); (b) Compliance Layer as first-class tier via the compliance-engine microservice (ADR-0003). Added 12 new epics (2 identity, 10 compliance) and 50 new user stories; totals updated accordingly. Resequenced M0 and M1 to ensure compliance evaluation is on the path from the first production-bound SMS.
  • v1.0 (2026-04-12) — Initial release with 13 services, Firebase identity, no compliance tier.

0. Executive Summary

The Ghasi SMS Gateway is a telecom-grade, multi-tenant SMS aggregation platform. It connects enterprise customers to mobile network operators (MNOs) via SMPP 3.4, providing intelligent routing, delivery tracking, billing, self-service management, and — from v1.1 — a first-class Compliance Layer that gates every outbound message before carrier dispatch.

This roadmap delivers the full 14-service platform in 6 milestones across 18 sprints (2-week cadence, ~9 months total). Each milestone produces a deployable, testable, independently valuable increment.

Total scope: 79 epics, ~243 user stories, 14 microservices, 2 frontend apps, Keycloak identity platform, local-LLM compliance AI.


1. Milestone Overview

| Milestone | Theme | Duration | Sprints | Customers | Primary Value | Monetization | Competitive Edge |
| --- | --- | --- | --- | --- | --- | --- | --- |
| M0 | Platform Foundation | 4 weeks | 1–2 | Internal / DevOps | Infrastructure, Keycloak + IdP provider abstraction, API skeleton, compliance-engine scaffolding, CI/CD | None | Architecture locked; security + compliance from day one |
| M1 | First Marketable Product | 6 weeks | 3–5 | Alpha testers (internal) | Send SMS end-to-end: API → compliance (observation-mode) → routing → SMPP → DLR | None (free alpha) | Compliance Layer on the path from day one; functional parity with basic SMS APIs |
| M2 | First Sellable Product | 6 weeks | 6–8 | Beta customers (5–10) | Self-service portal, billing, webhooks, compliance rule + hold-queue management, compliance billing waiver | Per-SMS pricing + monthly invoices | Full self-service onboarding; webhook DLR delivery; enforced compliance rules |
| M3 | Competitive Differentiator | 6 weeks | 9–11 | Paid customers (20–50) | Admin dashboard, advanced routing, operator failover, analytics, tenant scoring & risk tiering, tenant external OIDC/SAML SSO | Tiered pricing, seat-pack billing, enterprise SSO upsell | Intelligent routing; real-time operator failover; admin visibility; enterprise SSO; auto-enforced tenant risk tiers |
| M4 | Full Platform GA | 6 weeks | 12–14 | GA launch (100+) | Notifications, full analytics, local-LLM AI classification, compliance reporting & audit, observability hardening | Full pricing tiers, enterprise contracts | Production-grade SLAs; AI-assisted compliance; regulator-ready audit exports; full observability |
| M5 | Post-GA Expansion | 8 weeks | 15–18 | Scale-up | Multi-currency, hot-reload config, advanced analytics, API v2, Firebase legacy retirement, classification accuracy harness | Multi-currency billing, premium tiers | Hot-reload ops; real-time analytics; API evolution; legacy IdP sunset |

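Total: ~36 weeks (about 9 months) across all 18 sprints. GA lands at the end of M4 (Sprint 14, roughly week 28); post-GA work in M5 and beyond runs continuously.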


2. Slicing Strategy

Slice 1 / M0 — Platform Foundation

Capabilities: Kubernetes namespaces (incl. ghasi-identity), database schemas (incl. compliance + keycloak), NATS JetStream streams (incl. compliance.* + auth.events.idp.*), Redis cluster, Keycloak HA deployment with dev+staging realms, IdentityProvider port scaffolded in auth-service with Keycloak provider as default, compliance-engine scaffolded (gRPC + HTTP + schema + mTLS + health), Kong edge gateway, health probes, CI/CD pipeline.

Services:

  • auth-service — AUTH-EPIC-001 (Keycloak baseline), AUTH-EPIC-002 (IdP provider abstraction), AUTH-EPIC-003 (RBAC), AUTH-EPIC-004 (API key lifecycle)
  • compliance-engine — EP-CE-01 (Service Foundation: scaffold, schema, mTLS, health)
  • Kong — edge gateway with JWT + ghasi-api-key-lookup plugins (ADR-0001)
  • Infrastructure bootstrap (shared modules, Prisma migrations, Docker Compose, K8s manifests)

Frontend: None (API-only; Postman/curl testing).

Why this slice matters: Nothing else works without identity, authorization, and infrastructure. Freezing the auth contract and the compliance-engine contract early prevents cascading changes. Every downstream service depends on the platform JWT and (from M1) the compliance gRPC. Deploying Keycloak in M0 avoids a disruptive IdP swap later; scaffolding compliance-engine in M0 ensures its gRPC is available the moment M1 starts publishing SMS.

Monetization: None. Investment phase.

Epics:

| Epic | Service | Stories |
| --- | --- | --- |
| AUTH-EPIC-001 | auth-service | Keycloak baseline: realm per env, client config, JWKS exposure, platform JWT issuance (AUTH-US-001, 002, 003, 004) |
| AUTH-EPIC-002 | auth-service | IdP provider abstraction: IdentityProvider port, KeycloakProvider, NativeProvider, FirebaseLegacyProvider adapters, dispatcher (AUTH-US-005, 006, 007) |
| AUTH-EPIC-003 | auth-service | RBAC: roles, scopes, user-role assignment (AUTH-US-008, 009, 010) |
| AUTH-EPIC-004 | auth-service | API key lifecycle + Kong custom plugin lookup (AUTH-US-011, 012, 013) |
| GW-EPIC-001 | Kong (api-gateway) | JWT + key-auth + rate-limit + correlation-id + OTel (GW-US-001, 002, 003) |
| GW-EPIC-007 | Kong (api-gateway) | Observability skeleton (GW-US-013, 014, 015) |
| EP-CE-01 | compliance-engine | Service Foundation: NestJS gRPC + HTTP bootstrap, Prisma migrations for full compliance schema, mTLS for gRPC, health/metrics endpoints (US-CE-001, 002, 003, 004) |

Story count: ~26 | Sprint allocation: Sprint 1–2

Architectural freeze by end of M0: platform JWT claim shape (incl. idp claim), IdentityProvider port, compliance schema, ComplianceService.v1 proto, Kong plugin policies, Keycloak realm-per-environment convention.


Slice 2 / M1 — Core SMS Flow (First Marketable Product)

Capabilities: Send SMS via REST API, intelligent prefix-based routing, SMPP 3.4 delivery to operators, delivery receipt processing, message status tracking, idempotent processing, retry + DLQ. Compliance Layer in observation mode — every outbound SMS traverses compliance-engine via gRPC, but the seeded rule set is FLAG-only so no messages are actually held/blocked. This validates the pipeline, fail-closed semantics, and latency budget before enforcement is turned on in M2.
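As a rough illustration of observation mode, the sketch below evaluates a message against a FLAG-only keyword rule set and keeps the most severe verdict. Only the FLAG, HOLD, and BLOCK verdicts and the KEYWORD rule type come from this roadmap; the PASS verdict, the severity ordering, the rule shape, the sample keywords, and the function names are assumptions made for illustration.

```typescript
// PASS and the severity order are assumptions, not part of the roadmap.
type Verdict = "PASS" | "FLAG" | "HOLD" | "BLOCK";

interface KeywordRule {
  id: string;
  keyword: string; // matched case-insensitively against the message body
  action: Verdict; // verdict produced when the rule matches
}

const SEVERITY: Verdict[] = ["PASS", "FLAG", "HOLD", "BLOCK"];

// Returns the most severe verdict among all matching rules.
function evaluate(body: string, rules: KeywordRule[]): Verdict {
  const text = body.toLowerCase();
  let worst: Verdict = "PASS";
  for (let i = 0; i < rules.length; i++) {
    const rule = rules[i];
    const matched = text.indexOf(rule.keyword.toLowerCase()) !== -1;
    if (matched && SEVERITY.indexOf(rule.action) > SEVERITY.indexOf(worst)) {
      worst = rule.action;
    }
  }
  return worst;
}

// Only HOLD and BLOCK keep a message away from routing.
function mayDispatch(verdict: Verdict): boolean {
  return verdict === "PASS" || verdict === "FLAG";
}

// Hypothetical observation-mode seed: every rule is FLAG-only.
const observationRules: KeywordRule[] = [
  { id: "kw-1", keyword: "lottery", action: "FLAG" },
  { id: "kw-2", keyword: "free money", action: "FLAG" },
];

const sampleVerdict = evaluate("You won the LOTTERY!", observationRules);
console.log(sampleVerdict, mayDispatch(sampleVerdict)); // FLAG true
```

Because every seeded rule carries the FLAG action, the aggregate verdict can never exceed FLAG, so traffic still flows while the pipeline and its latency are exercised. Switching to enforcement would then be a change to the rule data rather than to this code path.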

Services:

  • Kong (api-gateway config) — GW-EPIC-002 (SMS send + idempotency), GW-EPIC-003 (message status)
  • sms-orchestrator — ORCH-EPIC-001 through 004 (full processing core), ORCH-EPIC-005 (compliance integration)
  • routing-engine — ROUTE-EPIC-001 (operator selection), ROUTE-EPIC-002 (health monitoring)
  • smpp-connector — SMPP-EPIC-001 (session mgmt), SMPP-EPIC-002 (PDU processing), SMPP-EPIC-003 (TPS + failover)
  • dlr-processor — DLR-EPIC-001 (core DLR pipeline)
  • compliance-engine: EP-CE-02 core subset (KEYWORD, SENDER_ID, GEO_RESTRICTION, full EvaluateCompliance handler), EP-CE-08 (async pipeline integration + observation-mode rollout), EP-CE-07 partial (metrics, structured logging, OTel)

Frontend: None (API-only; test via curl/Postman/SDK).

Why this slice matters: This is the critical path to a working product. An SMS sent via the API reaches a handset and a DLR comes back. Everything else (billing, portal, analytics) is layered on top of this flow. Shipping the Compliance Layer on the path from day one — in observation mode — avoids a disruptive insertion later and validates fail-closed behaviour with real traffic under low stakes.

Monetization: Free alpha tier to validate throughput, latency, operator connectivity, and compliance-evaluation budget.

Epics:

| Epic | Service | Stories |
| --- | --- | --- |
| GW-EPIC-002 | api-gateway (Kong) | GW-US-004, 005, 006 |
| GW-EPIC-003 | api-gateway (Kong) | GW-US-007 |
| ORCH-EPIC-001 | sms-orchestrator | ORCH-US-001–007, 012 |
| ORCH-EPIC-002 | sms-orchestrator | ORCH-US-008, 009 |
| ORCH-EPIC-003 | sms-orchestrator | ORCH-US-010, 011 |
| ORCH-EPIC-004 | sms-orchestrator | ORCH-US-013, 014, 015 |
| ORCH-EPIC-005 | sms-orchestrator | Compliance integration: EVALUATING state, gRPC client, verdict handler, fail-closed non-ack (ORCH-US-016, 017, 018) |
| ROUTE-EPIC-001 | routing-engine | ROUTE-US-001, 002, 003 |
| ROUTE-EPIC-002 | routing-engine | ROUTE-US-007, 008, 009, 010 |
| SMPP-EPIC-001 | smpp-connector | SMPP-US-001, 002, 012 |
| SMPP-EPIC-002 | smpp-connector | SMPP-US-003, 004, 005 |
| SMPP-EPIC-003 | smpp-connector | SMPP-US-006, 007, 008 |
| SMPP-EPIC-004 | smpp-connector | SMPP-US-009, 010, 011 |
| DLR-EPIC-001 | dlr-processor | DLR-US-001–006, 009, 010 |
| EP-CE-02 (subset) | compliance-engine | Rule engine core: KEYWORD, SENDER_ID, GEO, full EvaluateCompliance handler (US-CE-005, 010, 011, 013) |
| EP-CE-08 | compliance-engine + sms-orchestrator | Async pipeline integration + observation-mode rollout (US-CE-033, 034, 035, 037) |
| EP-CE-07 (subset) | compliance-engine | Metrics, structured logging with PII masking, OTel (US-CE-028, 029, 030) |

Story count: ~60 | Sprint allocation: Sprint 3–5

M1 exit criterion (new): compliance.audit.v1 is emitted for every sent SMS in the alpha environment; P95 EvaluateCompliance ≤ 500 ms under 200 RPS; fail-closed chaos test passes (kill compliance-engine → messages stay in EVALUATING and never hit a carrier).
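The fail-closed criterion can be pictured as a small decision function inside the orchestrator. This is a sketch under assumptions: the type names, the `decide` function, the PASS verdict, and the way the delivery count is passed in are hypothetical. The EVALUATING state, the three redeliveries, the `sms.outbound.deadletter` subject, and the `compliance_unavailable` reason are taken from the M1 acceptance criteria in section 4.2.

```typescript
type Verdict = "PASS" | "FLAG" | "HOLD" | "BLOCK";

// Outcome of calling compliance-engine: an answer, or no answer at all.
type ComplianceResult =
  | { status: "answered"; verdict: Verdict }
  | { status: "unavailable" }; // timeout, connection refused, and so on

type Decision =
  | { kind: "ROUTE" }                  // continue to routing-engine
  | { kind: "STOP"; verdict: Verdict } // held or blocked by a rule
  | { kind: "REDELIVER" }              // stay in EVALUATING, do not ack
  | { kind: "DEADLETTER"; subject: string; reason: string };

const MAX_REDELIVERIES = 3;

// deliveryCount is 1 for the first delivery of a message.
// Fail-closed: without a verdict the message never moves towards a carrier.
function decide(result: ComplianceResult, deliveryCount: number): Decision {
  if (result.status === "answered") {
    if (result.verdict === "HOLD" || result.verdict === "BLOCK") {
      return { kind: "STOP", verdict: result.verdict };
    }
    return { kind: "ROUTE" };
  }
  // Deliveries 1..3 each trigger one redelivery; the 4th goes to dead-letter.
  if (deliveryCount <= MAX_REDELIVERIES) {
    return { kind: "REDELIVER" };
  }
  return {
    kind: "DEADLETTER",
    subject: "sms.outbound.deadletter",
    reason: "compliance_unavailable",
  };
}
```

The point of the shape is that no branch maps "unavailable" to ROUTE, so an outage of compliance-engine cannot result in a carrier dispatch.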


Slice 3 / M2 — First Sellable Product

Capabilities: Customer self-service portal (signup, API keys, test SMS, message logs, webhook management, billing dashboard), billing event pipeline, pricing engine, invoice generation, webhook delivery with HMAC signing and retry. Compliance Layer switched from observation to enforcement: real rule authoring, blocklists, keyword lists, and a working hold-queue with manual review in admin-dashboard. Billing consumes compliance events so non-dispatched messages are not charged.
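For the HMAC-signed webhook delivery mentioned above, one common approach is to sign the raw request body together with a timestamp and let the receiver recompute the signature. This roadmap does not specify the hash algorithm, the signed payload layout, or the encoding, so SHA-256, the `timestamp.body` layout, hex encoding, and both function names below are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: hex-encoded HMAC-SHA256 over "<timestamp>.<raw body>".
function signWebhook(secret: string, timestamp: number, rawBody: string): string {
  return createHmac("sha256", secret)
    .update(timestamp + "." + rawBody)
    .digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
function verifyWebhook(
  secret: string,
  timestamp: number,
  rawBody: string,
  signature: string
): boolean {
  const expected = Buffer.from(signWebhook(secret, timestamp, rawBody), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check the length first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Including the timestamp in the signed string lets a receiver reject stale deliveries, which matters once retries can replay the same payload minutes later.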

Services:

  • billing-service — BILL-EPIC-001 through 004, BILL-EPIC-005 (compliance waiver)
  • customer-portal — CUST-EPIC-001 through 006, CUST-EPIC-007 (compliance visibility: blocked/held states, appeals)
  • webhook-dispatcher — HOOK-EPIC-001 through 004
  • dlr-processor — DLR-EPIC-002 (billing emission), DLR-EPIC-003 (webhook trigger)
  • Kong (api-gateway config) — GW-EPIC-004 (API key mgmt), GW-EPIC-005 (billing proxy), GW-EPIC-006 (webhook test)
  • compliance-engine: EP-CE-03 (Hold Queue & Manual Review), EP-CE-04 (Rule & Blocklist Management API), EP-CE-02 remainder (REGEX, RECIPIENT, RATE_VOLUME, TEMPORAL, COMPOSITE rule types)
  • notification-service (early slice) — consumes compliance.message.* events to surface portal notifications (NOTIF-US-012, 013 brought forward)

Frontend: customer-portal (Next.js, port 3002). Compliance views (blocked/held messages, appeal form) per EP-CE-10 subset (US-CE-042, 043, 045).

Why this slice matters: This is the earliest path to monetization. Customers can self-onboard, send production SMS, track delivery, receive webhook callbacks, and get billed. Revenue starts here. Turning compliance enforcement on at M2 — before we scale paid customers — contains blast radius if rules need tuning and establishes the evidence trail from first dollar.

Monetization: Per-segment pricing. Monthly invoiced billing. Self-service signup eliminates sales friction for SMB segment. Blocked / held / rejected / expired messages are not billed (EP-CE-08 / US-CE-036 ensures this is reconcilable end-to-end).

Epics:

| Epic | Service | Stories |
| --- | --- | --- |
| BILL-EPIC-001 | billing-service | BILL-US-001, 002, 003, 004 |
| BILL-EPIC-002 | billing-service | BILL-US-005, 006, 007 |
| BILL-EPIC-003 | billing-service | BILL-US-008, 009, 010 |
| BILL-EPIC-004 | billing-service | BILL-US-011, 012, 013, 014, 015 |
| CUST-EPIC-001 | customer-portal | CUST-US-001–004, 020, 021 |
| CUST-EPIC-002 | customer-portal | CUST-US-005, 006, 007 |
| CUST-EPIC-003 | customer-portal | CUST-US-008, 009 |
| CUST-EPIC-004 | customer-portal | CUST-US-010, 011, 012, 013 |
| CUST-EPIC-005 | customer-portal | CUST-US-014, 015, 016, 017 |
| CUST-EPIC-006 | customer-portal | CUST-US-018, 019 |
| HOOK-EPIC-001 | webhook-dispatcher | HOOK-US-001, 002, 006, 007, 014 |
| HOOK-EPIC-002 | webhook-dispatcher | HOOK-US-003, 004, 005, 011 |
| HOOK-EPIC-003 | webhook-dispatcher | HOOK-US-008, 009, 012, 013 |
| HOOK-EPIC-004 | webhook-dispatcher | HOOK-US-010 |
| DLR-EPIC-002 | dlr-processor | DLR-US-007 |
| DLR-EPIC-003 | dlr-processor | DLR-US-008 |
| GW-EPIC-004 | api-gateway (Kong) | GW-US-008, 009 |
| GW-EPIC-005 | api-gateway (Kong) | GW-US-010, 011 |
| GW-EPIC-006 | api-gateway (Kong) | GW-US-012 |
| EP-CE-02 (remainder) | compliance-engine | REGEX, RECIPIENT, RATE_VOLUME, TEMPORAL, COMPOSITE (US-CE-006, 008, 012 + rule types split from US-CE-011) |
| EP-CE-03 | compliance-engine | Hold queue, single-item review, auto-expiry, bulk-review (US-CE-014, 015, 016, 017, 018) |
| EP-CE-04 | compliance-engine | Rule CRUD + versioning, rule-set mgmt, blocklist mgmt, keyword-list mgmt (US-CE-019, 020, 021, 022) |
| BILL-EPIC-005 | billing-service | Compliance event consumer: waive non-dispatched messages (US-CE-036) |
| CUST-EPIC-007 | customer-portal | Compliance message states, appeals UI (US-CE-042, 045) |
| NOTIF-EPIC (early) | notification-service | Hold/block portal alerts (US-CE-043) |

Story count: ~80 | Sprint allocation: Sprint 6–8

M2 exit criterion (new): at least one production tenant has had a message correctly HELD, manually released via admin-dashboard, billed on eventual DLR, and portal-notified of each state change. Compliance-audit export successfully round-trips to CSV for a 7-day window.


Slice 4 / M3 — Competitive Differentiator

Capabilities: Admin dashboard (operator CRUD, routing rules, user/role management, message logs, billing overview, system health, compliance rule authoring, hold-queue review UI, tenant scoring dashboards), advanced routing (per-account, cost-based, round-robin, priority), operator management with Vault credential storage, analytics pipeline. Tenant external OIDC/SAML SSO via Keycloak broker (the enterprise-unlock capability). Continuous tenant compliance scoring + risk tier enforcement.

Services:

  • admin-dashboard — ADMDASH-EPIC-001 through 007, ADMDASH-EPIC-008 (compliance console: rules, hold queue, tenant scores, audit log viewer)
  • operator-management-service — OPS-EPIC-001 through 004
  • routing-engine — ROUTE-EPIC-003 (rules management), ROUTE-EPIC-004 (observability)
  • smpp-connector — SMPP-EPIC-005 (observability)
  • analytics-service — ANLYT-EPIC-001, ANLYT-EPIC-002, ANLYT-EPIC-005 (compliance analytics: audit stream archive, violations, tier transitions)
  • auth-service: AUTH-EPIC-005 (tenant external OIDC SSO), AUTH-EPIC-006 (tenant external SAML 2.0 SSO), AUTH-EPIC-007 (SCIM 2.0 inbound provisioning)
  • compliance-engine: EP-CE-05 (Tenant Scoring & Risk Tiering), EP-CE-07 remainder (alerts, runbooks, HPA, deployment hardening), EP-CE-10 remainder (US-CE-044 tenant score visibility in portal)
  • notification-service: NOTIF-EPIC-001 early slice, extended for compliance alert routing + email delivery prefs

Frontend: admin-dashboard (Next.js, port 3001) including the compliance console. Customer-portal surfaces tenant score + tier guidance copy.

Why this slice matters: This is where the platform becomes operationally competitive and enterprise-sellable. Admin visibility, intelligent routing, operator failover driven by health events, analytics, enterprise SSO against the customer's own IdP, and automated tenant risk tiering are the six things enterprise customers will ask about in pre-sales. Together they close the enterprise gap vs commodity SMS APIs.

Monetization: Enterprise tier pricing enabled by admin tooling + SSO. Tiered routing creates upsell opportunity. Tenant risk tiering reduces support cost by auto-enforcing volume limits on risky tenants.

Epics:

| Epic | Service | Stories |
| --- | --- | --- |
| ADMDASH-EPIC-001 | admin-dashboard | ADMDASH-US-001, 002 |
| ADMDASH-EPIC-002 | admin-dashboard | ADMDASH-US-003, 004 |
| ADMDASH-EPIC-003 | admin-dashboard | ADMDASH-US-005–008, 018, 019 |
| ADMDASH-EPIC-004 | admin-dashboard | ADMDASH-US-013, 014, 015 |
| ADMDASH-EPIC-005 | admin-dashboard | ADMDASH-US-009, 010, 017 |
| ADMDASH-EPIC-006 | admin-dashboard | ADMDASH-US-011, 012, 020 |
| ADMDASH-EPIC-007 | admin-dashboard | ADMDASH-US-016 |
| OPS-EPIC-001 | operator-mgmt | OPS-US-001–005 |
| OPS-EPIC-002 | operator-mgmt | OPS-US-006, 007 |
| OPS-EPIC-003 | operator-mgmt | OPS-US-008, 009 |
| OPS-EPIC-004 | operator-mgmt | OPS-US-010, 011, 012 |
| ROUTE-EPIC-003 | routing-engine | ROUTE-US-004, 005, 006, 011, 012 |
| ROUTE-EPIC-004 | routing-engine | ROUTE-US-013, 014 |
| SMPP-EPIC-005 | smpp-connector | SMPP-US-013, 014 |
| ANLYT-EPIC-001 | analytics-service | ANLYT-US-001, 002, 003 |
| ANLYT-EPIC-002 | analytics-service | ANLYT-US-004, 005, 006 |
| ANLYT-EPIC-005 | analytics-service | Compliance analytics: audit archive, violations dashboards, tier-transition reports |
| ADMDASH-EPIC-008 | admin-dashboard | Compliance console: rule authoring, hold-queue review UI, tenant scores + overrides, audit log viewer |
| AUTH-EPIC-005 | auth-service | Tenant external OIDC SSO (brokered via Keycloak): discovery URL registration, mapper provisioning, SSO start/callback, external_identities linking |
| AUTH-EPIC-006 | auth-service | Tenant external SAML 2.0 SSO (brokered via Keycloak): metadata intake, SP endpoints, ACS/SLS |
| AUTH-EPIC-007 | auth-service | SCIM 2.0 inbound: Users + Groups CRUD, per-tenant bearer tokens, Keycloak mirror |
| EP-CE-05 | compliance-engine | Tenant scoring worker, REST endpoints, manual tier override (US-CE-023, 024, 025) |
| EP-CE-07 (remainder) | compliance-engine | Alerts + runbook, K8s deployment with HPA + PDB (US-CE-031, 032) |
| EP-CE-10 (score visibility) | compliance-engine + customer-portal | Tenant-visible score + tier + guidance (US-CE-044) |

Story count: ~80 | Sprint allocation: Sprint 9–11

M3 exit criterion: one enterprise tenant onboarded via Azure AD OIDC or Okta SAML in staging; tenant compliance score + tier displayed in both admin and tenant portals; automated SUSPENDED-tier → auto-HOLD enforcement validated end-to-end.


Slice 5 / M4 — Full Platform GA

Capabilities: Notification service (welcome emails, invoice emails, operator alerts, system alerts), full analytics API with caching, DLR observability hardening, local-LLM AI classification for compliance (AI_CLASSIFICATION + DLR_ABUSE rule types), compliance reporting & audit export (TENANT_AUDIT, VIOLATION_SUMMARY, etc.), GDPR erasure (auth + compliance-engine consumers), data retention policies, production observability (dashboards, runbooks, SLOs), a11y (keyboard nav, dark mode).

Services:

  • notification-service — NOTIF-EPIC-001 through 003 (including compliance-event consumption beyond the early slice)
  • analytics-service — ANLYT-EPIC-003 (query API), ANLYT-EPIC-004 (retention + observability)
  • dlr-processor — DLR-EPIC-004 (observability)
  • compliance-engine: EP-CE-02 finish (AI_CLASSIFICATION, DLR_ABUSE rule types; US-CE-007, 009), EP-CE-06 (Reporting & Audit; US-CE-026, 027), EP-CE-09 (Local LLM Platform; US-CE-038, 039, 040)
  • Platform-wide — security hardening, Vault integration, mTLS, final K8s HPA tuning, GDPR erasure end-to-end (incl. compliance hold-queue PII redaction on auth.user.erased.v1)

Frontend: Admin dashboard + customer portal polish, a11y stories (ADMDASH-US-021, 022). Admin-dashboard compliance console gains AI-rule authoring + report generation surface.

Why this slice matters: GA readiness. Every SLO has a dashboard and alert. Every runbook is published. AI-assisted compliance closes the last detection gap (sophisticated fraud/phishing that static rules miss). Regulator-ready audit exports (13-month retention + TENANT_AUDIT report) satisfy the evidence ask from banking and telecom auditors. Notification workflows close the loop on invoicing, operator incidents, and compliance events. The platform is contractually supportable.

Monetization: Full pricing tiers. Enterprise SLA contracts. Premium support tier. Compliance tier (enhanced AI rules, longer retention, custom reports) as upsell.

Epics:

| Epic | Service | Stories |
| --- | --- | --- |
| NOTIF-EPIC-001 | notification-service | NOTIF-US-001, 002, 006, 007 |
| NOTIF-EPIC-002 | notification-service | NOTIF-US-003, 004, 005, 008 |
| NOTIF-EPIC-003 | notification-service | NOTIF-US-009, 010, 011 |
| ANLYT-EPIC-003 | analytics-service | ANLYT-US-007, 008, 009, 010 |
| ANLYT-EPIC-004 | analytics-service | ANLYT-US-011, 012 |
| DLR-EPIC-004 | dlr-processor | DLR-US-011, 012 |
| EP-CE-02 (finish) | compliance-engine | AI_CLASSIFICATION + DLR_ABUSE rule types (US-CE-007, 009) |
| EP-CE-06 | compliance-engine | Compliance report generation + audit-log query (US-CE-026, 027) |
| EP-CE-09 (core) | compliance-engine + local-LLM | vLLM deployment, provider abstraction, cost/perf monitoring (US-CE-038, 039, 040) |
| Cross-cutting | admin-dashboard | ADMDASH-US-021, 022 |
| GDPR erasure | auth-service + compliance-engine | auth.user.erased.v1 consumer on compliance-engine redacts hold-queue PII |

Story count: ~30 | Sprint allocation: Sprint 12–14

M4 exit criterion: 500 msg/s sustained load test with compliance enforcement on, AI classification of ≥ 30% of traffic, local LLM P95 ≤ 300 ms; TENANT_AUDIT report generates for a 90-day window in ≤ 5 minutes; GDPR erasure redacts all tenant PII across auth and compliance schemas within SLA.


Slice 6 / M5 — Post-GA Expansion

Capabilities: Hot-reload SMPP operator config (zero-downtime), multi-currency billing, advanced analytics (real-time streaming), API v2 planning, additional MNO integrations, geographic expansion. Classification accuracy evaluation harness for the compliance AI. Firebase legacy provider retirement — migrate residual Firebase tenants to Keycloak and remove the FirebaseLegacyProvider adapter.

Services:

  • smpp-connector — SMPP-US-015 (hot-reload)
  • billing-service — multi-currency expansion
  • analytics-service — real-time streaming pipeline
  • compliance-engine: EP-CE-09 finish (US-CE-041, classification accuracy harness)
  • auth-service: AUTH-EPIC-008 (Firebase legacy retirement), which migrates Firebase-only users to Keycloak, removes FirebaseLegacyProvider, and prunes auth.external_identities rows for provider_id='firebase-legacy'
  • All services — performance tuning, capacity planning

Frontend: Enhanced dashboards, real-time analytics views, Firebase legacy deprecation banners + migration wizard in customer portal.

Why this slice matters: Operational excellence and market expansion. Hot-reload eliminates maintenance windows for operator changes. Multi-currency unlocks international markets. Real-time analytics enables premium tier pricing. Retiring Firebase simplifies the identity surface to a single provider class (Keycloak + external brokering) and cuts one operational dependency. The classification accuracy harness turns the AI into a measured and continuously-improving asset rather than a black box.

Monetization: International expansion revenue. Premium analytics tier. Reduced ops cost via hot-reload and single-IdP posture.

Story count: ~15 (new stories for expansion + compliance AI harness + IdP migration) | Sprint allocation: Sprint 15–18

M5 exit criterion: zero Firebase tenants in tenant_identity_providers with status = 'active'; classification accuracy report shows ≥ baseline F1 across all categories.


3. Critical Path Analysis

3.1 Earliest Path to a Working Product (M1)

Infrastructure (M0) → auth-service (M0) → api-gateway (M0→M1) → sms-orchestrator (M1) → routing-engine (M1) → smpp-connector (M1) → dlr-processor (M1)

Critical chain: auth must be stable before API gateway can validate requests. Orchestrator depends on routing-engine (gRPC). SMPP connector depends on operator configs. DLR processor depends on SMPP connector publishing sms.dlr.inbound.

Duration: 10 weeks (M0 + M1).

3.2 Earliest Path to Monetization (M2)

Working SMS flow (M1) → billing-service (M2) → customer-portal (M2) → webhook-dispatcher (M2)

Duration: 16 weeks (M0 + M1 + M2). Revenue from Sprint 8 onward.

3.3 Parallelizable Workstreams

| Workstream | Can run parallel with | Constraint |
| --- | --- | --- |
| customer-portal (CUST-EPIC-001–003) | billing-service (BILL-EPIC-001–002) | Portal can stub billing APIs initially |
| admin-dashboard (ADMDASH-*) | operator-management-service (OPS-*) | Both in M3; dashboard consumes OPS APIs |
| notification-service (NOTIF-*) | analytics-service (ANLYT-003–004) | Independent event consumers |
| webhook-dispatcher (HOOK-*) | billing-service (BILL-003–004) | Both consume DLR events independently |
| routing-engine advanced (ROUTE-003–004) | smpp-connector observability (SMPP-005) | No dependency |
| compliance-engine rule types (EP-CE-02) | sms-orchestrator ORCH-EPIC-001–004 | Rule types only need a stable MessageContext; orchestrator + compliance teams parallel from Sprint 3 |
| Keycloak operations + auth-service IdP abstraction | All other M0 work | Platform-team owns Keycloak; service-team owns auth-service; shared contract freeze at end of Sprint 2 |
| Tenant external SSO (AUTH-EPIC-005, 006, 007) | Admin dashboard compliance console (ADMDASH-EPIC-008) | Both M3; independent but share the admin-dashboard frontend; feature-flag per capability |
| Compliance scoring worker (EP-CE-05) | Compliance hold-queue bulk-review (EP-CE-03 US-018) | Independent subsystems; both consume DLR stats + audit log |

3.4 Architectural Freeze Points

| Element | Freeze by | Reason |
| --- | --- | --- |
| NATS JetStream stream definitions (incl. compliance.* + auth.events.idp.*) | Sprint 1 | Every async service depends on stream names, subjects, consumer configs |
| PostgreSQL schema conventions (UUID PKs, tenant_id, timestamps) | Sprint 1 | All Prisma schemas derive from this |
| Platform JWT claim shape (incl. idp claim) and IdentityProvider port | Sprint 2 | Every service validates tokens; any IdP change must go through the port |
| Keycloak realm-per-environment convention + Admin REST client config | Sprint 2 | Provisioning new tenant IdPs depends on this |
| ComplianceService.v1 gRPC proto + compliance schema | Sprint 2 | sms-orchestrator integration and rule authoring depend on this |
| RBAC roles (incl. platform.compliance.*) + API key format | Sprint 2 | Every service validates scopes against this |
| gRPC routing-engine contract | Sprint 3 | Orchestrator depends on this; changing it cascades |
| SMPP message ID correlation strategy | Sprint 3 | DLR processing, billing, and status tracking all depend on this |
| Billing event schema (incl. compliance waiver) | Sprint 5 | DLR processor, billing service, compliance-engine, analytics all produce/consume this |
| Webhook payload schema | Sprint 5 | Customer integrations depend on this; breaking changes lose trust |
| API v1 response envelope (incl. compliance REST surface) | Sprint 3 | Customer-facing; versioned; must not break |

4. Engineering Roadmap (Detailed)

4.1 M0 — Platform Foundation (Sprint 1–2)

Capabilities delivered:

  • Kubernetes cluster with 5 namespaces (ghasi-prod, ghasi-identity, ghasi-data, ghasi-obs, ghasi-vault)
  • PostgreSQL 16 HA, Redis 7 cluster, NATS 3-node JetStream cluster (with compliance.* + auth.events.* streams)
  • Keycloak HA deployment (2 replicas) in ghasi-identity with Postgres-backed storage; realms ghasi-local, ghasi-staging provisioned with Admin-REST bootstrap
  • auth-service with IdentityProvider port + KeycloakProvider (default), NativeProvider (break-glass), stub FirebaseLegacyProvider; login via Keycloak OIDC; JWKS exposure; RBAC + API-key CRUD
  • compliance-engine scaffolded (NestJS dual-transport gRPC/HTTP, Prisma migrations for full compliance schema, mTLS on gRPC, health/metrics)
  • Kong edge gateway with jwt + rate-limiting-advanced + ghasi-api-key-lookup custom plugin
  • CI/CD pipeline: lint → test → build → deploy (GitHub Actions → K8s)
  • Shared packages: shared-types, shared-utils, shared-config, nats-client, db-client, logging, compliance-proto-client
  • Docker Compose for local dev with mock SMPP simulator, Keycloak with preloaded realm, mock-oidc (simulating a tenant IdP)

Services: Kong, auth-service, compliance-engine (scaffold), Keycloak (infra), infrastructure

Dependencies: K8s cluster, DNS, TLS certificates, Vault, object storage (for report output in later milestones)

Risks & mitigations:

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Keycloak operational learning curve (HA, upgrades, realm import/export) | Delayed M0 | Spike in Sprint 1; document runbook; vendor-neutral OIDC port so Keycloak swap is possible later |
| compliance-engine proto churn after scaffolding | Cascading changes in M1 | Freeze ComplianceService.v1 proto + EvaluateComplianceRequest/Response at end of Sprint 2 |
| NATS JetStream learning curve | Delayed stream config | Spike in Sprint 1; document patterns |
| Prisma migration conflicts (especially partitioned tables in compliance) | Schema drift | Single migration runner; advisory locks; partition provisioning cron tested against staging |

Acceptance criteria:

  • auth-service health probe returns 200 with DB + Redis + NATS + Keycloak Admin REST ready
  • Platform JWT issued by auth-service after Keycloak OIDC login validates through Kong jwt plugin
  • API key created, listed, revoked via REST; sha256 hashing verified; Kong ghasi-api-key-lookup plugin resolves key → consumer
  • RBAC lookup returns correct role within 50ms (Redis cache hit); platform.compliance.* roles provisioned
  • compliance-engine /health/ready returns 200 with all deps up; stub EvaluateCompliance handler returns valid response over mTLS
  • CI pipeline deploys to staging on merge to main
  • Local Docker Compose stack: docker compose up starts all infra + Keycloak + Kong + auth-service + compliance-engine in < 90s
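The API-key criterion above (sha256 hashing, then lookup by the Kong plugin) can be sketched as follows. This is a minimal illustration, assuming a hypothetical key format (`gh_` prefix plus 32 random bytes in hex) and hypothetical function names; the roadmap only states that keys are hashed with sha256.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface IssuedKey {
  plaintext: string; // shown to the customer exactly once
  sha256: string;    // hex digest persisted for later lookup
}

function hashApiKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}

// Assumed key format: "gh_" prefix + 32 random bytes, hex-encoded.
function issueApiKey(): IssuedKey {
  const plaintext = "gh_" + randomBytes(32).toString("hex");
  return { plaintext, sha256: hashApiKey(plaintext) };
}

// Lookup compares digests, so the stored value never reveals the key.
function matchesStoredKey(presented: string, storedSha256: string): boolean {
  return hashApiKey(presented) === storedSha256;
}
```

A plain digest works as a lookup index here because the key itself carries 256 bits of randomness; a password-style slow hash is aimed at low-entropy secrets instead.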

Release checklist:

  • All shared packages published to private registry
  • Prisma migrations run cleanly on fresh DB (incl. monthly partitions for evaluation_log, audit_log, score_history)
  • NATS streams created with correct retention policies (incl. 13-month COMPLIANCE_AUDIT)
  • Vault configured with auth-service, compliance-engine, and Keycloak secrets; mTLS PKI engine issuing certs
  • Keycloak realm exports committed to repo as disaster-recovery artefact
  • Grafana dashboards for auth-service, Kong, Keycloak, compliance-engine deployed

4.2 M1 — First Marketable Product (Sprint 3–5)

Capabilities delivered:

  • POST /v1/sms/send with validation, rate limiting, idempotency
  • GET /v1/sms/:messageId/status for status polling
  • SMS orchestrator: consume → validate → route → publish → retry → DLQ
  • Routing engine: gRPC operator selection, longest-prefix matching, health-aware failover
  • SMPP connector: bind, submit_sm, enquire_link, TPS throttling, failover
  • DLR processor: receive deliver_sm → normalize → update status → persist receipt
  • Full message lifecycle: QUEUED → ROUTING → ROUTED → SENT → DELIVERED/FAILED
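The longest-prefix matching used for operator selection can be illustrated with a short sketch. The route table shape, the operator identifiers, and the sample prefixes are hypothetical; a production routing-engine would likely use a trie or an indexed lookup rather than a linear scan.

```typescript
interface PrefixRoute {
  prefix: string;     // leading digits of the destination MSISDN
  operatorId: string; // hypothetical operator identifier
}

// Returns the operator whose prefix is the longest match for the number,
// or null when no prefix matches.
function selectOperator(msisdn: string, routes: PrefixRoute[]): string | null {
  const digits = msisdn.replace(/\D/g, ""); // drop "+", spaces, dashes
  let best: PrefixRoute | null = null;
  for (let i = 0; i < routes.length; i++) {
    const r = routes[i];
    const isMatch = digits.slice(0, r.prefix.length) === r.prefix;
    if (isMatch && (best === null || r.prefix.length > best.prefix.length)) {
      best = r;
    }
  }
  return best === null ? null : best.operatorId;
}

// Hypothetical route table: a country-level default plus two narrower prefixes.
const sampleRoutes: PrefixRoute[] = [
  { prefix: "93", operatorId: "op-default" },
  { prefix: "9370", operatorId: "op-a" },
  { prefix: "9379", operatorId: "op-b" },
];

console.log(selectOperator("+93 70 123 4567", sampleRoutes)); // op-a
```

The narrower prefix wins over the country-level one, which is what lets a default route coexist with operator-specific ranges in the same table.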

Services: api-gateway (SMS endpoints), sms-orchestrator, routing-engine, smpp-connector, dlr-processor

Frontend: None

Dependencies: M0 complete; at least 1 SMPP operator configured (test/sandbox)

Risks & mitigations:

| Risk | Impact | Mitigation |
| --- | --- | --- |
| SMPP operator sandbox unavailable | Blocks E2E testing | Ship mock SMPP simulator; test against it |
| gRPC contract instability | Cascading changes | Freeze proto in Sprint 3; contract tests |
| Message loss during NATS consumer scaling | Data integrity | Durable consumers + explicit ack; chaos test |
| TPS throttling race conditions | Duplicate SMS | Redis Lua atomic script; integration tests |

Acceptance criteria:

  • SMS sent via API arrives on test handset within 30s (happy path)
  • DLR received and message status updated to DELIVERED
  • Idempotent resend with same key returns original response
  • Rate limit returns 429 when exceeded
  • Failed SMPP send retries 3x with exponential backoff
  • After 3 failures, message routes to DLQ
  • Operator failover triggers when primary disconnects
  • P95 routing decision < 50ms
  • P95 API-to-NATS-publish < 100ms
  • P95 EvaluateCompliance < 500ms at 200 RPS
  • compliance.audit.v1 emitted for every evaluated SMS; Grafana dashboard shows audit rate ≈ orchestrator throughput
  • Kill-compliance-engine chaos test: messages stay in EVALUATING, redeliver 3x, move to sms.outbound.deadletter with reason compliance_unavailable; zero carrier dispatches
  • Observation-mode rule set active — all rules FLAG, no HOLD/BLOCK verdicts produced
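The retry criteria above (three retries with exponential backoff, then DLQ) can be read as the loop below. This is a sketch under stated assumptions: it treats "retries 3x" as one initial attempt plus three retries, uses a base delay that doubles per retry, and records the delays instead of sleeping. The function names and the `send` callback are hypothetical. If the intended meaning is three attempts in total, the loop bound shrinks by one.

```typescript
const MAX_RETRIES = 3;

// Assumed schedule: baseMs, 2 * baseMs, 4 * baseMs for retries 1, 2, 3.
function backoffMs(retry: number, baseMs: number): number {
  return baseMs * Math.pow(2, retry - 1);
}

type SendOutcome =
  | { kind: "SENT"; attempts: number }
  | { kind: "DLQ"; attempts: number; waitedMs: number[] };

// `send` stands in for an SMPP submit and returns true on success.
function sendWithRetry(send: (attempt: number) => boolean, baseMs: number): SendOutcome {
  const waited: number[] = [];
  // Attempt 1 is the initial send; attempts 2..4 are the three retries.
  for (let attempt = 1; attempt <= MAX_RETRIES + 1; attempt++) {
    if (send(attempt)) {
      return { kind: "SENT", attempts: attempt };
    }
    if (attempt <= MAX_RETRIES) {
      // A real worker would wait this long before the next attempt.
      waited.push(backoffMs(attempt, baseMs));
    }
  }
  return { kind: "DLQ", attempts: MAX_RETRIES + 1, waitedMs: waited };
}
```

With a 1000 ms base, a send that always fails waits 1000, 2000, and 4000 ms across its retries before the message is handed to the DLQ.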

Release checklist:

  • All 5 services passing health probes
  • Contract tests (orchestrator ↔ routing-engine gRPC) green
  • NATS consumer lag dashboards deployed
  • SMPP session state dashboard deployed
  • Runbooks: "SMPP operator disconnect", "NATS consumer lag spike", "DLQ depth alert"
  • Load test: 100 msg/s sustained for 10 min with < 0.1% loss

4.3 M2 — First Sellable Product (Sprint 6–8)

Capabilities delivered:

  • Customer portal: signup, login, API key management, test SMS, message logs, webhook config, billing dashboard
  • Billing service: event ingestion, pricing resolution, invoice generation, usage API
  • Webhook dispatcher: HMAC-signed delivery, retry, dead-letter, delivery logging
  • DLR → billing event emission (exactly-once)
  • DLR → webhook dispatch trigger (conditional on callbackUrl)
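HMAC-signed webhook delivery can be illustrated as follows. This is a minimal sketch assuming HMAC-SHA256 over the raw request body with a hex-encoded digest; the actual algorithm, header name, and any timestamp scheme used by webhook-dispatcher are not stated here, so treat the function names as hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw webhook body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


secret = b"per-customer-webhook-secret"   # hypothetical value
body = b'{"messageId":"m1","status":"DELIVERED"}'
signature = sign_payload(secret, body)
print(verify_signature(secret, body, signature))
```

The receiver must verify against the exact bytes received, before any JSON parsing or re-serialisation, otherwise whitespace or key-order changes will invalidate the signature.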

Services: billing-service, customer-portal, webhook-dispatcher, api-gateway (billing/webhook endpoints), dlr-processor (billing + webhook emission)

Frontend: customer-portal (Next.js)

Dependencies: M1 complete; Stripe (or equivalent) payment integration for invoice payment, with manual payment handling acceptable initially

Risks & mitigations:

| Risk | Impact | Mitigation |
|---|---|---|
| Billing event double-counting | Revenue integrity | Redis exactly-once dedup key; reconciliation job |
| Customer portal UX friction | Onboarding drop-off | Iterative UX testing with 3 beta customers |
| Webhook endpoint unreliable | Customer trust | Retry with backoff; dead-letter + admin visibility |
| Invoice generation race condition | Duplicate invoices | PostgreSQL advisory lock; idempotent cron |

Acceptance criteria:

  • Customer signs up (Keycloak registration flow), creates API key, sends test SMS, sees DLR in message log
  • Webhook delivered with valid HMAC signature within 5s of DLR
  • Monthly invoice generated with correct segment count and pricing; blocked/held/expired messages are NOT billed
  • Webhook retry exhaustion routes to dead-letter; visible in customer portal
  • CSV export of message logs works for 10k+ records
  • Billing usage API returns correct totals matching event log
  • Compliance enforcement is on: seeded rule set has at least one BLOCK keyword and one HOLD rule; triggering each from customer-portal test SMS produces the expected terminal state and portal notification
  • Admin reviewer can RELEASE a held message from admin-dashboard and see it subsequently billed on DLR
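The billing criteria above (no double-counting, no charge for blocked, held, or expired messages, and a released message billed on its DLR) can be sketched as a small aggregation. The event shape and status names below are assumptions for illustration, and the in-memory set stands in for whatever dedup store the billing service actually uses.

```python
from collections import defaultdict

# Assumed status names for messages that must never be billed.
NON_BILLABLE = {"BLOCKED", "HELD", "EXPIRED"}


def aggregate_usage(events: list) -> dict:
    """Sum billable segments per account, counting each message at most once."""
    billed_message_ids = set()
    totals = defaultdict(int)
    for event in events:
        if event["status"] in NON_BILLABLE:
            continue  # compliance outcomes and expiries are not charged
        if event["message_id"] in billed_message_ids:
            continue  # duplicate delivery of the same billing event
        billed_message_ids.add(event["message_id"])
        totals[event["account_id"]] += event["segments"]
    return dict(totals)


events = [
    {"message_id": "m1", "account_id": "a", "status": "DELIVERED", "segments": 2},
    {"message_id": "m1", "account_id": "a", "status": "DELIVERED", "segments": 2},  # duplicate
    {"message_id": "m2", "account_id": "a", "status": "BLOCKED", "segments": 1},
    {"message_id": "m3", "account_id": "a", "status": "HELD", "segments": 1},
    {"message_id": "m3", "account_id": "a", "status": "DELIVERED", "segments": 1},  # released later
    {"message_id": "m4", "account_id": "b", "status": "EXPIRED", "segments": 3},
]
usage = aggregate_usage(events)
print(usage)
```

Checking the status before the dedup key matters: a HELD event must not consume the message's dedup slot, or the later DELIVERED event for the released message would be skipped.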

Release checklist:

  • Customer portal deployed behind Keycloak OIDC SSO (default realm); no Firebase dependency in customer portal
  • Billing service reconciliation job runs nightly and includes compliance-waiver reconciliation
  • Webhook HMAC verification documented in customer docs
  • Pricing rules seeded for beta tier; compliance seed rules reviewed by Trust & Safety
  • 5 beta customers onboarded and sending production traffic
  • Compliance hold-queue SLO dashboard deployed (queue depth, oldest pending, reviewer response time)

4.4 M3 — Competitive Differentiator (Sprint 9–11)

Capabilities delivered:

  • Admin dashboard: operator CRUD, routing rule management, user/role management, message logs, billing overview, system health
  • Operator management service: Vault-secured credentials, audit trail, health propagation
  • Advanced routing: per-account overrides, cost-based selection, priority, round-robin
  • Analytics pipeline: event ingestion, hourly/daily aggregation, operator + account metrics
  • SMPP connector observability: metrics, structured logging, tracing
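The advanced-routing capability above lists per-account overrides, cost-based selection, and priority. A minimal sketch of how these might combine is shown below; the precedence (override first, then lowest cost, then priority as tie-breaker) and the `Candidate` fields are assumptions, not taken from the routing-engine spec.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical operator candidate for one destination prefix."""
    operator_id: str
    cost_per_segment: float
    priority: int          # lower number = preferred on cost ties (assumed)
    healthy: bool = True


def choose_operator(candidates: list[Candidate], account_override: str | None = None) -> str | None:
    """Pick an operator: healthy override first, else cheapest, ties by priority."""
    usable = [c for c in candidates if c.healthy]
    if not usable:
        return None
    if account_override is not None:
        for candidate in usable:
            if candidate.operator_id == account_override:
                return candidate.operator_id
    best = min(usable, key=lambda c: (c.cost_per_segment, c.priority))
    return best.operator_id


candidates = [
    Candidate("op-a", cost_per_segment=0.012, priority=2),
    Candidate("op-b", cost_per_segment=0.009, priority=3),
    Candidate("op-c", cost_per_segment=0.009, priority=1),
    Candidate("op-d", cost_per_segment=0.005, priority=1, healthy=False),
]
print(choose_operator(candidates))
```

Here the unhealthy `op-d` is excluded despite being cheapest, and `op-c` wins the cost tie with `op-b` on priority.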

Services: admin-dashboard, operator-management-service, routing-engine (advanced), smpp-connector (observability), analytics-service (pipeline)

Frontend: admin-dashboard (Next.js)

Dependencies: M2 complete; Vault configured for operator credentials

Risks & mitigations:

| Risk | Impact | Mitigation |
|---|---|---|
| Vault availability | Operator connections fail | Fallback to K8s Secrets (degraded mode) |
| Routing rule complexity | Edge-case bugs | Property-based testing on rule engine |
| Admin dashboard scope creep | Delayed delivery | Strict epic scope; defer nice-to-haves to M5 |

Acceptance criteria:

  • Admin creates operator via dashboard; SMPP connector binds within 30s
  • Routing rule change reflected in next routing decision within 5s
  • Cost-based routing selects cheapest operator for given prefix
  • Operator health event disables operator in routing within 10s
  • Analytics dashboard shows correct hourly aggregates within 2 hours of events
  • Credential rotation via Vault does not drop SMPP session
  • Tenant compliance score recomputed every 15 min; tier transitions emit compliance.tenant.tier.changed.v1; SUSPENDED → auto-HOLD observed end-to-end
  • Enterprise tenant onboarded via Azure AD OIDC (or Okta SAML) in staging: admin registers discovery URL / metadata, user SSOs into portal, platform JWT issued with idp=tenant-oidc:<tenantId> claim, auth.external_identity.linked.v1 emitted
  • SCIM push from tenant IdP provisions 100 users into Keycloak + mirrors into auth.users within 5 s
  • Admin-dashboard compliance console: author rule, assign to tenant, observe verdict change in real time via SSE

Release checklist:

  • Admin dashboard deployed behind Keycloak realm with platform.* roles; per-environment admin client configured
  • Vault policies configured for operator-management-service and Keycloak Admin REST credentials
  • Analytics aggregation cron verified over 7-day window
  • 20+ customers migrated to tiered pricing
  • At least one enterprise tenant on external OIDC SSO in production
  • Runbooks: "operator credential rotation", "routing rule misconfiguration", "tenant IdP onboarding", "tenant IdP emergency disable", "tenant compliance score override"

4.5 M4 — Full Platform GA (Sprint 12–14)

Capabilities delivered:

  • Notification service: welcome emails, invoice emails, operator alerts, system alerts, delivery preferences
  • Analytics query API with Redis caching
  • Data retention policies enforced (hourly → daily → archive → purge)
  • GDPR compliance: account erasure flow
  • Full observability: every service has health probes, Prometheus metrics, OpenTelemetry traces, Grafana dashboards, Loki log aggregation
  • HPA tuning: load-tested and validated for 500 msg/s sustained
  • Security audit: mTLS between services, Vault for all secrets, no PII in logs
  • Accessibility: keyboard navigation, dark mode (admin dashboard)

Services: notification-service, analytics-service (query API + retention), dlr-processor (observability), all services (hardening)

Frontend: Both portals polished; a11y audit clean.

Dependencies: M3 complete; security audit scheduled

Acceptance criteria:

  • Welcome email sent on customer signup within 60s
  • Invoice email sent on invoice generation
  • Operator-down alert reaches admin within 30s
  • Analytics API returns cached results in < 100ms
  • Data retention: hourly data older than 90d archived; raw events > 365d purged; compliance.audit_log retained ≥ 13 months with partition pruning verified
  • GDPR erasure deletes all PII within 72h of request across auth AND compliance schemas (hold-queue body/to redacted; audit log pseudonymised per regulator guidance)
  • Load test: 500 msg/s sustained for 30 min with compliance enforcement on, AI classification on ≥ 30% of traffic; P99 < 500ms end-to-end; 0 message loss
  • Local LLM P95 inference ≤ 300ms; AI cache hit rate ≥ 70% after 24h warm-up
  • TENANT_AUDIT report generates for a 90-day window in ≤ 5 minutes and is regulator-accepted in dry-run review
  • All services pass security checklist
  • All runbooks reviewed and rehearsed
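The retention thresholds in the criteria above (hourly data archived after 90 days, raw events purged after 365 days) can be expressed as a simple age check. The record kinds and the function name are hypothetical, and the compliance audit-log rule (retained at least 13 months) is deliberately left out because its exact cut-off arithmetic is not specified here.

```python
from datetime import datetime, timedelta, timezone

HOURLY_ARCHIVE_AFTER = timedelta(days=90)
RAW_PURGE_AFTER = timedelta(days=365)


def retention_action(kind: str, created_at: datetime, now: datetime) -> str:
    """Return 'archive', 'purge', or 'keep' for a record of the given kind and age."""
    age = now - created_at
    if kind == "hourly_aggregate" and age > HOURLY_ARCHIVE_AFTER:
        return "archive"
    if kind == "raw_event" and age > RAW_PURGE_AFTER:
        return "purge"
    return "keep"


now = datetime(2026, 12, 1, tzinfo=timezone.utc)
old_hourly = now - timedelta(days=91)
recent_raw = now - timedelta(days=200)
old_raw = now - timedelta(days=366)
print(retention_action("hourly_aggregate", old_hourly, now))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps a retention job reproducible in tests and dry runs.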

Release checklist:

  • GA launch checklist signed by engineering + product + security
  • Status page configured
  • On-call rotation established
  • SLA documentation published
  • Customer documentation site live

4.6 M5 — Post-GA Expansion (Sprint 15–18)

Capabilities delivered:

  • Hot-reload SMPP operator config (SMPP-US-015)
  • Multi-currency billing (localized pricing per market)
  • Real-time analytics streaming (WebSocket / SSE)
  • API v2 planning and deprecation strategy
  • Additional MNO integrations (new markets)
  • Performance optimization: connection pooling, batch processing

Services: All services (incremental improvements)

Acceptance criteria:

  • Operator config change applies without SMPP session drop
  • Invoice generated in customer's designated currency
  • Real-time analytics dashboard updates within 5s of event
  • 3+ new MNO operators integrated

5. Competitive Positioning

5.1 Sequencing Advantage

| Phase | What we have | What competitors lack |
|---|---|---|
| M1 (Week 10) | Working SMS API with DLR | Most competitors ship API-only without DLR tracking |
| M2 (Week 16) | Self-service portal + billing + webhooks | Competitors require manual onboarding; no webhook DLR |
| M3 (Week 22) | Intelligent routing + admin tooling + analytics | Commodity APIs have no routing intelligence or admin visibility |
| M4 (Week 28) | GA with SLAs, compliance, full observability | Small competitors lack compliance tooling; large ones are slow |
| M5 (Week 36) | Multi-currency, hot-reload, real-time analytics | Operational excellence that compounds with scale |
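The week numbers in the table follow from the 2-week sprint cadence: a milestone ends in the week equal to its last sprint number times two. A short sketch of that arithmetic, using the sprint ranges given in this roadmap (the dictionary and function names are illustrative):

```python
SPRINT_LENGTH_WEEKS = 2

# First and last sprint of each milestone, as listed in the roadmap.
MILESTONE_SPRINTS = {
    "M0": (1, 2),
    "M1": (3, 5),
    "M2": (6, 8),
    "M3": (9, 11),
    "M4": (12, 14),
    "M5": (15, 18),
}


def milestone_weeks(milestone: str) -> tuple:
    """Return (start_week, end_week) for a milestone, counting weeks from 1."""
    first_sprint, last_sprint = MILESTONE_SPRINTS[milestone]
    start_week = (first_sprint - 1) * SPRINT_LENGTH_WEEKS + 1
    end_week = last_sprint * SPRINT_LENGTH_WEEKS
    return start_week, end_week


for name in MILESTONE_SPRINTS:
    print(name, milestone_weeks(name))
```

For example, M1 spans Sprints 3 to 5, so it runs from week 5 through week 10, matching the "M1 (Week 10)" row.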

5.2 Key Differentiators

  1. Intelligent routing — Cost-based, priority, round-robin, health-aware failover. Most commodity SMS APIs route statically.
  2. Real-time operator failover — SMPP health events propagate to routing within 10s. Competitors often require manual intervention.
  3. Self-service everything — Customer portal eliminates sales friction. Admin dashboard eliminates ops tickets.
  4. Webhook-first DLR delivery — Customers get proactive DLR callbacks instead of polling.
  5. Telecom-grade reliability — NATS JetStream with durable consumers, exactly-once billing, idempotent processing, DLQ with alerting.
  6. Multi-tenant from day one — Every query, every event, every log scoped to account_id. Enterprise customers get isolation guarantees.

6. Timeline (ASCII Gantt)

Week: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Sprint: S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 S13 S14 S15 S16 S17 S18

M0 Foundation
|████████|
S1 S2

M1 Core SMS Flow
|████████████████|
S3 S4 S5
↑ First SMS delivered

M2 First Sellable Product
|████████████████|
S6 S7 S8
↑ First revenue

M3 Competitive Differentiator
|████████████████|
S9 S10 S11
↑ Enterprise-ready

M4 Full Platform GA
|████████████████|
S12 S13 S14
↑ GA Launch

M5 Post-GA Expansion
|████████████████████████|
S15 S16 S17 S18

═══ SERVICE TIMELINE ═══

Keycloak (infra) |████|
auth-service |████████████████████████████████████| ← IdP abstraction, SSO onboarding, SCIM, Firebase retirement
api-gateway (Kong) |████████████|
compliance-engine |████████████████████████████████| ← scaffold M0; rules+pipeline M1; hold-queue+mgmt M2; scoring M3; AI+reporting M4
compliance-ai (LLM) |████████████| ← M4 onward
sms-orchestrator |████████████████| ← core M1; compliance integration M1; release-path M2
routing-engine |████████████████████████|
smpp-connector |████████████████████████████|
dlr-processor |████████████████████████████████|
billing-service |████████████████████| ← compliance waiver added M2
customer-portal |████████████████████| ← compliance views added M2
webhook-dispatcher |████████████████|
operator-mgmt-svc |████████████|
admin-dashboard |████████████████| ← compliance console added M3
analytics-service |████████████████████| ← compliance analytics added M3
notification-service |████████████████████| ← hold/block alerts brought forward to M2

═══ PARALLEL WORKSTREAMS ═══

Stream A (Core): auth/Keycloak → Kong → orchestrator → smpp → dlr
Stream B (Commerce): billing → portal → webhooks
Stream C (Admin): ops-mgmt → admin-dash → analytics
Stream D (Trust/Safety): compliance-engine → rules → hold-queue → scoring → AI
Stream E (Identity): IdP abstraction → tenant OIDC/SAML SSO → SCIM → Firebase retirement
Stream F (Ops): notif → hardening

7. Sprint-to-Milestone Mapping (Summary)

| Sprint | Milestone | Primary Focus |
|---|---|---|
| Sprint 1 | M0 | K8s, DB, NATS, Redis, shared packages; Keycloak HA deployment + realm bootstrap; auth-service + IdentityProvider port scaffold; compliance-engine service scaffold |
| Sprint 2 | M0 | Auth hardening, API-key lifecycle, Kong plugins (ADR-0001); compliance-engine compliance schema + mTLS + health; proto + schema freeze |
| Sprint 3 | M1 | Kong SMS send routes, sms-orchestrator core, routing-engine gRPC; compliance-engine EvaluateCompliance handler + KEYWORD/SENDER_ID/GEO rule types |
| Sprint 4 | M1 | SMPP connector (bind, submit_sm, TPS); orchestrator retry + DLQ; orchestrator ↔ compliance gRPC integration (observation mode) |
| Sprint 5 | M1 | DLR processor, SMPP DLR handling, E2E SMS flow validation; compliance fail-closed chaos test + observation-mode rollout sign-off |
| Sprint 6 | M2 | Billing event ingestion + pricing, customer-portal Keycloak SSO + API keys; compliance rule + rule-set management REST + hold-queue insertion |
| Sprint 7 | M2 | Webhook dispatcher, customer portal (test SMS, message logs); hold-queue admin review (RELEASE/REJECT), notification-service compliance-event consumer |
| Sprint 8 | M2 | Invoice generation, billing dashboard, webhook management, beta launch; billing compliance-waiver consumer, customer-portal compliance states + appeals, compliance enforcement turned on |
| Sprint 9 | M3 | Operator-management service, admin-dashboard Keycloak SSO + operator CRUD; auth-service tenant external OIDC SSO (AUTH-EPIC-005) |
| Sprint 10 | M3 | Advanced routing rules, admin routing management, analytics ingestion; auth-service tenant SAML SSO (AUTH-EPIC-006) + SCIM (AUTH-EPIC-007); compliance scoring worker (EP-CE-05) |
| Sprint 11 | M3 | Admin message logs, billing overview, system health, analytics aggregation; admin-dashboard compliance console (ADMDASH-EPIC-008); first enterprise tenant onboarded on external SSO |
| Sprint 12 | M4 | Notification service core, analytics query API; local-LLM (compliance-ai) deployment + provider abstraction (EP-CE-09 core) |
| Sprint 13 | M4 | Data retention, GDPR erasure across auth + compliance, DLR observability; AI_CLASSIFICATION + DLR_ABUSE rule types finished; compliance report generation + audit-log query (EP-CE-06) |
| Sprint 14 | M4 | Security audit, load testing (500 msg/s with compliance + AI on), a11y, GA launch checklist |
| Sprint 15 | M5 | Hot-reload SMPP config, multi-currency billing groundwork; compliance classification accuracy harness (US-CE-041) |
| Sprint 16 | M5 | Real-time analytics, additional MNO integrations; Firebase legacy retirement kickoff: migration wizard + communication |
| Sprint 17 | M5 | API v2 planning, performance optimization; Firebase legacy retirement execution: bulk migrate residual users to Keycloak |
| Sprint 18 | M5 | Polish, documentation, geographic expansion prep; remove FirebaseLegacyProvider adapter; close out legacy tenant migration |

8. New Epics Added in v1.1 — At a Glance

Identity (2 new epics, +2 superseding renames):

| Epic | Service | Intent | Milestone |
|---|---|---|---|
| AUTH-EPIC-001 (rebaselined) | auth-service | Keycloak baseline (replaces Firebase baseline) | M0 |
| AUTH-EPIC-002 (new) | auth-service | IdP provider abstraction | M0 |
| AUTH-EPIC-005 (new) | auth-service | Tenant external OIDC SSO (brokered) | M3 |
| AUTH-EPIC-006 (new) | auth-service | Tenant external SAML 2.0 SSO (brokered) | M3 |
| AUTH-EPIC-007 (new) | auth-service | SCIM 2.0 inbound provisioning | M3 |
| AUTH-EPIC-008 (new) | auth-service | Firebase legacy retirement | M5 |

Compliance (10 new epics):

| Epic | Service | Intent | Milestone |
|---|---|---|---|
| EP-CE-01 | compliance-engine | Service Foundation | M0 |
| EP-CE-02 | compliance-engine | Rule Engine Core (10 rule types) | M1 (subset) → M2 (remainder) → M4 (AI + DLR) |
| EP-CE-03 | compliance-engine | Hold Queue & Manual Review | M2 |
| EP-CE-04 | compliance-engine | Rule & Blocklist Management API | M2 |
| EP-CE-05 | compliance-engine | Tenant Scoring & Risk Tiering | M3 |
| EP-CE-06 | compliance-engine | Reporting & Audit | M4 |
| EP-CE-07 | compliance-engine | Observability & Production Hardening | M1 (subset) → M3 (remainder) |
| EP-CE-08 | compliance-engine + sms-orchestrator | Async Pipeline Integration | M1 |
| EP-CE-09 | compliance-engine + compliance-ai | Local LLM Platform | M4 (core) → M5 (accuracy harness) |
| EP-CE-10 | compliance-engine + customer-portal + admin-dashboard | Tenant-Facing Web Portal Integration | M2 (subset) → M3 (subset) |

Consumer-side epics for compliance events:

| Epic | Service | Milestone |
|---|---|---|
| ORCH-EPIC-005 | sms-orchestrator | M1 |
| BILL-EPIC-005 | billing-service | M2 |
| CUST-EPIC-007 | customer-portal | M2 |
| ADMDASH-EPIC-008 | admin-dashboard | M3 |
| ANLYT-EPIC-005 | analytics-service | M3 |

End of Roadmap.