
Campaign Service (campaign-service) — Service Overview

Version: 1.0
Status: Draft
Owner: Product
Last Updated: 2026-04-20
Companion: DOMAIN_MODEL · API_CONTRACTS · EVENT_SCHEMAS · AI_INTEGRATION
Related ADR: ADR-0004 National-Backbone Resilience


1. Purpose — Tenant-Facing Marketing & Notification Orchestration

The Campaign Service is the tenant-facing orchestration plane for high-volume, scheduled, segmented SMS programs — the equivalent of Twilio Engage, MessageBird Studio, or Infobip Moments. Where sms-orchestrator is the per-message API and pipeline, the Campaign Service is the per-program surface: define an audience, choose a template, schedule a window, throttle the firehose, run an A/B test, attach a kill-switch, and watch deliverability and conversion roll in.

It is the canonical home of:

  1. Campaign builder: segment query DSL over recipient profiles; schedule; throttle (TPS / per-MNO ceilings); A/B variants; kill-switch.
  2. Template catalog: versioned, approved tenant templates with merge fields, conditional content, and multi-language variants (Pashto / Dari / English / Arabic).
  3. Approved-template workflow: submission → compliance review → approval → publish, paired with the EP-CE-13 trusted-tenant fast-path so that verified senders skip per-message template re-review (per-message compliance checks still run in compliance-engine).
  4. Campaign reporting: deliverability, spend, opt-outs, conversion (URL-callback or pixel), pivot tables, and CSV export.

The service is not on the per-message data plane — it stages and submits batches into sms-orchestrator (or channel-router-service for multi-channel) and consults compliance-engine, consent-ledger-service, and analytics-service along the way.


2. Position in the Platform

                           Tenant Marketer / Ops
                                     │
                                     ▼  https://app.ghasi.af/campaigns
                         ┌────────────────────────┐
                         │    customer-portal     │  (UI only)
                         └───────────┬────────────┘
                                     │ HTTPS / mTLS
                                     ▼
                  ┌─────────────────────────────────────┐
                  │          campaign-service           │
                  │                                     │
                  │  ┌───────────┐  ┌────────────────┐  │
                  │  │ Builder   │  │ Template cat.  │  │
                  │  └───────────┘  └────────────────┘  │
                  │  ┌───────────┐  ┌────────────────┐  │
                  │  │ Scheduler │  │ Kill-switch    │  │
                  │  │ + throt   │  │ ≤ 5s stop      │  │
                  │  └───────────┘  └────────────────┘  │
                  │  ┌───────────┐  ┌────────────────┐  │
                  │  │ A/B alloc │  │ Reporting      │  │
                  │  └───────────┘  └────────────────┘  │
                  └──────────────────┬──────────────────┘
                                     │
     ┌──────────────┬────────────────┼────────────────┬────────────────┐
     ▼              ▼                ▼                ▼                ▼
┌──────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ compl.-  │ │ consent-     │ │ channel-     │ │ sms-         │ │ analytics-   │
│ engine   │ │ ledger-      │ │ router-      │ │ orchestrator │ │ service      │
│ (templ.  │ │ service      │ │ service      │ │ (per-msg     │ │ (campaign    │
│ approv.  │ │ (opt-out     │ │ (multi-ch    │ │ ingest)      │ │ reporting)   │
│ + fast-  │ │ + DND)       │ │ fanout)      │ │              │ │              │
│ path)    │ │              │ │              │ │              │ │              │
└──────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘

The campaign service composes the platform's existing services into a higher-level abstraction; it does not duplicate any of them.


3. Bounded Context

| Dimension | Value |
|---|---|
| Domain | Marketing / Bulk Notification / Tenant Productivity |
| Owner squad | Product |
| Deployment unit | Kubernetes Deployment campaign-service (NestJS API + worker) |
| Communication style | Inbound: HTTPS (customer-portal, tenant API) · Outbound: HTTPS / gRPC to sms-orchestrator, channel-router-service, compliance-engine, consent-ledger-service, analytics-service · NATS for events |
| Storage | PostgreSQL schema campaign · Redis (kill-switch, dedupe, throttle counters) · ClickHouse via analytics-service for reporting reads |
| Failure mode | Fail-safe-stop: on dependency failure (compliance, consent), the campaign is paused with a tenant-visible reason; sends already admitted to the throttle continue to drain, but no new sends are queued. The data plane is unaffected. |
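
The throttle counters listed under Storage are described in the design decisions as a Redis-backed token bucket with per-MNO ceilings. The following is a minimal in-memory sketch of that technique, not the service's implementation. `TokenBucket`, `tryTake`, `bucketsFor`, and the choice of one second of burst capacity are assumptions made for illustration; the AWCC and Roshan rates are the example ceilings quoted in this document.

```typescript
// Minimal in-memory token bucket (hypothetical names). A shared deployment
// would keep this state in Redis so all worker replicas draw from one budget.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly ratePerSec: number;
  private readonly capacity: number;

  constructor(ratePerSec: number, capacity: number, nowMs: number) {
    this.ratePerSec = ratePerSec;
    this.capacity = capacity;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Refill according to elapsed time, then take `count` tokens if available.
  tryTake(count: number, nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;
    if (count > this.tokens) return false;
    this.tokens -= count;
    return true;
  }
}

// Example ceilings quoted in this document (messages per second per MNO).
const MNO_CEILINGS_TPS: Record<string, number> = { AWCC: 100, Roshan: 80 };

function bucketsFor(nowMs: number): Map<string, TokenBucket> {
  const buckets = new Map<string, TokenBucket>();
  for (const [mno, tps] of Object.entries(MNO_CEILINGS_TPS)) {
    // ASSUMPTION: burst capacity equals one second of traffic.
    buckets.set(mno, new TokenBucket(tps, tps, nowMs));
  }
  return buckets;
}
```

Passing the clock in as `nowMs` keeps the bucket deterministic under test. With Redis as the backing store, the refill-and-take step would need to run atomically, for example inside a server-side script.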

4. Responsibilities

| # | Responsibility |
|---|---|
| R1 | Provide a segment query DSL (JSON-DSL compiled to SQL against tenant recipient tables) to define audience for a campaign |
| R2 | Manage a template catalog with versioning, merge-field syntax (Mustache subset + ICU MessageFormat for plurals), conditional blocks, multi-language variants |
| R3 | Operate the approved-template workflow (submit → compliance review → approve / reject) and integrate EP-CE-13 trusted-tenant fast-path |
| R4 | Schedule campaigns (one-shot, recurring cron-like, time-zone aware) and dispatch into sms-orchestrator / channel-router-service honouring tenant throttle caps and per-MNO ceilings |
| R5 | Run A/B variant assignment with consistent hashing on recipient identifier so the same recipient always gets the same variant within a campaign |
| R6 | Provide a kill-switch that halts in-flight dispatch within ≤ 5 s end-to-end (P95), measured from operator click to last queued message dropped |
| R7 | Consult consent-ledger-service to drop opted-out recipients before send; record drop reason for audit |
| R8 | Emit campaign.* lifecycle events (submitted, approved, started, paused, killed, completed) and per-recipient send/skip events for reporting |
| R9 | Surface campaign reporting (deliverability %, spend, opt-out delta, conversion via webhook callback) sourced from analytics-service and the consent ledger |
| R10 | Enforce per-tenant campaign quotas (max active campaigns, max messages/day, segment size cap) |
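
To make R1 concrete, here is a minimal sketch of compiling a JSON segment definition into parameterised SQL with a mandatory tenant predicate and a row cap. It is not the service's compiler: the AST shape (`SegmentNode`), the field allow-list, the `recipients` table, and its column names are all hypothetical, and the EXPLAIN validation step is omitted.

```typescript
// Hypothetical JSON-AST shape; the real DSL is defined in the companion documents.
type SegmentLeaf = {
  field: string;
  op: "eq" | "gt" | "lt" | "in";
  value: string | number | Array<string | number>;
};
type SegmentNode = SegmentLeaf | { and: SegmentNode[] } | { or: SegmentNode[] };

interface CompiledSegment { sql: string; params: unknown[] }

// ASSUMPTION: only allow-listed columns may appear in a segment.
const ALLOWED_FIELDS = new Set(["province", "language", "age", "last_seen_days"]);
const SQL_OPS: Record<SegmentLeaf["op"], string> = { eq: "=", gt: ">", lt: "<", in: "IN" };

function compileSegment(tenantId: string, ast: SegmentNode, rowCap: number): CompiledSegment {
  const params: unknown[] = [tenantId]; // $1 is always the tenant id

  // Values never enter the SQL text; they become numbered placeholders.
  function bind(value: unknown): string {
    params.push(value);
    return `$${params.length}`;
  }

  function group(children: SegmentNode[], joiner: string): string {
    if (children.length === 0) throw new Error("empty group");
    return `(${children.map(walk).join(` ${joiner} `)})`;
  }

  function walk(node: SegmentNode): string {
    if ("and" in node) return group(node.and, "AND");
    if ("or" in node) return group(node.or, "OR");
    if (!ALLOWED_FIELDS.has(node.field)) throw new Error(`unknown field: ${node.field}`);
    if (node.op === "in") {
      if (!Array.isArray(node.value) || node.value.length === 0) {
        throw new Error("'in' requires a non-empty array");
      }
      return `${node.field} IN (${node.value.map((v) => bind(v)).join(", ")})`;
    }
    if (Array.isArray(node.value)) throw new Error(`'${node.op}' requires a scalar value`);
    return `${node.field} ${SQL_OPS[node.op]} ${bind(node.value)}`;
  }

  const where = walk(ast);
  const cap = bind(rowCap);
  const sql = `SELECT recipient_id FROM recipients WHERE tenant_id = $1 AND ${where} LIMIT ${cap}`;
  return { sql, params };
}
```

Because the tenant predicate and the `LIMIT` are appended by the compiler rather than by the segment author, a segment definition cannot omit either of them.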

5. Non-Responsibilities

  • Does not dispatch SMS directly — submits to sms-orchestrator or channel-router-service.
  • Does not evaluate per-message compliance — compliance-engine does that on every individual outbound message; the campaign service consults the engine only for template approval.
  • Does not own consent records — consent-ledger-service is authoritative; campaign service is a read-mostly consumer.
  • Does not own billing — billing-service meters spend; campaign service surfaces spend read-only via analytics-service.
  • Does not own recipient profile data beyond metadata needed for segmentation (it queries the tenant's recipient store via a defined contract).

6. Upstream / Downstream Dependencies

| Direction | Service | Protocol | Purpose |
|---|---|---|---|
| Inbound user | customer-portal UI | HTTPS | Builder, catalog, reporting |
| Inbound machine | Tenant API | HTTPS (OAuth/API key) | Programmatic campaign create / start / kill |
| Outbound | sms-orchestrator | HTTPS POST /v1/sms/bulk | Batch SMS submit |
| Outbound | channel-router-service | gRPC | Multi-channel fanout (SMS / WA / Voice) |
| Outbound | compliance-engine | gRPC | Template approval workflow + trusted-tenant fast-path |
| Outbound | consent-ledger-service | gRPC BatchCheckConsent(recipients[]) | Drop opted-out + DND recipients pre-send |
| Outbound | analytics-service | gRPC | Campaign reporting reads |
| Outbound | billing-service | gRPC | Pre-flight spend estimate; budget cap check |
| Outbound events | NATS JetStream | TCP | campaign.* lifecycle + per-recipient events |
| Inbound events | NATS JetStream consent.events.opt_out.v1 | TCP | Live opt-out propagation; mid-flight drop |
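
The pre-send call to BatchCheckConsent(recipients[]) is made in chunks of 1000 (see Key Design Decisions). A minimal sketch of that chunked filter follows. The `ConsentVerdict` shape and the `filterByConsent` and `chunk` helpers are assumptions for illustration; the real request and response types belong to consent-ledger-service's gRPC contract, which is stubbed here as a plain async function.

```typescript
// Hypothetical verdict shape; the real type comes from the gRPC contract.
type ConsentVerdict = { recipient: string; allowed: boolean; reason?: string };
type BatchCheckConsent = (recipients: string[]) => Promise<ConsentVerdict[]>;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Check consent chunk by chunk; keep allowed recipients, record why others were dropped.
async function filterByConsent(
  recipients: readonly string[],
  check: BatchCheckConsent,
  chunkSize = 1000,
): Promise<{ sendable: string[]; dropped: { recipient: string; reason: string }[] }> {
  const sendable: string[] = [];
  const dropped: { recipient: string; reason: string }[] = [];
  for (const part of chunk(recipients, chunkSize)) {
    // A rejected promise propagates to the caller, which can then pause the campaign.
    const verdicts = await check(part);
    for (const v of verdicts) {
      if (v.allowed) sendable.push(v.recipient);
      else dropped.push({ recipient: v.recipient, reason: v.reason ?? "unspecified" });
    }
  }
  return { sendable, dropped };
}
```

Letting a failed check propagate, rather than treating it as "allowed", matches the fail-safe-stop behaviour described in the Bounded Context section.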

7. High-Level Flow — Campaign Submit → Run → Complete


8. Key Design Decisions

| Decision | Rationale |
|---|---|
| Segment DSL is JSON-AST, not raw SQL | Auditable, sandboxable, compiled to parameterised SQL server-side; eliminates SQL injection class |
| Segment compiler emits EXPLAIN-validated SQL with mandatory tenant-id predicate and a hard row-cap | Prevents accidental cross-tenant scans; protects RDS from runaway queries |
| Template merge syntax = Mustache subset + ICU MessageFormat plurals (e.g. {count, plural, one {1 message} other {# messages}}) | Mustache is familiar; ICU handles linguistic plurals correctly across en/ps/fa/ar |
| A/B assignment via consistent hashing of recipientId (sha256(campaignId:recipientId) mod 100) | Same recipient always lands in the same variant for the campaign; deterministic without a join table |
| Kill-switch latency budget = 5 s end-to-end (P95) | Measured from kill click to last orchestrator submit; achieved by Redis-backed kill flag checked per-batch and worker poll loop ≤ 250 ms |
| Per-MNO throttle ceilings enforced (e.g. AWCC 100 TPS, Roshan 80 TPS) via token bucket with Redis backing | Protects MNO peers from tenant batch storms; respects interconnect SLAs |
| Approved-template workflow honours EP-CE-13 trusted-tenant fast-path | Established tenants don't get blocked at template approval gate; per-message compliance still runs |
| Pre-flight consent batch check in chunks of 1000 | Reduces opt-out leakage by checking immediately before send rather than at submission time |
| Live opt-out drop via consumed consent.events.opt_out.v1 events while a campaign runs | Mid-flight opt-outs are honoured within seconds, not at the next campaign |
| Reporting reads come from analytics-service (ClickHouse) | Keeps PG schema small and lets reporting aggregate across multiple sources (DLR, opt-out, conversion) |
| Conversion tracked via signed callback URL or 1×1 pixel (tenant-supplied) | Doesn't require integrating a tracking pipeline inside the campaign service |
| Campaign cannot enter RUNNING without an approved template and a successful batch consent check | Prevents both regulator violations and pointless dispatch cost |
| Campaigns dispatching to > 1000 recipients require explicit operator confirmation in the UI ("type the campaign name to confirm") | Reduces ops error blast-radius |
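
The A/B rule above, sha256(campaignId:recipientId) mod 100, can be sketched as follows. This is deterministic hash bucketing (which the document calls consistent hashing). The `Variant` shape and function names are hypothetical, and reading the whole digest as one big-endian integer before taking the modulus is an assumption, since the document does not say how the digest is reduced.

```typescript
import { createHash } from "node:crypto";

// Hypothetical variant config: allocations are whole percentages summing to 100.
interface Variant { name: string; allocationPct: number }

// Map (campaign, recipient) to a stable bucket in [0, 100).
function bucketFor(campaignId: string, recipientId: string): number {
  const hex = createHash("sha256").update(`${campaignId}:${recipientId}`).digest("hex");
  // ASSUMPTION: the full 256-bit digest is read as one unsigned integer.
  return Number(BigInt(`0x${hex}`) % BigInt(100));
}

// Walk the cumulative allocation ranges and return the variant owning the bucket.
function assignVariant(
  campaignId: string,
  recipientId: string,
  variants: readonly Variant[],
): string {
  const total = variants.reduce((sum, v) => sum + v.allocationPct, 0);
  if (total !== 100) throw new Error(`allocations must sum to 100, got ${total}`);
  const bucket = bucketFor(campaignId, recipientId);
  let upper = 0;
  for (const v of variants) {
    upper += v.allocationPct;
    if (bucket < upper) return v.name;
  }
  throw new Error("unreachable: bucket is always below 100");
}
```

Because the campaign id is part of the hash input, a recipient's bucket in one campaign says nothing about their bucket in another, while repeated calls within a campaign always agree.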

9. Runtime Topology

| Component | Stack | Replicas (prod) | Notes |
|---|---|---|---|
| Campaign API (NestJS) | Node 22 / NestJS 11 | 3 | CRUD, builder, reporting |
| Campaign worker | Node 22 | 4 | Scheduler, throttler, dispatch loop, kill-switch poll |
| Approval worker | Node 22 | 2 | Template lifecycle, fast-path resolver |
| PostgreSQL | Postgres 16 | 1 primary + 2 replicas | campaign schema |
| Redis | Redis 7 cluster | 3 nodes | Kill-flag, throttle tokens, dedupe |
| ClickHouse | via analytics-service | n/a | Reporting reads |
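
The campaign worker's dispatch loop checks the kill flag before every batch. A minimal sketch of that per-batch check, assuming a hypothetical `KillFlagStore` interface, is shown here. The in-memory store stands in for Redis, and `dispatch`, `SubmitBatch`, and the result shape are illustrative names rather than the service's API.

```typescript
// Hypothetical abstraction over the Redis-backed kill flag.
interface KillFlagStore {
  isKilled(campaignId: string): Promise<boolean>;
}

// In-memory stand-in for Redis, useful for tests and local runs.
class InMemoryKillFlags implements KillFlagStore {
  private readonly killed = new Set<string>();

  kill(campaignId: string): void {
    this.killed.add(campaignId);
  }

  async isKilled(campaignId: string): Promise<boolean> {
    return this.killed.has(campaignId);
  }
}

type SubmitBatch = (batch: string[]) => Promise<void>;

// Submit batches in order, re-checking the kill flag before each one.
async function dispatch(
  campaignId: string,
  batches: readonly string[][],
  flags: KillFlagStore,
  submit: SubmitBatch,
): Promise<{ submitted: number; droppedBatches: number }> {
  let submitted = 0;
  for (const batch of batches) {
    if (await flags.isKilled(campaignId)) {
      // Everything not yet submitted is dropped, never sent.
      return { submitted, droppedBatches: batches.length - submitted };
    }
    await submit(batch);
    submitted++;
  }
  return { submitted, droppedBatches: 0 };
}
```

With this structure the worst-case stop latency is bounded by the time to finish one in-progress batch submit plus the flag lookup, which is why the batch size and poll interval matter for the 5 s budget.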

10. Aggregates Owned

  • Campaign — lifecycle root: state machine (DRAFT → SCHEDULED → RUNNING → PAUSED|KILLED|COMPLETED), schedule, throttle, segment ref, template ref, A/B config
  • CampaignSegment — JSON-DSL definition, compiled SQL fingerprint, last preview row count
  • CampaignTemplate — versioned content per language, merge fields, approval state, links to compliance-engine template entity
  • CampaignBatch — per-1000-message dispatch unit; tracks accepted/rejected/dropped counts
  • CampaignVariant — A/B variant config (allocation %, content delta, success metric)
  • TemplateApproval — submission → review → approve/reject lifecycle
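
The Campaign lifecycle can be expressed as a transition table. The sketch below encodes the states listed for the Campaign aggregate, together with the rule from Key Design Decisions that RUNNING requires an approved template and a successful consent check. The transitions out of PAUSED (resume to RUNNING, or KILLED) are an assumption, as the document does not spell them out; `RunGuards` and `transition` are hypothetical names.

```typescript
type CampaignState = "DRAFT" | "SCHEDULED" | "RUNNING" | "PAUSED" | "KILLED" | "COMPLETED";

// Allowed next states for each state.
const TRANSITIONS: Record<CampaignState, readonly CampaignState[]> = {
  DRAFT: ["SCHEDULED"],
  SCHEDULED: ["RUNNING"],
  RUNNING: ["PAUSED", "KILLED", "COMPLETED"],
  PAUSED: ["RUNNING", "KILLED"], // ASSUMPTION: a paused campaign can resume or be killed
  KILLED: [], // terminal
  COMPLETED: [], // terminal
};

// Preconditions the document names for entering RUNNING.
interface RunGuards { templateApproved: boolean; consentCheckPassed: boolean }

// Return the new state, or throw if the move is illegal or a guard fails.
function transition(from: CampaignState, to: CampaignState, guards: RunGuards): CampaignState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  if (to === "RUNNING" && !(guards.templateApproved && guards.consentCheckPassed)) {
    throw new Error("RUNNING requires an approved template and a successful consent check");
  }
  return to;
}
```

Keeping the table as data makes it straightforward to write one audit-log entry per accepted transition, as the Standards & Compliance section requires.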

11. Standards & Compliance

  • GDPR / Afghan Data Protection — opt-out propagation < 24 h (target seconds via live event consumption)
  • TCPA-like principles — no marketing to opted-out recipients, ever
  • Tenant data isolation — RLS on every table by tenantId
  • Audit log of every state transition, kill click, and template approval — append-only, partitioned monthly

12. Cross-Service Contracts (summary)

  • Submits batches to sms-orchestrator via POST /v1/sms/bulk
  • Calls compliance-engine.SubmitTemplate / ApproveTemplate / IsTrustedTenant
  • Calls consent-ledger-service.BatchCheckConsent
  • Emits campaign.created/submitted/started/paused/killed/completed/batch_dispatched
  • Consumes consent.events.opt_out.v1 for live mid-flight drops
  • Consumes compliance.template.approved/rejected events

13. Out-of-Scope (v1.0)

  • AI-generated campaign content (deferred — see AI_INTEGRATION.md §2)
  • Multi-step drip / nurture sequences (v1.1)
  • Inbound MO conversational replies tied to campaign attribution (v1.1; will use channel-router-service.session-manager)

14. Glossary

| Term | Definition |
|---|---|
| Segment | Set of recipients defined by a JSON-DSL query against tenant recipient profiles |
| Throttle | Rate cap on messages submitted per second / per minute, optionally per MNO |
| A/B variant | Alternative content tested for performance; assignment is consistent per recipient |
| Kill-switch | Operator-triggered halt of in-flight dispatch within ≤ 5 s |
| Trusted-tenant fast-path | EP-CE-13 mechanism that bypasses per-message template re-review for established senders |

15. Companion documents