
TESTING_STRATEGY — notification-service

Sibling: APPLICATION_LOGIC · API_CONTRACTS · EVENT_SCHEMAS · SECURITY_MODEL · LOCAL_DEV_SETUP

Strategic anchors: 02 Enterprise Architecture §13 Testing · standards/SERVICE_TEMPLATE

We follow the platform's test-pyramid plus contract testing on both sides of every wire. Coverage gates and CI gates are enforced per the platform CI configuration; a PR cannot merge if any gate is red.


1. Pyramid

| Level | Approx. count | Runtime | What it covers |
| --- | --- | --- | --- |
| Unit (domain) | 600+ tests | <30 s | aggregates, value objects, state machines, invariants, domain services |
| Unit (application) | 300+ tests | <60 s | use cases with in-memory ports |
| Contract (HTTP) | one per route + per error code | <30 s | OpenAPI ↔ implementation |
| Contract (events) | one per published + per consumed subject | <30 s | JSON Schema ↔ payload |
| Contract (webhooks) | one per vendor with positive + negative samples | <30 s | HMAC + parser |
| Integration (slice) | 80+ tests | <3 min | one use case end-to-end against real Postgres + Redis (Testcontainers) |
| Integration (event flow) | 30+ tests | <5 min | consume → enqueue → render → outbox → publish, with NATS/Pub/Sub emulator |
| End-to-end (slice) | 10–20 tests | <10 min | through bff-backoffice-service with stub vendors |
| Performance | nightly | <30 min | enqueue throughput, dispatch concurrency, webhook ingestion |
| Chaos | weekly | varies | vendor outage, DB failover, Pub/Sub backpressure |
| Security | per-PR + nightly | varies | static + dynamic + dependency scans |

Coverage targets: lines ≥ 85 %, branches ≥ 80 %; domain layer ≥ 95 %; use cases ≥ 90 %.


2. Frameworks and conventions

  • Vitest for unit/application/contract tests (fast, ESM, native TS).
  • Testcontainers (Node) for Postgres 16, Memorystore-compatible Redis (redis:7-alpine), Pub/Sub emulator, GCS emulator (fsouza/fake-gcs-server).
  • MSW for HTTP vendor stubs (SendGrid, Twilio, etc.) at the unit/integration boundary.
  • Pact (or our internal event-pact tool) for consumer/provider event contract tests with the platform broker.
  • k6 for performance and load.
  • gremlin / litmus for chaos.
  • Playwright for the BFF-driven E2E slices (run in the melmastoon-e2e suite).

Tests live under tests/ mirroring src/:

tests/
  unit/
    domain/
    application/
  contracts/
    http/
    events/published/
    events/consumed/
    webhooks/
  integration/
    use-cases/
    flows/
  e2e/
  performance/
  chaos/
  fixtures/
  helpers/

3. Domain unit tests (highlights)

  • Every state machine: every transition asserted positively and negatively (IllegalStateTransitionError).
  • Every invariant: at least one positive (passes) and one negative (throws) test.
  • PreferenceGate.evaluate(...): matrix of (channel × category × consent × suppression × quietHours × locale-fallback) with golden expectations.
  • TemplateRenderer: golden file tests per renderer profile (mjml-handlebars-1, text-handlebars-1, whatsapp-handlebars-1, inapp-handlebars-1); RTL snapshot for Arabic and Pashto; bidi-marker presence assertions; XSS-injection inputs are sanitised.
  • RateLimiter: token-bucket math; clock injection; window rollovers.
  • LocalisedFormatter: dates/numbers/currency per locale; DST and Hijri-compatible rendering for Pashto/Persian/Urdu.
  • Sender validators: PK-PTA registration check; DKIM-required-on-EnqueueNotification check.
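
As an illustration of the RateLimiter bullet above, the following is a minimal sketch of token-bucket math with an injected clock. The `TokenBucket` class, its constructor parameters, and the `Clock` type are hypothetical names chosen for this example; the service's actual RateLimiter may be shaped differently.

```typescript
// Hypothetical clock port: returns the current time in milliseconds.
type Clock = () => number;

// Illustrative token bucket; not the service's real RateLimiter.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: Clock;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, now: Clock) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // the bucket starts full
    this.lastRefillMs = now();
  }

  /** Takes one token if available; returns false when the bucket is empty. */
  tryConsume(): boolean {
    this.refill();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  /** Adds tokens for the time elapsed since the last call, capped at capacity. */
  private refill(): void {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }
}

// A fake clock that the test advances by hand instead of sleeping.
let fakeNowMs = 0;
const bucket = new TokenBucket(2, 1, () => fakeNowMs);

// Capacity 2: the third call in the same instant is refused.
const burst = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];

// One second later exactly one token has been refilled.
fakeNowMs += 1000;
const afterOneSecond = bucket.tryConsume();
const immediatelyAgain = bucket.tryConsume();
```

Injecting the clock keeps the test deterministic and fast: window rollovers are exercised by advancing `fakeNowMs` rather than waiting on real time.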

4. Application unit tests

  • Each use case has a *.spec.ts file with in-memory implementations of all ports defined in APPLICATION_LOGIC §1.
  • Idempotency: replay same request → same response; replay with different body → 409.
  • Outbox semantics: every domain event added to outbox in the same transaction as the aggregate change; assert via in-memory transactional repo.
  • ApplyConsumedEventUseCase matrix: for each consumed subject, given a fixture event, the resulting set of EnqueueNotificationUseCase invocations matches the trigger map.
  • HITL gate: PublishTemplateVersionUseCase rejects without approverUserId when source='ai_drafted'.
  • AI fallback: when AIClient.fetchAIDraftedContent throws, the deterministic render is used and a metric is incremented (assert via fake recorder).
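
The idempotency rule above (same request replays to the same response, a different body under the same key yields 409) can be sketched with an in-memory store. Everything named here is an assumption for illustration: `InMemoryIdempotencyStore`, `enqueueWithIdempotency`, the request fields, the `ntf_` id format, and the 202 success status are not taken from APPLICATION_LOGIC.

```typescript
// Hypothetical request and result shapes for the sketch.
type EnqueueRequest = { recipientId: string; templateKey: string };
type HttpResult = { status: number; notificationId?: string };
type StoredEntry = { fingerprint: string; result: HttpResult };

// In-memory stand-in for the idempotency port.
class InMemoryIdempotencyStore {
  private readonly entries = new Map<string, StoredEntry>();

  get(key: string): StoredEntry | undefined {
    return this.entries.get(key);
  }

  put(key: string, entry: StoredEntry): void {
    this.entries.set(key, entry);
  }
}

let nextId = 1;

function enqueueWithIdempotency(
  store: InMemoryIdempotencyStore,
  idempotencyKey: string,
  request: EnqueueRequest,
): HttpResult {
  // Fingerprint built from a fixed field order so it does not depend on object key order.
  const fingerprint = JSON.stringify([request.recipientId, request.templateKey]);
  const existing = store.get(idempotencyKey);
  if (existing) {
    // Same key and same body: replay the stored response. Same key, different body: conflict.
    return existing.fingerprint === fingerprint ? existing.result : { status: 409 };
  }
  // Assumed success status for an accepted enqueue.
  const result: HttpResult = { status: 202, notificationId: `ntf_${nextId++}` };
  store.put(idempotencyKey, { fingerprint, result });
  return result;
}

const store = new InMemoryIdempotencyStore();
const first = enqueueWithIdempotency(store, "key-1", { recipientId: "rcp_1", templateKey: "welcome" });
const replay = enqueueWithIdempotency(store, "key-1", { recipientId: "rcp_1", templateKey: "welcome" });
const conflict = enqueueWithIdempotency(store, "key-1", { recipientId: "rcp_1", templateKey: "other" });
```

A spec built on this would assert that `replay` equals `first`, that `conflict.status` is 409, and that no second notification id was allocated.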

5. Contract tests

5.1 HTTP / OpenAPI

openapi.v1.yaml is the source of truth. The contract test:

  1. Boots NestJS in test mode with stubbed deps.
  2. Iterates every operation in the spec.
  3. Sends representative requests (positive + each documented error code).
  4. Asserts response shape against the spec via @apidevtools/swagger-cli + ajv.

A separate test (openapi-codegen-drift.test.ts) regenerates types from the spec and asserts identity with the committed types — drift fails CI.

5.2 Published events

Every subject in EVENT_SCHEMAS §3 has a JSON Schema in event-schemas/.... The contract test:

  1. Loads our published-event factory.
  2. Generates 100 representative payloads (boundary + property-based via fast-check).
  3. Validates each against the JSON Schema.
  4. Sends to a local broker; downstream consumer test fixtures in tests/contracts/events/published/<subject>.consumer.fixture.ts express the expected interpretation; an internal event-pact runner asserts.

5.3 Consumed events

For each subject in EVENT_SCHEMAS §5:

  1. Load the upstream service's published JSON Schema (vendored in event-schemas/).
  2. Parse with our tolerant zod schema.
  3. Assert action set (e.g., for reservation.confirmed.v1 → expected list of EnqueueNotificationUseCase calls).

5.4 Vendor webhook contracts

Per vendor in tests/contracts/webhooks/<vendor>/:

  • valid-signature.test.ts: real-world body/headers fixtures pass HMAC.
  • invalid-signature.test.ts: tampered/missing header → 401.
  • replay.test.ts: same body twice → second produces no additional state change.
  • events-parsing.test.ts: vendor's enum values map to our internal types.
  • headers-skew.test.ts: > 5 min skew → reject.

Vendor sample fixtures are anonymised real captures (license-clean) committed under tests/fixtures/webhooks/<vendor>/.
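
For orientation, the checks in this section could be exercised against a verifier shaped like the sketch below. It assumes an HMAC-SHA256 signature over `<timestamp>.<body>` encoded as hex, which is a made-up scheme for illustration; each real vendor defines its own signing format, header names, and algorithm. `verifyWebhook`, `sign`, and the `Verdict` values are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_SKEW_MS = 5 * 60 * 1000; // matches the "> 5 min skew → reject" rule

type Verdict = "accepted" | "duplicate" | "rejected_signature" | "rejected_skew";

type WebhookRequest = {
  secret: string;
  body: string;
  timestampMs: number;
  signature?: string; // hex string taken from the vendor's signature header, if present
  nowMs: number;
};

// Assumed signing scheme: HMAC-SHA256 over "<timestamp>.<body>", hex encoded.
function sign(secret: string, timestampMs: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestampMs}.${body}`).digest("hex");
}

function verifyWebhook(req: WebhookRequest, seen: Set<string>): Verdict {
  if (!req.signature) return "rejected_signature";
  const expected = Buffer.from(sign(req.secret, req.timestampMs, req.body), "hex");
  const received = Buffer.from(req.signature, "hex");
  // timingSafeEqual throws on length mismatch, so lengths are compared first.
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
    return "rejected_signature";
  }
  if (Math.abs(req.nowMs - req.timestampMs) > MAX_SKEW_MS) return "rejected_skew";
  // Replay protection: an already-seen signature causes no additional state change.
  if (seen.has(req.signature)) return "duplicate";
  seen.add(req.signature);
  return "accepted";
}

const secret = "test-secret";
const now = 1_700_000_000_000;
const body = '{"event":"delivered"}';
const seen = new Set<string>();
const good: WebhookRequest = { secret, body, timestampMs: now, signature: sign(secret, now, body), nowMs: now };

const valid = verifyWebhook(good, seen);
const replayed = verifyWebhook(good, seen);
const tampered = verifyWebhook({ ...good, body: '{"event":"bounced"}' }, seen);
const missing = verifyWebhook({ ...good, signature: undefined }, seen);
const oldTimestamp = now - 6 * 60 * 1000;
const skewed = verifyWebhook(
  { ...good, timestampMs: oldTimestamp, signature: sign(secret, oldTimestamp, body) },
  seen,
);
```

In the HTTP layer, `rejected_signature` would map to the 401 described above; how the other verdicts map to status codes is left to the handler.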


6. Integration tests (Testcontainers)

tests/integration/ boots real Postgres/Redis/Pub/Sub-emulator/GCS-emulator per file. Two families:

6.1 Use-case slices

Each use case has at least one slice that:

  1. Migrates a fresh DB schema.
  2. Sets app.tenant_id to a fixture tenant.
  3. Inserts seed projections (tenants_local, recipients, templates).
  4. Executes the use case via the application layer.
  5. Asserts DB state, outbox rows, Pub/Sub-emulator messages, and Redis side-effects.
  6. Re-runs to assert idempotency.

6.2 Flow tests

End-to-end domain flows:

  • Booking-confirmation flow: publish reservation.confirmed.v1 → consume → enqueue email+sms+whatsapp per preference → render → dispatch through MockEmailPort etc. → simulate vendor-webhook callback → assert delivered.v1 and DB state.
  • Mobile-key delivery flow: lock_integration.key_credential.issued.v1 → mobile_key.issued.whatsapp with token reference → simulated WhatsApp accepted → delivered.v1.
  • Dunning flow: billing.subscription.payment_failed.v1 → 3 scheduled rows → tick scheduler → 3 dispatched.
  • AI-drafted template publish flow: emit ai.draft_content.ready.v1 → register draft → publish without approver → 403 → publish with approver → template.published.v1.
  • Suggest-only flow: tenant policy suggest_only → enqueue creates scheduled notification + admin in-app review → admin approves → dispatched.
  • Webhook bounce → suppression flow: mock SendGrid bounce webhook → dedupe → suppression row + bounced.v1 + suppressed.v1.
  • Quiet-hours deferral flow: enqueue at 23:30 Asia/Kabul with quiet-hours 22–07 → scheduled for 07:00 → scheduler tick at 07:00 → dispatched.
  • Cancellation propagation flow: scheduled pre-arrival reminder + reservation.cancelled.v1 → row marked obsolete → no send.
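
The arithmetic behind the quiet-hours deferral flow can be made concrete with a small sketch. `nextAllowedSend` and `localMinutesOfDay` are hypothetical helpers, not the service's scheduler, and the sketch assumes whole-hour quiet windows and no DST transition inside the deferral window (Asia/Kabul has none).

```typescript
// Reads the wall-clock minutes since local midnight for an instant in a given IANA time zone.
function localMinutesOfDay(instant: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).formatToParts(instant);
  const read = (type: string): number => Number(parts.find((p) => p.type === type)?.value ?? "0");
  return read("hour") * 60 + read("minute");
}

// Returns the instant itself when outside quiet hours, otherwise the end of the quiet window.
function nextAllowedSend(instant: Date, timeZone: string, quietStartHour: number, quietEndHour: number): Date {
  const start = quietStartHour * 60;
  const end = quietEndHour * 60;
  const now = localMinutesOfDay(instant, timeZone);
  const wrapsMidnight = start > end; // e.g. 22–07
  const inQuiet = wrapsMidnight ? now >= start || now < end : now >= start && now < end;
  if (!inQuiet) return instant;
  // Before the window end on the same local day, or after the start on the previous local day.
  const minutesUntilEnd = now < end ? end - now : 24 * 60 - now + end;
  const minuteStartMs = instant.getTime() - (instant.getTime() % 60_000);
  return new Date(minuteStartMs + minutesUntilEnd * 60_000);
}

// 23:30 in Asia/Kabul (UTC+04:30) is 19:00 UTC; 07:00 the next morning is 02:30 UTC.
const deferred = nextAllowedSend(new Date("2024-01-10T19:00:00Z"), "Asia/Kabul", 22, 7).toISOString();
// 12:00 in Asia/Kabul is outside quiet hours, so the send time is unchanged.
const daytime = nextAllowedSend(new Date("2024-01-10T07:30:00Z"), "Asia/Kabul", 22, 7).toISOString();
```

The flow test then only needs to tick the scheduler at the returned instant and assert the dispatch.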

6.3 RLS tests

A dedicated suite asserts:

  • Cross-tenant SELECT returns zero rows even with crafted SQL.
  • The app role cannot BYPASSRLS (assert by attempting SET ROLE).
  • Drizzle middleware sets app.tenant_id per transaction; without it, queries return zero rows.
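
One way to picture the per-transaction tenant setting is the sketch below. It does not use Drizzle's API: `SqlTx`, `Db`, and `withTenant` are hypothetical ports invented for this example, and the recording fake stands in for a real Postgres connection. The only Postgres-specific piece is `set_config(name, value, is_local)`, whose third argument `true` limits the setting to the current transaction.

```typescript
// Hypothetical minimal ports; a real suite would adapt its Postgres driver or ORM to these.
interface SqlTx {
  query(sql: string, params: unknown[]): Promise<unknown[]>;
}

interface Db {
  transaction<T>(work: (tx: SqlTx) => Promise<T>): Promise<T>;
}

// Runs `work` inside a transaction whose app.tenant_id is set before any other statement.
async function withTenant<T>(db: Db, tenantId: string, work: (tx: SqlTx) => Promise<T>): Promise<T> {
  return db.transaction(async (tx) => {
    // is_local = true: the value is discarded when the transaction ends.
    await tx.query("SELECT set_config('app.tenant_id', $1, true)", [tenantId]);
    return work(tx);
  });
}

// Recording fake used to assert statement order without a database.
const statements: string[] = [];
const fakeDb: Db = {
  async transaction<T>(work: (tx: SqlTx) => Promise<T>): Promise<T> {
    statements.push("BEGIN");
    const tx: SqlTx = {
      async query(sql: string): Promise<unknown[]> {
        statements.push(sql);
        return [];
      },
    };
    const result = await work(tx);
    statements.push("COMMIT");
    return result;
  },
};
```

Against a real Postgres container, the RLS suite would additionally run a query without `withTenant` and assert that it returns zero rows.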

7. End-to-end through BFF

E2E slices live in the melmastoon-e2e suite (separate repo path):

  • Staff logs into backoffice → opens reservation rsv_* → clicks "Resend confirmation" → expects a new notification to appear in the audit panel within 5 s, status reaching delivered (vendor stub returns delivered immediately).
  • Staff publishes a tenant template override → preview rendered → test-send delivered to staff's own email → audit row.
  • Guest receives an opt-out URL in the synthetic email → opens link → confirms → returns to app → marketing toggle is off; future marketing send is suppressed.
  • Marketing manager creates a marketing batch with 200-row segment → schedule for T+5 min → wait → batch completes; per-row delivery audit visible.

8. Performance tests (k6)

| Scenario | Target |
| --- | --- |
| Enqueue burst | 2 000 req/s for 5 min, single tenant; p95 ≤ 350 ms; zero 5xx |
| Enqueue sustained | 500 req/s for 30 min, 50 tenants; CPU ≤ 60 %; p95 ≤ 250 ms |
| Dispatch throughput | 5 000 msgs/min/channel against vendor stub; queue drains within 60 s |
| Webhook ingestion | 3 000 req/s/vendor for 5 min; p95 ≤ 120 ms; zero loss |
| WS feed | 10 000 concurrent connections per region; 50 events/s push; p95 push latency ≤ 150 ms |

Performance gates run nightly on staging with a synthetic dataset; regressions > 15 % fail the build the next morning.
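
The 15 % rule can be read as a relative comparison against a stored baseline. The sketch below assumes lower-is-better metrics such as p95 latency and assumes the baseline comes from an earlier nightly run; `findRegressions`, the metric names, and the sample values are illustrative only.

```typescript
type MetricSample = { name: string; baseline: number; current: number };

const REGRESSION_THRESHOLD = 0.15; // "regressions > 15 %"

// Returns the names of lower-is-better metrics that worsened by more than the threshold.
function findRegressions(samples: MetricSample[]): string[] {
  return samples
    .filter((s) => s.baseline > 0 && (s.current - s.baseline) / s.baseline > REGRESSION_THRESHOLD)
    .map((s) => s.name);
}

const regressions = findRegressions([
  { name: "enqueue_burst_p95_ms", baseline: 300, current: 350 }, // about +16.7 %, flagged
  { name: "webhook_ingestion_p95_ms", baseline: 100, current: 110 }, // +10 %, within tolerance
]);
```

For higher-is-better metrics such as dispatch throughput the comparison would need to be inverted.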


9. Chaos tests (weekly)

| Experiment | Expected behaviour |
| --- | --- |
| Kill primary Cloud SQL → forced failover | enqueue requests retry through Cloud SQL connector reconnect; backlog drains within 5 min |
| Pub/Sub topic 503 for 60 s | outbox relay backs off; dispatch worker keeps running; on recovery, no duplicate publishes (event id stable) |
| Vendor (SendGrid) returns 503 for 5 min | dispatch worker retries with backoff; channel health flips to degraded then down; fallback vendor takes over; alert fires |
| Vendor returns 4xx for invalid recipient at 50 % rate | suppressions added for those addresses; main funnel unaffected |
| Memorystore failover | brief enqueue latency spike; suppression check falls through to DB; no incorrect sends |
| Webhook flood (100k req/min) | Cloud Armor throttles; webhook_inbound write rate caps; no DB OOM |
| Clock skew on a worker (+10 min) | rate-limit windows still consistent (DB-side counters); affected pod self-detects + alerts |

Each chaos experiment has a runbook entry in FAILURE_MODES.


10. Security testing

  • SAST: eslint-plugin-security, semgrep rules for Node + secrets pattern.
  • Dependency scan: npm audit, osv-scanner, Snyk; CVEs ≥ high block release.
  • Container scan: Trivy on every image; CVEs ≥ high block release.
  • DAST: ZAP baseline against staging weekly; targeted scans on the public webhook + opt-out endpoints.
  • Secret scan: gitleaks pre-commit + nightly history scan.
  • Pentest: annual third-party (see SECURITY_MODEL §14).
  • AI-specific: prompt-injection corpus (~200 cases) run nightly against AIClient.fetchAIDraftedContent with the orchestrator stub configured to forward to a real model under a low-cost canary tenant.

11. Test data

  • Synthetic data factory @melmastoon/test-fixtures-notification produces deterministic recipients, templates, channels, and scheduled rows.
  • No production data is ever copied into pre-prod environments. A pre-prod refresh seeds from the synthetic factory.
  • Personal devices used for "real" channel testing belong to platform staff and are listed in an allowlist; sends to these are categorised system.

12. CI gates

| Gate | Required |
| --- | --- |
| Lint + format | |
| Unit + application | |
| Contract (HTTP, events, webhooks) | |
| Integration (use-case slices) | |
| Coverage thresholds (85 % lines, 80 % branches; domain ≥ 95 %) | |
| OpenAPI drift | |
| Event-schema drift | |
| Migrations dry-run on staging snapshot | |
| pii-grep scan | |
| SAST + secret scan + dependency scan | |
| Container scan | |
| E2E (booking-confirmation, opt-out, template publish) | ✅ on release/* and main |
| Performance gates | ✅ on release/* |

13. Production verification

Post-deploy smoke (run automatically by cloud-deploy after a rollout):

  1. GET /api/v1/internal/health returns 200 ok from a fresh pod.
  2. Send a synthetic notification (test tenant) end-to-end; expect delivered.v1 within 60 s.
  3. Send a synthetic webhook (test vendor + signed payload); expect applied.
  4. Outbox lag p95 ≤ 1 s for 5 min.

If any check fails, the rollout is automatically rolled back per DEPLOYMENT_TOPOLOGY §6.