TESTING_STRATEGY — notification-service

Sibling: APPLICATION_LOGIC · API_CONTRACTS · EVENT_SCHEMAS · SECURITY_MODEL · LOCAL_DEV_SETUP

Strategic anchors: 02 Enterprise Architecture §13 Testing · standards/SERVICE_TEMPLATE

We follow the platform's test pyramid, plus contract testing on both sides of every wire. Coverage gates and CI gates are enforced per the platform CI configuration; a PR cannot merge if any gate is red.

1. Pyramid

Level	Approx. count / cadence	Runtime	What it covers
Unit (domain)	600+ tests	<30 s	aggregates, value objects, state machines, invariants, domain services
Unit (application)	300+ tests	<60 s	use cases with in-memory ports
Contract (HTTP)	one per route + per error code	<30 s	OpenAPI ↔ implementation
Contract (events)	one per published + per consumed subject	<30 s	JSON Schema ↔ payload
Contract (webhooks)	one per vendor with positive + negative samples	<30 s	HMAC + parser
Integration (slice)	80+ tests	<3 min	one use case end-to-end against real Postgres + Redis (Testcontainers)
Integration (event flow)	30+ tests	<5 min	consume → enqueue → render → outbox → publish, with NATS/Pub/Sub emulator
End-to-end (slice)	10–20 tests	<10 min	through `bff-backoffice-service` with stub vendors
Performance	nightly	<30 min	enqueue throughput, dispatch concurrency, webhook ingestion
Chaos	weekly	varies	vendor outage, DB failover, Pub/Sub backpressure
Security	per-PR + nightly	varies	static + dynamic + dependency scans

Coverage targets: lines ≥ 85 %, branches ≥ 80 %; domain layer ≥ 95 %; use cases ≥ 90 %.

2. Frameworks and conventions

Vitest for unit/application/contract tests (fast, ESM, native TS).
Testcontainers (Node) for Postgres 16, Memorystore-compatible Redis (redis:7-alpine), Pub/Sub emulator, GCS emulator (fsouza/fake-gcs-server).
MSW for HTTP vendor stubs (SendGrid, Twilio, etc.) at the unit/integration boundary.
Pact (or our internal event-pact tool) for consumer/provider event contract tests with the platform broker.
k6 for performance and load.
gremlin / litmus for chaos.
Playwright for the BFF-driven E2E slices (run in the melmastoon-e2e suite).

Tests live under tests/ mirroring src/:

tests/
  unit/
    domain/
    application/
  contracts/
    http/
    events/published/
    events/consumed/
    webhooks/
  integration/
    use-cases/
    flows/
  e2e/
  performance/
  chaos/
  fixtures/
  helpers/

3. Domain unit tests (highlights)

Every state machine: every transition asserted positively and negatively (IllegalStateTransitionError).
Every invariant: at least one positive (passes) and one negative (throws) test.
PreferenceGate.evaluate(...): matrix of (channel × category × consent × suppression × quietHours × locale-fallback) with golden expectations.
TemplateRenderer: golden file tests per renderer profile (mjml-handlebars-1, text-handlebars-1, whatsapp-handlebars-1, inapp-handlebars-1); RTL snapshot for Arabic and Pashto; bidi-marker presence assertions; XSS-injection inputs are sanitised.
RateLimiter: token-bucket math; clock injection; window rollovers.
LocalisedFormatter: dates/numbers/currency per locale; DST and Hijri-compatible rendering for Pashto/Persian/Urdu.
Sender validators: PK-PTA registration check; DKIM-required-on-EnqueueNotification check.
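
As an illustration of the RateLimiter item above (token-bucket math with clock injection), here is a minimal sketch. The class name `TokenBucket`, its constructor parameters, and the `Clock` type are assumptions made for this example and are not taken from the service's code; the point is only that an injected clock lets a test drive refill and rollover deterministically.

```typescript
// Hypothetical sketch: a token bucket whose clock is injected so tests control time.
type Clock = () => number; // milliseconds since an arbitrary epoch

class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: Clock;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, now: Clock) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full
    this.lastRefillMs = now();
  }

  /** Try to consume one token; returns false when the bucket is empty. */
  tryAcquire(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  /** Add tokens for the elapsed time, never exceeding capacity. */
  private refill(): void {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
  }
}

// A fake clock that the test advances by hand.
let fakeNowMs = 0;
const bucket = new TokenBucket(2, 1, () => fakeNowMs);

const burst = [bucket.tryAcquire(), bucket.tryAcquire(), bucket.tryAcquire()];
fakeNowMs += 1_000; // one second later: one token has been refilled
const afterOneSecond = [bucket.tryAcquire(), bucket.tryAcquire()];
fakeNowMs += 60_000; // a long idle period must not overfill past capacity
const afterIdle = [bucket.tryAcquire(), bucket.tryAcquire(), bucket.tryAcquire()];
```

With a capacity of 2 and a refill rate of 1 token per second, the third call in the initial burst is refused, one call succeeds after a second, and a minute of idleness restores only two tokens.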

4. Application unit tests

Each use case has a *.spec.ts file with in-memory implementations of all ports defined in APPLICATION_LOGIC §1.
Idempotency: replay same request → same response; replay with different body → 409.
Outbox semantics: every domain event added to outbox in the same transaction as the aggregate change; assert via in-memory transactional repo.
ApplyConsumedEventUseCase matrix: for each consumed subject, given a fixture event, the resulting set of EnqueueNotificationUseCase invocations matches the trigger map.
HITL gate: PublishTemplateVersionUseCase rejects without approverUserId when source='ai_drafted'.
AI fallback: when AIClient.fetchAIDraftedContent throws, the deterministic render is used and a metric is incremented (assert via fake recorder).
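
The idempotency rule above (same request replays the same response, a different body under the same key yields 409) can be sketched with an in-memory store. `InMemoryIdempotencyStore`, the `StoredResponse` shape, the 202 status, and the error code string are hypothetical names chosen for this example; the real ports are those defined in APPLICATION_LOGIC §1.

```typescript
// Hypothetical sketch of an in-memory idempotency port for application unit tests.
type StoredResponse = { status: number; body: unknown };

class InMemoryIdempotencyStore {
  private readonly entries = new Map<string, { fingerprint: string; response: StoredResponse }>();

  execute(key: string, requestBody: unknown, handler: () => StoredResponse): StoredResponse {
    // Assumption: JSON.stringify is a good enough fingerprint for a test double.
    // It is sensitive to property order, so a real implementation would canonicalise first.
    const fingerprint = JSON.stringify(requestBody);
    const existing = this.entries.get(key);
    if (existing) {
      return existing.fingerprint === fingerprint
        ? existing.response // replay: return the stored response, do not re-run the handler
        : { status: 409, body: { error: "idempotency_key_reused" } };
    }
    const response = handler();
    this.entries.set(key, { fingerprint, response });
    return response;
  }
}

let handlerCalls = 0;
const store = new InMemoryIdempotencyStore();
const handle = (): StoredResponse => {
  handlerCalls += 1;
  return { status: 202, body: { notificationId: `ntf_${handlerCalls}` } };
};

const first = store.execute("key-1", { to: "a@example.test" }, handle);
const replay = store.execute("key-1", { to: "a@example.test" }, handle);
const conflict = store.execute("key-1", { to: "b@example.test" }, handle);
```

A spec built on this double would assert that `replay` equals `first`, that the handler ran exactly once, and that `conflict.status` is 409.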

5. Contract tests

5.1 HTTP / OpenAPI

openapi.v1.yaml is the source of truth. The contract test:

Boots NestJS in test mode with stubbed deps.
Iterates every operation in the spec.
Sends representative requests (positive + each documented error code).
Asserts response shape against the spec via @apidevtools/swagger-cli + ajv.

A separate test (openapi-codegen-drift.test.ts) regenerates types from the spec and asserts identity with the committed types — drift fails CI.

5.2 Published events

Every subject in EVENT_SCHEMAS §3 has a JSON Schema in event-schemas/.... The contract test:

Loads our published-event factory.
Generates 100 representative payloads (boundary + property-based via fast-check).
Validates each against the JSON Schema.
Publishes each payload to a local broker. Downstream consumer fixtures in tests/contracts/events/published/<subject>.consumer.fixture.ts express the expected interpretation, and the internal event-pact runner asserts each payload against them.

5.3 Consumed events

For each subject in EVENT_SCHEMAS §5:

Load the upstream service's published JSON Schema (vendored in event-schemas/).
Parse with our tolerant zod schema.
Assert action set (e.g., for reservation.confirmed.v1 → expected list of EnqueueNotificationUseCase calls).
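
The three steps above can be sketched as follows. This is a dependency-free illustration: `parseTolerant` stands in for the tolerant zod schema, and the event field names, the `triggerMap` contents, and the template keys are assumptions made for the example rather than the actual trigger map from EVENT_SCHEMAS.

```typescript
// Hypothetical sketch: tolerant parsing plus a trigger-map assertion for a consumed subject.
type Channel = "email" | "sms" | "whatsapp";
type EnqueueCall = { templateKey: string; channel: Channel; recipientId: string };
type ParsedEvent = { subject: string; tenantId: string; recipientId: string };

// Tolerant: require only the fields we read, ignore anything the upstream adds later.
function parseTolerant(raw: unknown): ParsedEvent | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  if (typeof r.subject !== "string" || typeof r.tenantId !== "string" || typeof r.recipientId !== "string") {
    return null;
  }
  return { subject: r.subject, tenantId: r.tenantId, recipientId: r.recipientId };
}

// Assumed trigger map: subject -> notifications to enqueue.
const triggerMap: Record<string, Array<{ templateKey: string; channel: Channel }>> = {
  "reservation.confirmed.v1": [
    { templateKey: "reservation.confirmed", channel: "email" },
    { templateKey: "reservation.confirmed", channel: "sms" },
  ],
};

function planEnqueues(raw: unknown): EnqueueCall[] {
  const event = parseTolerant(raw);
  if (!event) return [];
  return (triggerMap[event.subject] ?? []).map((t) => ({ ...t, recipientId: event.recipientId }));
}

// Fixture with an extra field the parser has never seen.
const planned = planEnqueues({
  subject: "reservation.confirmed.v1",
  tenantId: "tnt_1",
  recipientId: "rcp_1",
  someFutureField: true,
});
const unknownSubject = planEnqueues({ subject: "unknown.v1", tenantId: "tnt_1", recipientId: "rcp_1" });
const malformed = planEnqueues({ subject: "reservation.confirmed.v1" });
```

The contract test then compares `planned` with the expected call list for the subject, and checks that unknown subjects and malformed payloads produce no calls.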

5.4 Vendor webhook contracts

Per vendor in tests/contracts/webhooks/<vendor>/:

valid-signature.test.ts: real-world body/headers fixtures pass HMAC.
invalid-signature.test.ts: tampered/missing header → 401.
replay.test.ts: same body twice → second produces no additional state change.
events-parsing.test.ts: vendor's enum values map to our internal types.
headers-skew.test.ts: > 5 min skew → reject.
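
A minimal sketch of the checks these files exercise, using Node's built-in `node:crypto`. The signing scheme here (HMAC-SHA256 over `timestamp.body`, hex-encoded) is an assumption for illustration only; each real vendor defines its own scheme, and the function and type names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Verdict = "accepted" | "duplicate" | "invalid_signature" | "stale";

const MAX_SKEW_MS = 5 * 60 * 1000; // reject anything more than 5 minutes off

// Assumed scheme: HMAC-SHA256 over "<timestampMs>.<body>", hex-encoded.
function sign(secret: string, timestampMs: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestampMs}.${body}`).digest("hex");
}

function verifyWebhook(args: {
  secret: string;
  body: string;
  signatureHex: string | undefined;
  timestampMs: number;
  nowMs: number;
  seen: Set<string>; // signatures already applied, used for replay detection
}): Verdict {
  const { secret, body, signatureHex, timestampMs, nowMs, seen } = args;
  if (!signatureHex) return "invalid_signature";

  const expected = Buffer.from(sign(secret, timestampMs, body), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
    return "invalid_signature";
  }
  if (Math.abs(nowMs - timestampMs) > MAX_SKEW_MS) return "stale";
  if (seen.has(signatureHex)) return "duplicate"; // replay: no further state change
  seen.add(signatureHex);
  return "accepted";
}

const secret = "test-secret";
const body = '{"event":"delivered"}';
const t0 = 1_700_000_000_000;
const seen = new Set<string>();
const good = sign(secret, t0, body);

const firstDelivery = verifyWebhook({ secret, body, signatureHex: good, timestampMs: t0, nowMs: t0, seen });
const replayed = verifyWebhook({ secret, body, signatureHex: good, timestampMs: t0, nowMs: t0, seen });
const tampered = verifyWebhook({ secret, body: body + " ", signatureHex: good, timestampMs: t0, nowMs: t0, seen });
const missing = verifyWebhook({ secret, body, signatureHex: undefined, timestampMs: t0, nowMs: t0, seen });
const skewed = verifyWebhook({
  secret, body, signatureHex: good, timestampMs: t0, nowMs: t0 + MAX_SKEW_MS + 1, seen: new Set(),
});
```

In the HTTP layer, `invalid_signature` would map to the 401 described above; how `stale` and `duplicate` are reported to the vendor is left to the handler.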

Vendor sample fixtures are anonymised real captures (license-clean) committed under tests/fixtures/webhooks/<vendor>/.

6. Integration tests (Testcontainers)

tests/integration/ boots real Postgres/Redis/Pub/Sub-emulator/GCS-emulator per file. Two families:

6.1 Use-case slices

Each use case has at least one slice that:

Migrates a fresh DB schema.
Sets app.tenant_id to a fixture tenant.
Inserts seed projections (tenants_local, recipients, templates).
Executes the use case via the application layer.
Asserts DB state, outbox rows, Pub/Sub-emulator messages, and Redis side-effects.
Re-runs to assert idempotency.

6.2 Flow tests

End-to-end domain flows:

Booking-confirmation flow: publish reservation.confirmed.v1 → consume → enqueue email+sms+whatsapp per preference → render → dispatch through MockEmailPort etc. → simulate vendor-webhook callback → assert delivered.v1 and DB state.
Mobile-key delivery flow: lock_integration.key_credential.issued.v1 → mobile_key.issued.whatsapp with token reference → simulated WhatsApp accepted → delivered.v1.
Dunning flow: billing.subscription.payment_failed.v1 → 3 scheduled rows → tick scheduler → 3 dispatched.
AI-drafted template publish flow: emit ai.draft_content.ready.v1 → register draft → publish without approver → 403 → publish with approver → template.published.v1.
Suggest-only flow: tenant policy suggest_only → enqueue creates scheduled notification + admin in-app review → admin approves → dispatched.
Webhook bounce → suppression flow: mock SendGrid bounce webhook → dedupe → suppression row + bounced.v1 + suppressed.v1.
Quiet-hours deferral flow: enqueue at 23:30 Asia/Kabul with quiet-hours 22–07 → scheduled for 07:00 → scheduler tick at 07:00 → dispatched.
Cancellation propagation flow: scheduled pre-arrival reminder + reservation.cancelled.v1 → row marked obsolete → no send.
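
The arithmetic behind the quiet-hours deferral flow can be checked in isolation. The sketch below works on local wall-clock minutes and assumes the caller has already converted the enqueue instant to the recipient's time zone; the function name and signature are hypothetical.

```typescript
/** Minutes since local midnight, 0..1439. */
type MinuteOfDay = number;

const hm = (hours: number, minutes: number): MinuteOfDay => hours * 60 + minutes;

/**
 * Returns how many minutes a send must be deferred so that it lands at the end
 * of the quiet window, or 0 when the local time is outside the window.
 * Handles windows that wrap past midnight (for example 22:00 to 07:00).
 */
function deferralMinutes(localNow: MinuteOfDay, quietStart: MinuteOfDay, quietEnd: MinuteOfDay): number {
  const wraps = quietStart > quietEnd;
  const inQuiet = wraps
    ? localNow >= quietStart || localNow < quietEnd
    : localNow >= quietStart && localNow < quietEnd;
  if (!inQuiet) return 0;
  return (quietEnd - localNow + 1440) % 1440;
}

// The scenario from the flow test: 23:30 local, quiet hours 22:00 to 07:00.
const lateEvening = deferralMinutes(hm(23, 30), hm(22, 0), hm(7, 0)); // 450 minutes, i.e. 07:00
const justBeforeEnd = deferralMinutes(hm(6, 59), hm(22, 0), hm(7, 0));
const atWindowEnd = deferralMinutes(hm(7, 0), hm(22, 0), hm(7, 0));
const midday = deferralMinutes(hm(12, 0), hm(22, 0), hm(7, 0));
```

At 23:30 the deferral is 30 minutes to midnight plus 420 minutes to 07:00, so 450 minutes in total; at exactly 07:00 the window has ended and nothing is deferred.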

6.3 RLS tests

A dedicated suite asserts:

Cross-tenant SELECT returns zero rows even with crafted SQL.
The app role does not have BYPASSRLS and cannot switch to a role that does (asserted by attempting SET ROLE).
Drizzle middleware sets app.tenant_id per transaction; without it, queries return zero rows.

7. End-to-end through BFF

E2E slices live in the melmastoon-e2e suite (separate repo path):

Staff logs into backoffice → opens reservation rsv_* → clicks "Resend confirmation" → expects a new notification to appear in the audit panel within 5 s, status reaching delivered (vendor stub returns delivered immediately).
Staff publishes a tenant template override → preview rendered → test-send delivered to staff's own email → audit row.
Guest receives an opt-out URL in the synthetic email → opens link → confirms → returns to app → marketing toggle is off; future marketing send is suppressed.
Marketing manager creates a marketing batch with 200-row segment → schedule for T+5 min → wait → batch completes; per-row delivery audit visible.

8. Performance tests (k6)

Scenario	Target
Enqueue burst	2 000 req/s for 5 min, single tenant; p95 ≤ 350 ms; zero 5xx
Enqueue sustained	500 req/s for 30 min, 50 tenants; CPU ≤ 60 %; p95 ≤ 250 ms
Dispatch throughput	5 000 msgs/min/channel against vendor stub; queue drains within 60 s
Webhook ingestion	3 000 req/s/vendor for 5 min; p95 ≤ 120 ms; zero loss
WS feed	10 000 concurrent connections per region; 50 events/s push; p95 push latency ≤ 150 ms

Performance gates run nightly on staging with a synthetic dataset; regressions > 15 % fail the build the next morning.
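
One way to express the 15 % regression rule is as a comparison against a stored baseline. How the baseline is chosen and which metrics are compared are not specified here, so the `Metric` shape and both function names below are assumptions made for illustration.

```typescript
// Hypothetical sketch of a nightly regression gate.
type Metric = { name: string; baseline: number; current: number; higherIsBetter: boolean };

/** Positive values mean the metric got worse relative to its baseline. */
function regressionPercent(m: Metric): number {
  const worsening = m.higherIsBetter ? m.baseline - m.current : m.current - m.baseline;
  return (worsening * 100) / m.baseline;
}

/** Names of metrics that regressed by strictly more than the threshold. */
function failingMetrics(metrics: Metric[], thresholdPercent = 15): string[] {
  return metrics.filter((m) => regressionPercent(m) > thresholdPercent).map((m) => m.name);
}

const failures = failingMetrics([
  { name: "enqueue_p95_ms", baseline: 200, current: 240, higherIsBetter: false }, // 20 % slower
  { name: "dispatch_msgs_per_min", baseline: 5000, current: 4500, higherIsBetter: true }, // 10 % lower
  { name: "webhook_p95_ms", baseline: 100, current: 115, higherIsBetter: false }, // exactly 15 %
]);
```

Only `enqueue_p95_ms` fails in this example: a 10 % drop is within tolerance, and exactly 15 % does not exceed the threshold.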

9. Chaos tests (weekly)

Experiment	Expected behaviour
Kill primary Cloud SQL → forced failover	enqueue requests retry through Cloud SQL connector reconnect; backlog drains within 5 min
Pub/Sub topic 503 for 60 s	outbox relay backs off; dispatch worker keeps running; on recovery, no duplicate publishes (event id stable)
Vendor (SendGrid) returns 503 for 5 min	dispatch worker retries with backoff; channel health flips to `degraded` then `down`; fallback vendor takes over; alert fires
Vendor returns 4xx for invalid recipient at 50 % rate	suppressions added for those addresses; main funnel unaffected
Memorystore failover	brief enqueue latency spike; suppression check falls through to DB; no incorrect sends
Webhook flood (100k req/min)	Cloud Armor throttles; `webhook_inbound` write rate caps; no DB OOM
Clock skew on a worker (+10 min)	rate-limit windows still consistent (DB-side counters); affected pod self-detects + alerts

Each chaos experiment has a runbook entry in FAILURE_MODES.
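
The "no duplicate publishes (event id stable)" expectation can be illustrated with a small simulation. It assumes that deduplication is keyed on the event id and that a retry reuses the id of the original outbox row; the broker double, the backoff parameters, and all names are hypothetical and do not describe the actual outbox relay.

```typescript
// Hypothetical sketch: retry with capped exponential backoff and a stable event id.
type OutboxRow = { eventId: string; payload: string };

/** Test double: stores the message, but loses the acknowledgement for the first N calls. */
class AckLosingBroker {
  readonly delivered = new Map<string, string>(); // keyed by event id, so retries dedupe
  private readonly lostAcks: number;
  private calls = 0;

  constructor(lostAcks: number) {
    this.lostAcks = lostAcks;
  }

  publish(row: OutboxRow): void {
    this.calls += 1;
    if (!this.delivered.has(row.eventId)) this.delivered.set(row.eventId, row.payload);
    if (this.calls <= this.lostAcks) throw new Error("503: acknowledgement lost");
  }
}

function backoffMs(attempt: number, baseMs = 200, capMs = 10_000): number {
  return Math.min(capMs, baseMs * 2 ** (attempt - 1));
}

/** Publishes one row, returning the backoff delays it would have waited between attempts. */
function relayOne(row: OutboxRow, broker: AckLosingBroker, maxAttempts = 5): number[] {
  const waits: number[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      broker.publish(row); // same eventId on every attempt
      return waits;
    } catch {
      waits.push(backoffMs(attempt)); // a real relay would sleep here
    }
  }
  throw new Error(`gave up on ${row.eventId}`);
}

const broker = new AckLosingBroker(2);
const waits = relayOne({ eventId: "evt_1", payload: "{}" }, broker);
const cappedDelay = backoffMs(10);
```

Two lost acknowledgements lead to three publish attempts, yet the broker holds a single copy of `evt_1` because every attempt carried the same id.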

10. Security testing

SAST: eslint-plugin-security, semgrep rules for Node + secrets pattern.
Dependency scan: npm audit, osv-scanner, Snyk; CVEs ≥ high block release.
Container scan: Trivy on every image; CVEs ≥ high block release.
DAST: ZAP baseline against staging weekly; targeted scans on the public webhook + opt-out endpoints.
Secret scan: gitleaks pre-commit + nightly history scan.
Pentest: annual third-party (see SECURITY_MODEL §14).
AI-specific: prompt-injection corpus (~200 cases) run nightly against AIClient.fetchAIDraftedContent with the orchestrator stub configured to forward to a real model under a low-cost canary tenant.

11. Test data

Synthetic data factory @melmastoon/test-fixtures-notification produces deterministic recipients, templates, channels, and scheduled rows.
No production data is ever copied into pre-prod environments. A pre-prod refresh seeds from the synthetic factory.
Personal devices used for "real" channel testing belong to platform staff and are listed in an allowlist; sends to these devices are categorised as system.

12. CI gates

Gate	Required
Lint + format	✅
Unit + application	✅
Contract (HTTP, events, webhooks)	✅
Integration (use-case slices)	✅
Coverage thresholds (85 % lines, 80 % branches; domain ≥ 95 %; use cases ≥ 90 %)	✅
OpenAPI drift	✅
Event-schema drift	✅
Migrations dry-run on staging snapshot	✅
`pii-grep` scan	✅
SAST + secret scan + dependency scan	✅
Container scan	✅
E2E (booking-confirmation, opt-out, template publish)	✅ on `release/*` and main
Performance gates	✅ on `release/*`

13. Production verification

Post-deploy smoke (run automatically by cloud-deploy after a rollout):

GET /api/v1/internal/health returns 200 ok from a fresh pod.
Send a synthetic notification (test tenant) end-to-end; expect delivered.v1 within 60 s.
Send a synthetic webhook (test vendor + signed payload); expect applied.
Outbox lag p95 ≤ 1 s for 5 min.

If any check fails, the rollout is automatically rolled back per DEPLOYMENT_TOPOLOGY §6.
