TESTING_STRATEGY — payment-gateway-service

Sibling: APPLICATION_LOGIC · LOCAL_DEV_SETUP · DEPLOYMENT_TOPOLOGY

Testing for payment-gateway-service is layered to give us deterministic, fast feedback on domain behavior while still validating real vendor integrations before each release. Money services demand high coverage and zero tolerance for silent failures, so the gates are explicit, scriptable, and run in CI.

1. Pyramid

        e2e (real vendor sandboxes)        — nightly, ~5%
   ──────────────────────────────────
   contract  (vendor recordings + Pact)    — every PR, ~15%
  ──────────────────────────────────────
  integration  (Postgres + Pub/Sub + mocks) — every PR, ~25%
 ────────────────────────────────────────────
 unit (domain + use cases + adapters)      — every commit, ~55%

Coverage gates: lines ≥ 90%, branches ≥ 85%, mutation score ≥ 70% (Stryker on the domain & application layers). The adapter shells are excluded from mutation testing since they're thin wrappers; their logic is exercised by recorded contract tests.

2. Unit tests

Tooling: Vitest + fast-check (property-based) + ts-mockito (only where DI of ports requires it). Each test file mirrors the source file path under test/.

2.1 Domain layer (DOMAIN_MODEL)

State machine transitions for Transaction, Webhook, Chargeback enumerated and exhaustively tested (describe.each).
Money invariants verified via property-based tests: addition is commutative, currency mismatch throws, no negative micro counts, no precision drift across 10^9 random pairs.
Domain errors carry stable codes matching MELMASTOON.PAYMENT.* constants (snapshot test against ERROR_CODES.md).
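
The Money invariants above can be sketched as a seeded property check. This is a dependency-free illustration, not the suite's actual code: the Money shape, the money and add helpers, and checkMoneyProperties are hypothetical names, and the real tests would express the same properties with fast-check arbitraries rather than a hand-rolled generator.

```typescript
// Hypothetical Money value object: a non-negative integer count of micro-units plus a currency code.
type Money = { readonly micros: number; readonly currency: string };

function money(micros: number, currency: string): Money {
  if (!Number.isSafeInteger(micros) || micros < 0) {
    throw new RangeError('micros must be a non-negative safe integer');
  }
  return { micros, currency };
}

function add(a: Money, b: Money): Money {
  if (a.currency !== b.currency) throw new TypeError('currency mismatch');
  return money(a.micros + b.micros, a.currency);
}

// mulberry32: a small deterministic PRNG so a failing run can be replayed from its seed.
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Checks commutativity and exact round-tripping on random pairs; throws on the first violation.
function checkMoneyProperties(runs: number, seed: number): number {
  const rand = mulberry32(seed);
  const maxMicros = 1e12; // keeps every sum far below Number.MAX_SAFE_INTEGER
  for (let i = 0; i < runs; i++) {
    const a = money(Math.floor(rand() * maxMicros), 'USD');
    const b = money(Math.floor(rand() * maxMicros), 'USD');
    const ab = add(a, b);
    const ba = add(b, a);
    if (ab.micros !== ba.micros) throw new Error(`not commutative (seed ${seed}, run ${i})`);
    if (ab.micros - b.micros !== a.micros) throw new Error(`precision drift (seed ${seed}, run ${i})`);
  }
  return runs;
}
```

Storing amounts as integer micro-units is what makes the "no precision drift" property checkable with exact equality.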

2.2 Application layer (APPLICATION_LOGIC)

At minimum, each use case has six tests: happy path, idempotency replay, domain rejection, adapter failure, compensation path, and concurrency conflict.
Ports are stubbed with in-memory implementations from src/test/fixtures/ports/; the same fixtures power the integration layer.
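
As an illustration of what an in-memory port fixture might look like, here is a minimal sketch. The TransactionRepository interface, the Transaction shape, and the snapshot helper are assumptions made for this example; the real port definitions live in the application layer.

```typescript
// Hypothetical port and entity shape, for illustration only.
type Transaction = {
  id: string;
  status: 'authorized' | 'captured' | 'refunded';
  version: number;
};

interface TransactionRepository {
  findById(id: string): Promise<Transaction | undefined>;
  save(tx: Transaction): Promise<void>;
}

// In-memory stand-in for the Postgres-backed adapter.
class InMemoryTransactionRepository implements TransactionRepository {
  private readonly rows = new Map<string, Transaction>();

  async findById(id: string): Promise<Transaction | undefined> {
    const row = this.rows.get(id);
    // Return a copy so a test cannot mutate stored state by accident.
    return row ? { ...row } : undefined;
  }

  async save(tx: Transaction): Promise<void> {
    this.rows.set(tx.id, { ...tx });
  }

  // Test-only helper: lets assertions inspect everything the use case persisted.
  snapshot(): Transaction[] {
    return Array.from(this.rows.values()).map((row) => ({ ...row }));
  }
}
```

Keeping the stub behind the same interface as the real adapter is what allows one fixture to serve both the unit and integration layers.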

2.3 Adapter layer

Decoding fixtures are real but scrubbed vendor responses (no PAN, no CVV) committed under test/fixtures/<vendor>/.
Adapter logic tested via parametrized cases: success, decline, network error, malformed body, signature failure (for webhook signers).
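
The parametrized style described above amounts to a table of cases run through one decoder. The sketch below is dependency-free and hypothetical: decodeVendorResponse, the Outcome union, and the payload field names are invented for illustration and do not reflect any vendor's real schema. In the suite itself the table would feed Vitest's it.each; network errors and signature failures are left out here for brevity.

```typescript
type Outcome =
  | { kind: 'success'; vendorRef: string }
  | { kind: 'declined'; reason: string }
  | { kind: 'malformed' };

// Hypothetical decoder: maps a raw vendor body to a domain-level outcome.
function decodeVendorResponse(raw: string): Outcome {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    return { kind: 'malformed' };
  }
  if (typeof body !== 'object' || body === null) return { kind: 'malformed' };
  const fields = body as Record<string, unknown>;
  if (fields.status === 'succeeded' && typeof fields.id === 'string') {
    return { kind: 'success', vendorRef: fields.id };
  }
  if (fields.status === 'declined' && typeof fields.reason === 'string') {
    return { kind: 'declined', reason: fields.reason };
  }
  return { kind: 'malformed' };
}

// One row per scenario; adding a scenario means adding a row, not a new test body.
const adapterCases: Array<{ label: string; raw: string; expected: Outcome }> = [
  { label: 'success', raw: '{"status":"succeeded","id":"ch_123"}', expected: { kind: 'success', vendorRef: 'ch_123' } },
  { label: 'decline', raw: '{"status":"declined","reason":"insufficient_funds"}', expected: { kind: 'declined', reason: 'insufficient_funds' } },
  { label: 'malformed body', raw: '<html>502 Bad Gateway</html>', expected: { kind: 'malformed' } },
  { label: 'missing fields', raw: '{"status":"succeeded"}', expected: { kind: 'malformed' } },
];

// Returns the labels of any cases whose decoded outcome differs from the expectation.
function failingAdapterCases(): string[] {
  return adapterCases
    .filter((c) => JSON.stringify(decodeVendorResponse(c.raw)) !== JSON.stringify(c.expected))
    .map((c) => c.label);
}
```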

2.4 PCI scanner unit test

A custom test enforces that no forbidden card-data field names or PAN-shaped digit runs appear in the scanned files:

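import { glob, readFile } from 'node:fs/promises';
import { describe, it, expect } from 'vitest';

describe('PCI: no card-data identifiers in source or fixtures', () => {
  it('contains no PAN-shaped or forbidden field names', async () => {
    // Field names that must never appear in code or committed fixtures.
    const forbidden = /\b(pan|cardnumber|fullnumber|cvv|cvc|cv2|track1|track2|pinblock)\b/i;
    // 13 to 19 digits, optionally separated by spaces or hyphens. Shape only: no Luhn checksum is applied.
    const panShaped = /\b(?:\d[ -]*?){13,19}\b/;
    // fs/promises glob (Node 22+) returns an async iterator, so it needs `for await`.
    // The second pattern covers the vendor fixtures under test/fixtures/<vendor>/ (section 2.3).
    for await (const f of glob(['src/**/*.{ts,json}', 'test/fixtures/**/*.json'])) {
      const txt = await readFile(f, 'utf8');
      expect(txt, `forbidden id in ${f}`).not.toMatch(forbidden);
      expect(txt, `pan-shaped digits in ${f}`).not.toMatch(panShaped);
    }
  });
});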

This test runs both in CI and in the pre-commit hook, and it must pass in both.
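
The digit pattern in the scanner matches on shape only, so long numeric IDs or timestamps can trip it. One possible refinement, shown here as a sketch rather than as part of the existing test, is to apply the Luhn checksum to each candidate before flagging it. The helper names passesLuhn and findLikelyPans are hypothetical.

```typescript
// Luhn checksum: walking from the rightmost digit, double every second digit,
// subtract 9 from any doubled value above 9, and require the total to be divisible by 10.
function passesLuhn(digits: string): boolean {
  if (digits.length === 0) return false;
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (d < 0 || d > 9) return false;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Finds PAN-shaped runs (13 to 19 digits, optional spaces or hyphens) and keeps
// only those that also pass the Luhn checksum.
function findLikelyPans(text: string): string[] {
  const candidates = text.match(/\b(?:\d[ -]*?){13,19}\b/g) ?? [];
  return candidates.map((c) => c.replace(/[ -]/g, '')).filter(passesLuhn);
}
```

The trade-off is that a Luhn filter lets through digit runs that are card-shaped but fail the checksum, so a stricter shape-only gate may still be preferable for a PCI check.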

3. Integration tests

Tooling: Vitest + Testcontainers (Postgres 16, Pub/Sub emulator) + vendor mocks (stripe-mock, paypal-mock, our home-built hesabpay-mock).

3.1 Postgres

Each test spins up a fresh DB, applies central + tenant migrations, and provisions two tenant schemas. Tests assert per-tenant isolation by attempting cross-tenant SQL via the payments_app role and expecting permission denied.
Outbox flush is exercised against the real Pub/Sub emulator.

3.2 Webhook ingestion

For each vendor:

A fixture HTTP request is signed with the right algorithm and a known secret.
Asserted: signature passes, inbox row created, dispatcher applies the right transition, outbound event has the right shape.
Replay: the same envelope sent twice → first applies, second emits duplicate_dropped.v1 and does not double-mutate.
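
The sign, verify, and replay steps above can be sketched as follows. HMAC-SHA256 over the raw payload is an assumption made for the example; the actual algorithm and header format are vendor-specific. WebhookInbox, its ingest method, and the result names are hypothetical stand-ins for the real inbox table and dispatcher.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumed scheme: hex-encoded HMAC-SHA256 of the raw payload with a shared secret.
function signPayload(secret: string, payload: string): string {
  return createHmac('sha256', secret).update(payload).digest('hex');
}

function verifySignature(secret: string, payload: string, signatureHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, payload), 'hex');
  const actual = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

type IngestResult = 'applied' | 'duplicate_dropped' | 'rejected';

// Hypothetical in-memory inbox: remembers envelope ids so a replay is dropped, not re-applied.
class WebhookInbox {
  private readonly seen = new Set<string>();
  private readonly secret: string;

  constructor(secret: string) {
    this.secret = secret;
  }

  ingest(envelopeId: string, payload: string, signatureHex: string): IngestResult {
    if (!verifySignature(this.secret, payload, signatureHex)) return 'rejected';
    if (this.seen.has(envelopeId)) return 'duplicate_dropped';
    this.seen.add(envelopeId);
    return 'applied';
  }
}
```

A test built on this shape signs a fixture with a known secret, ingests it twice, and asserts 'applied' followed by 'duplicate_dropped'.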

3.3 Idempotency

For authorize, capture, refund: a request body is sent with the same Idempotency-Key ten times concurrently; assert exactly one adapter call, ten 2xx responses, identical body.

Same key, different body → 409 MELMASTOON.SYNC.IDEMPOTENCY_KEY_REUSED. The diff is included in the test.
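
A minimal in-process sketch of the behavior these tests assert is shown below. IdempotentExecutor, ApiResponse, and replayTenTimes are hypothetical names, and the Map stands in for whatever durable store the service actually uses; the point is only to show why ten concurrent calls can share one adapter invocation.

```typescript
type ApiResponse = { status: number; body: string };

// Hypothetical executor: the first caller for a key starts the work, later callers
// with the same key and body share its promise, and a different body is a conflict.
class IdempotentExecutor {
  private readonly entries = new Map<string, { requestBody: string; result: Promise<ApiResponse> }>();

  execute(key: string, requestBody: string, call: () => Promise<ApiResponse>): Promise<ApiResponse> {
    const existing = this.entries.get(key);
    if (existing) {
      if (existing.requestBody !== requestBody) {
        return Promise.resolve({ status: 409, body: 'MELMASTOON.SYNC.IDEMPOTENCY_KEY_REUSED' });
      }
      return existing.result;
    }
    const result = call();
    this.entries.set(key, { requestBody, result });
    return result;
  }
}

// Fires ten concurrent requests with one key, then reuses the key with a different body.
async function replayTenTimes(): Promise<{ adapterCalls: number; responses: ApiResponse[]; conflict: ApiResponse }> {
  const executor = new IdempotentExecutor();
  let adapterCalls = 0;
  const callAdapter = async (): Promise<ApiResponse> => {
    adapterCalls += 1;
    await new Promise<void>((resolve) => setTimeout(resolve, 5)); // simulated vendor latency
    return { status: 201, body: '{"transactionId":"tx_1"}' };
  };
  const responses = await Promise.all(
    Array.from({ length: 10 }, () => executor.execute('key-1', '{"amount":100}', callAdapter)),
  );
  const conflict = await executor.execute('key-1', '{"amount":999}', callAdapter);
  return { adapterCalls, responses, conflict };
}
```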

3.4 Double-charge attempt

A reservation issues two held.v1 events with the same paymentMethodId and overlapping windows → assert exactly one captured transaction (the second is short-circuited at the saga inbox dedupe step).

3.5 Concurrency

Two callers simultaneously call RefundPaymentUseCase for the same payment with different reasons. Assert exactly one succeeds, the other gets 409 MELMASTOON.PAYMENT.OPTIMISTIC_CONFLICT. Domain invariants (balance never goes negative) verified after the storm with a stochastic 100-run test.
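
The conflict outcome relies on optimistic locking. The sketch below assumes a version column checked at write time, which is one common way to implement it; VersionedPaymentStore, compareAndSet, and refundFromSnapshot are hypothetical names and not the service's real repository API.

```typescript
type PaymentRow = { capturedMicros: number; refundedMicros: number; version: number };

type RefundResult =
  | { ok: true }
  | { ok: false; status: 409; code: 'MELMASTOON.PAYMENT.OPTIMISTIC_CONFLICT' };

// Hypothetical single-row store with a version counter.
class VersionedPaymentStore {
  private row: PaymentRow;

  constructor(initial: PaymentRow) {
    this.row = { ...initial };
  }

  read(): PaymentRow {
    return { ...this.row };
  }

  // Writes only if nobody has bumped the version since the caller's read.
  compareAndSet(expectedVersion: number, next: { capturedMicros: number; refundedMicros: number }): boolean {
    if (this.row.version !== expectedVersion) return false;
    this.row = { ...next, version: expectedVersion + 1 };
    return true;
  }
}

// Applies a refund computed from a previously read snapshot.
function refundFromSnapshot(store: VersionedPaymentStore, snapshot: PaymentRow, micros: number): RefundResult {
  const refundedMicros = snapshot.refundedMicros + micros;
  if (refundedMicros > snapshot.capturedMicros) {
    throw new RangeError('refund exceeds captured amount');
  }
  const written = store.compareAndSet(snapshot.version, {
    capturedMicros: snapshot.capturedMicros,
    refundedMicros,
  });
  return written
    ? { ok: true }
    : { ok: false, status: 409, code: 'MELMASTOON.PAYMENT.OPTIMISTIC_CONFLICT' };
}
```

When two callers read the same snapshot and both attempt a full refund, the first write bumps the version and the second is rejected, so the remaining balance cannot go below zero.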

4. Contract tests

Tooling: Pact for our consumer/provider pairs (reservation-service and billing-service consume our events; we consume reservation-service events). Per-vendor:

Stripe: pinned SDK version, recorded fixtures from sandbox covering all event types we consume; replayed via stripe-mock in unit/integration. The actual Stripe API contract is verified nightly via the e2e suite (§6).
PayPal: similar; we record fixtures monthly into test/fixtures/paypal/.
HesabPay: we own the mock since they have no public mock; the mock is reviewed against their docs quarterly.

A failing contract test is not auto-fixed; it triggers a vendor-update task and a feature-flag rollback if needed.

5. Saga / event tests

Using the in-memory event bus harness, the booking saga is replayed end-to-end:

held.v1 → AuthorizePayment → transaction.authorized.v1
confirmed.v1 → CapturePayment → transaction.captured.v1
cancelled.v1 → RefundPayment → transaction.refunded.v1

Compensation paths are explicitly tested: when capture fails, reservation-service receives transaction.failed.v1 and emits cancelled.v1, and payment-gateway-service then handles the resulting void.
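
To make the replay concrete, here is a small synchronous sketch of an in-memory bus with the three mappings wired up. InMemoryEventBus, SagaEvent, and wirePaymentSaga are hypothetical names, and the handlers publish the outcome event directly instead of invoking the real use cases. The failed-capture branch only emits transaction.failed.v1; the void handling is not modeled.

```typescript
type SagaEvent = { type: string; reservationId: string };

// Hypothetical harness: delivers events synchronously and records everything published.
class InMemoryEventBus {
  readonly published: SagaEvent[] = [];
  private readonly handlers = new Map<string, Array<(e: SagaEvent) => void>>();

  subscribe(type: string, handler: (e: SagaEvent) => void): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event: SagaEvent): void {
    this.published.push(event);
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

// Wires the reservation events to the payment outcomes listed above.
function wirePaymentSaga(bus: InMemoryEventBus, captureSucceeds: boolean): void {
  bus.subscribe('held.v1', (e) =>
    bus.publish({ type: 'transaction.authorized.v1', reservationId: e.reservationId }));
  bus.subscribe('confirmed.v1', (e) =>
    bus.publish({
      type: captureSucceeds ? 'transaction.captured.v1' : 'transaction.failed.v1',
      reservationId: e.reservationId,
    }));
  bus.subscribe('cancelled.v1', (e) =>
    bus.publish({ type: 'transaction.refunded.v1', reservationId: e.reservationId }));
}
```

A test then publishes the reservation events in order and asserts on the recorded sequence in bus.published.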

6. End-to-end (sandbox)

Nightly cron in CI against vendor sandboxes:

Tokenize a Stripe test card (4242 4242 4242 4242) via headless browser hitting the BFF.
Authorize → 3DS challenge accepted programmatically (Stripe pi_*_redirect_required + confirm_payment_intent).
Capture, then refund, then verify webhook receipt.
Same for PayPal sandbox and HesabPay sandbox.
Reconciliation pulls the sandbox settlement report and matches the test transactions.

A failure pages the on-call engineer; a recovered passing run auto-clears the alert.

7. PCI compliance scan

Static: the unit-test scanner above runs on every PR.
Secrets: gitleaks with a custom payments ruleset runs on every PR and on push to main.
Dynamic: ASV scan quarterly on the public surface (handled by SecOps; we participate in remediation).
Penetration test: annual; results tracked in SERVICE_RISK_REGISTER.md.

8. Performance & load

k6 scripts under test/load/ exercise:
- 100 RPS authorize for 10 min against stripe-mock — assert p99 < 1500 ms.
- Burst 500 RPS reads (transaction GET) — assert p99 < 200 ms.
- 1000 webhooks/min sustained — assert inbox lag < 60 s.
These run weekly in a perf environment; thresholds gate any architectural change touching the hot path.

9. Mutation testing

Stryker is configured for the domain and application layers. New code paths must clear the 70% mutation score. Mutators excluded: string-literal mutators (noisy), arithmetic mutators on Money (covered exhaustively by property-based tests).
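
A configuration along these lines would match the description. It is a sketch under assumptions: the source globs and the high/low thresholds are illustrative, only the break value of 70 comes from the gate above, and in a real project the object would be the default export of the Stryker config file, with the Vitest runner plugin installed.

```typescript
// Illustrative StrykerJS options; paths and the high/low values are assumptions.
const strykerConfig = {
  testRunner: 'vitest',
  // Only the domain and application layers are mutated; adapter shells are left out.
  mutate: ['src/domain/**/*.ts', 'src/application/**/*.ts'],
  // String-literal mutants are excluded globally because they are noisy.
  mutator: { excludedMutations: ['StringLiteral'] },
  // `break` fails the run when the mutation score drops below 70.
  thresholds: { high: 85, low: 75, break: 70 },
};
```

Because excludedMutations applies to every mutated file, the Money-only arithmetic exclusion would more likely be done with in-source "// Stryker disable ArithmeticOperator" comments in the Money module than in this global list.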

10. Test data hygiene

Synthetic guests, properties, and tenants are minted via @melmastoon/test-data factories.
No production data is ever copied into test environments.
Fixtures are scanned for PAN-shaped digits; the scanner blocks commits.

11. Local test runner

pnpm test                  # unit + integration (Testcontainers)
pnpm test:contract         # Pact + recorded vendor fixtures
pnpm test:load             # k6 against local stripe-mock
pnpm test:e2e:sandbox      # against real vendor sandboxes (requires creds)
pnpm pci:scan              # standalone PCI hygiene
pnpm coverage              # full coverage report

CI runs the first three on every PR; the e2e:sandbox runs nightly and on release branches.
