
tenant-service — TESTING_STRATEGY

Companion: DOMAIN_MODEL · APPLICATION_LOGIC · SECURITY_MODEL · SERVICE_READINESS

Tenant-service ships only with a green test suite that includes the platform-mandated tenant-isolation, RBAC, and outbox/inbox contract tests. Anything less blocks merge.


1. Pyramid

                    ┌────────────┐
                    │    E2E     │   ~3 %  (cross-service journeys)
                    │ Playwright │
                    └────────────┘
               ┌──────────────────────┐
               │       Contract       │   ~7 %
               │ Pact (API + events)  │
               └──────────────────────┘
          ┌─────────────────────────────────┐
          │           Integration           │   ~30 %
          │ Postgres + Redis Testcontainers │
          └─────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│                        Unit                         │   ~60 %
│   Pure-domain (Vitest), no IO, no time, no random   │
└─────────────────────────────────────────────────────┘

Coverage gates: statements ≥ 85 %, branches ≥ 80 %, per-aggregate domain coverage ≥ 95 %. Coverage below any of these thresholds fails CI.
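A minimal sketch of how the three gates could be evaluated against a coverage summary. The `CoverageSummary` shape and the `coverageViolations` helper are hypothetical names used for illustration; the real pipeline may rely on the coverage tool's built-in thresholds instead.

```typescript
// Hypothetical summary shape; a real pipeline would read this from the coverage report.
interface CoverageSummary {
  statements: number; // percent, 0..100
  branches: number; // percent, 0..100
  aggregates: Record<string, number>; // per-aggregate domain coverage, percent
}

// Thresholds as stated in the coverage gates.
const GATES = { statements: 85, branches: 80, perAggregate: 95 };

// Returns one message per violated gate; an empty list means the gate passes.
function coverageViolations(summary: CoverageSummary): string[] {
  const violations: string[] = [];
  if (summary.statements < GATES.statements) {
    violations.push(`statements ${summary.statements} % < ${GATES.statements} %`);
  }
  if (summary.branches < GATES.branches) {
    violations.push(`branches ${summary.branches} % < ${GATES.branches} %`);
  }
  for (const aggregate of Object.keys(summary.aggregates)) {
    const pct = summary.aggregates[aggregate] ?? 0;
    if (pct < GATES.perAggregate) {
      violations.push(`${aggregate} ${pct} % < ${GATES.perAggregate} %`);
    }
  }
  return violations;
}
```

A CI step built on this would exit non-zero whenever the returned list is non-empty.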


2. Unit Tests (@melmastoon/tenant-domain)

Pure TypeScript. Vitest. No mocks for IO (the domain has no IO). Frozen clock and seeded RNG fixtures.

Key suites:

  • tenant.spec.ts — state machine: pending → active → suspended ⇄ active → closed; rejected transitions throw IllegalTenantStateTransition.
  • tenant-config.spec.ts — invariants TC-1..TC-5; optimistic concurrency on update.
  • organization-unit.spec.ts — kind hierarchy; depth limit; ltree path recompute on move; cycle detection.
  • membership.spec.ts — invariants M-1..M-5; remove path; suspend / reinstate.
  • role.spec.ts — system role immutability; permission grant/revoke; code uniqueness.
  • role-assignment.spec.ts — scope narrowing rule (R-3); last-owner protection.
  • invitation.spec.ts — token compare in constant time; single-use; TTL expiry; rolesProposed flow.
  • policy-engine.spec.ts — RBAC ∧ ABAC composition; deny reasons; predicate library.
  • owner-protection.spec.ts — count-then-block under simulated concurrent removals.
  • role-escalation-guard.spec.ts — cannot grant a permission you do not hold.
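The state machine exercised by tenant.spec.ts can be sketched as a transition table. This is an illustration, not the actual aggregate: it assumes the only legal edges are the four shown in the suite description (so suspended → closed is treated as rejected), and the `transition` function and `ALLOWED` table are hypothetical names.

```typescript
type TenantStatus = "pending" | "active" | "suspended" | "closed";

class IllegalTenantStateTransition extends Error {
  readonly from: TenantStatus;
  readonly to: TenantStatus;

  constructor(from: TenantStatus, to: TenantStatus) {
    super(`Illegal tenant state transition: ${from} -> ${to}`);
    this.name = "IllegalTenantStateTransition";
    this.from = from;
    this.to = to;
  }
}

// Assumed edges: pending -> active, active <-> suspended, active -> closed.
const ALLOWED: Record<TenantStatus, readonly TenantStatus[]> = {
  pending: ["active"],
  active: ["suspended", "closed"],
  suspended: ["active"],
  closed: [],
};

// Returns the new status, or throws when the edge is not in the table.
function transition(from: TenantStatus, to: TenantStatus): TenantStatus {
  if (!ALLOWED[from].includes(to)) {
    throw new IllegalTenantStateTransition(from, to);
  }
  return to;
}
```

A table-driven form keeps the spec short: the test can iterate over every (from, to) pair and assert that exactly the listed edges succeed.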

Property-based fuzz (Fast-Check):

  • OrganizationUnit.move cannot create a cycle for any random tree.
  • PolicyEngine.check is monotonic: removing a permission cannot turn deny into allow.
  • Invitation.accept returns at most one MembershipCreated.
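The first property can be illustrated without any library. The sketch below replaces Fast-Check with a small seeded PRNG loop so that it stays self-contained; the parent-map tree model and the `move`, `isAcyclic`, and `checkMoveProperty` functions are simplified stand-ins, not the real OrganizationUnit aggregate.

```typescript
// Small deterministic PRNG (mulberry32) so every run explores the same cases.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// node id -> parent id; node 0 is the root and has no parent.
type ParentMap = Map<number, number | null>;

// Node i (i > 0) gets a random parent among 0..i-1, so the result is always a tree.
function randomTree(size: number, rand: () => number): ParentMap {
  const parents = new Map<number, number | null>([[0, null]]);
  for (let i = 1; i < size; i++) parents.set(i, Math.floor(rand() * i));
  return parents;
}

// True when `ancestor` is `node` itself or lies on the path from `node` to the root.
function isAncestorOrSelf(parents: ParentMap, ancestor: number, node: number): boolean {
  let current: number | null = node;
  while (current !== null) {
    if (current === ancestor) return true;
    current = parents.get(current) ?? null;
  }
  return false;
}

// Guarded move: refuses to re-parent a node under itself or one of its descendants.
function move(parents: ParentMap, node: number, newParent: number): boolean {
  if (node === 0 || isAncestorOrSelf(parents, node, newParent)) return false;
  parents.set(node, newParent);
  return true;
}

// Every node must reach the root within `size` steps; otherwise a cycle exists.
function isAcyclic(parents: ParentMap): boolean {
  for (let start = 0; start < parents.size; start++) {
    let current: number | null = start;
    let steps = 0;
    while (current !== null) {
      if (++steps > parents.size) return false;
      current = parents.get(current) ?? null;
    }
  }
  return true;
}

// Property: after any sequence of guarded moves on any random tree, no cycle exists.
function checkMoveProperty(runs: number, seed: number): boolean {
  const rand = mulberry32(seed);
  for (let r = 0; r < runs; r++) {
    const size = 2 + Math.floor(rand() * 30);
    const tree = randomTree(size, rand);
    for (let m = 0; m < 50; m++) {
      move(tree, Math.floor(rand() * size), Math.floor(rand() * size));
      if (!isAcyclic(tree)) return false;
    }
  }
  return true;
}
```

A property-testing library adds shrinking of failing cases, which this loop does not attempt.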

3. Integration Tests

Run with Testcontainers: ephemeral Postgres 16 + Memorystore-compatible Redis (redis:7-alpine) + a stub Pub/Sub emulator.

3.1 Two-tenant isolation simulator (mandatory)

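// test/integration/tenant-isolation.spec.ts
// provisionTenant, seed, runAs, seedCount, db and TENANT_TABLES come from the test harness.
test('every read across every table is RLS-isolated', async () => {
  const a = await provisionTenant('a');
  const b = await provisionTenant('b');

  await seed({ tenantId: a.id, /* memberships, roles, configs, ... */ });
  await seed({ tenantId: b.id, /* memberships, roles, configs, ... */ });

  // Everything inside runAs executes with tenant A's RLS context.
  await runAs(a, async () => {
    for (const table of TENANT_TABLES) {
      // count(*) is a bigint; cast to int so drivers that return bigint as a string still compare as numbers.
      const rows = await db.query(`SELECT count(*)::int AS count FROM ${table}`);
      expect(rows[0].count).toBe(seedCount(a, table));
      // Asking for tenant B's rows explicitly must still yield nothing.
      const cross = await db.query(`SELECT count(*)::int AS count FROM ${table} WHERE tenant_id = $1`, [b.id]);
      expect(cross[0].count).toBe(0);
    }
  });
});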

The build fails on any non-zero cross-tenant row count. This test runs on every PR and every 10 minutes in production against the canary seed tenants.

3.2 Outbox / Inbox

  • outbox.spec.ts — domain mutation + outbox row commit atomically; poller dispatches once; retry on transient failure; dead-letter on permanent failure.
  • inbox.spec.ts — duplicate consumer delivery commits inbox row only once; second delivery is a no-op; (consumerName, eventId) uniqueness enforced.
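The inbox behaviour can be sketched with an in-memory stand-in for the inbox table. This is an illustration under stated assumptions: `InMemoryInbox` and `makeIdempotentConsumer` are hypothetical names, and the Set plays the role of the unique (consumerName, eventId) constraint. In the real service the inbox insert and the handler's writes would presumably share one database transaction.

```typescript
interface DomainEvent {
  eventId: string;
  type: string;
}

// Stand-in for the inbox table with a unique (consumerName, eventId) constraint.
class InMemoryInbox {
  private readonly seen = new Set<string>();

  // Returns true only the first time this consumer records this event.
  tryRecord(consumerName: string, eventId: string): boolean {
    const key = JSON.stringify([consumerName, eventId]);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

// Wraps a handler so that a redelivered event becomes a no-op.
function makeIdempotentConsumer(
  consumerName: string,
  inbox: InMemoryInbox,
  handler: (event: DomainEvent) => void,
): (event: DomainEvent) => "processed" | "duplicate" {
  return (event) => {
    if (!inbox.tryRecord(consumerName, event.eventId)) return "duplicate";
    handler(event);
    return "processed";
  };
}
```

Because the key includes the consumer name, two different consumers can each process the same event once.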

3.3 RLS edge cases

  • A background job that sets row_security = off cannot leak rows through a missing tenant_id predicate (a lint test runs static analysis over every file in the repo).
  • When current_setting('app.tenant_id', true) returns an empty string, queries return zero rows (not all rows).

3.4 Ltree integrity

  • 1000-node random tree generated; move operations validated: no cycles, depth respected, paths consistent with parent_id.

3.5 Event consumption

  • OnUserRegistered materializes membership for matching pending invitation; idempotent on duplicate delivery.
  • OnSubscriptionCancelled schedules SuspendTenant after grace; cancelled if OnSubscriptionReactivated arrives within window.
  • OnUserDeleted flips memberships across all tenants; emits membership.removed.v1.

3.6 Sagas

  • CloseTenantSaga collects acks from N stub services; flips status only after all acks; alerts on missing ack at deadline.
  • MovePropertySaga pauses writes (downstream stubs verified), executes move, resumes writes.
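The ack-collection rule for CloseTenantSaga can be sketched as a small tracker. This is a simplified model, not the saga itself: `CloseAckTracker`, its method names, and the millisecond deadline parameter are assumptions made for illustration.

```typescript
type TenantCloseStatus = "closing" | "closed";

class CloseAckTracker {
  private readonly pending: Set<string>;
  private readonly deadlineMs: number;
  private status: TenantCloseStatus = "closing";

  constructor(expectedServices: readonly string[], deadlineMs: number) {
    this.pending = new Set(expectedServices);
    this.deadlineMs = deadlineMs;
  }

  // Duplicate or unexpected acks are harmless; status flips only when nothing is pending.
  ack(service: string): TenantCloseStatus {
    this.pending.delete(service);
    if (this.pending.size === 0) this.status = "closed";
    return this.status;
  }

  // Services still missing once the deadline has passed; a non-empty result would raise the alert.
  overdue(nowMs: number): string[] {
    if (this.status === "closed" || nowMs < this.deadlineMs) return [];
    return Array.from(this.pending).sort();
  }
}
```

The integration test would drive N stub services against such a tracker and assert that the status stays "closing" until the last ack arrives.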

4. Contract Tests

4.1 API contracts (Pact)

tenant-service is a provider for: bff-backoffice-service, gateway, pricing-service, reservation-service, theme-config-service, notification-service, billing-service, property-service. Each consumer publishes its Pact to the broker; CI runs verification on every push to main of either side.

tenant-service is a consumer of: iam-service (/users lookup, pre-register), notification-service (invite send), ai-orchestrator-service (classify / review).

4.2 Event contracts

JSON Schemas in event-schemas/ (melmastoon.tenant.<event>.v1.json). CI validates every published event in integration tests against its schema. Backward-compat regression: load schema v1 from previous release tag; ensure current producer still satisfies it.
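The backward-compat regression can be illustrated with a deliberately reduced check. The sketch below does not implement JSON Schema; it only looks at `required` and primitive `type`, where a real pipeline would use a full validator. The `MiniSchema` type, the `schemaViolations` helper, and the `tenantId` / `membershipId` fields are hypothetical.

```typescript
// Reduced schema model: only `required` and primitive `type` are considered.
interface MiniSchema {
  required: string[];
  properties: Record<string, { type: "string" | "number" | "boolean" | "object" }>;
}

// Returns one message per way the payload fails to satisfy the schema.
function schemaViolations(schema: MiniSchema, payload: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of schema.required) {
    if (!(field in payload)) problems.push(`missing required field: ${field}`);
  }
  for (const field of Object.keys(schema.properties)) {
    const spec = schema.properties[field];
    if (!spec || !(field in payload)) continue;
    const actual = typeof payload[field];
    if (actual !== spec.type) {
      problems.push(`${field}: expected ${spec.type}, got ${actual}`);
    }
  }
  return problems;
}

// Hypothetical v1 schema, as it might be loaded from the previous release tag.
const membershipRemovedV1: MiniSchema = {
  required: ["tenantId", "membershipId"],
  properties: {
    tenantId: { type: "string" },
    membershipId: { type: "string" },
  },
};
```

Under this model, a producer that adds an optional field still satisfies v1, while dropping or retyping a required field is reported as a break.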


5. End-to-End Tests (Playwright)

Run nightly against the staging environment. Cover the user-visible paths:

  • Operator sign-up & owner provision: super-admin provisions tenant, owner accepts invite, lands in Backoffice.
  • Invite a front-desk clerk: owner sends invite, email link opens accept page, new user joins, sees property switcher with one property.
  • Suspend a member after disciplinary action: GM suspends, user sessions revoked, login returns session_revoked.
  • Tenant suspension by billing: billing emits cancelled, after grace tenant flips to suspended, writes blocked, owner sees banner.
  • Chain restructuring: chain operator moves a property between regions; downstream pause + resume verified.
  • Tenant deletion: super-admin starts close saga; downstream services emit acks; tenant flips to closed; sync engine purges desktop cache.

6. Security Tests

| Test | Coverage |
|---|---|
| Two-tenant simulator (CI + prod canary) | RLS leak |
| Role escalation negative tests | every (resource, action) pair attempted by every role; matrix snapshot in CI |
| Cascade delete tests | DELETE /tenants/{id} propagates; partial-ack scenario does not flip status prematurely |
| Last-owner removal under concurrency | 50 parallel DELETE /role-assignments requests; exactly one succeeds, rest get LAST_OWNER_REMOVAL |
| Invitation token replay | accepted/expired/revoked invitations all return distinct, leak-free 409 |
| Token brute force | 10 parallel guesses against valid token hash; constant-time compare; rate limit triggers |
| SAST | Semgrep ruleset melmastoon/tenant; CodeQL JS pack; gates on High |
| DAST | OWASP ZAP baseline against staging; gates on High |
| AuthZ matrix | every endpoint × every role combination snapshot test in CI |
| SQL injection fuzz | sqlmap smoke against known mutation endpoints in staging |
| JWT tampering | flipped tid, exp, device_id claims rejected |
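The constant-time compare named in the token brute-force row can be sketched as follows. This is an illustrative pure-TypeScript version with a hypothetical `constantTimeEqual` name; JavaScript engines give no hard timing guarantees, and on Node.js production code would more likely use `crypto.timingSafeEqual`, which requires equal-length inputs.

```typescript
// Compares two token digests without returning early on the first mismatch.
function constantTimeEqual(a: Uint8Array, b: Uint8Array): boolean {
  // Fold the length difference into the result instead of returning early.
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // Out-of-range reads are treated as 0 so the loop always runs `len` times.
    diff |= (a[i] ?? 0) ^ (b[i] ?? 0);
  }
  return diff === 0;
}
```

The loop accumulates differences with bitwise OR, so the amount of work depends on the input lengths and not on where the first mismatching byte sits.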

7. Performance Tests

k6 scripts (in perf/):

  • authz-check.js — 30 000 rps for 5 min; verify p95 ≤ 20 ms, error rate < 0.05 %.
  • config-read.js — 20 000 rps; verify p95 ≤ 25 ms; cache hit rate ≥ 95 %.
  • invite-burst.js — 100 invites/s for 60 s in one tenant; verify rate-limit kicks in cleanly.
  • membership-list.js — pagination 50/page over 100 k members; verify cursor stability.
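The authz-check.js targets map naturally onto k6 scenario and threshold options. The object below is a sketch only: in a real k6 script it would be exported as `options` next to a default function that issues the request (omitted here), and the `preAllocatedVUs` and `maxVUs` values are assumptions to be tuned per environment.

```typescript
// Sketch of k6 options for authz-check.js: 30 000 rps for 5 min, p95 <= 20 ms, errors < 0.05 %.
const authzCheckOptions = {
  scenarios: {
    authz_check: {
      executor: "constant-arrival-rate", // open model: fixed iteration start rate
      rate: 30000, // iterations started per timeUnit
      timeUnit: "1s",
      duration: "5m",
      preAllocatedVUs: 2000, // assumption: sized for the target rate
      maxVUs: 5000, // assumption
    },
  },
  thresholds: {
    http_req_duration: ["p(95)<=20"], // p95 at or below 20 ms
    http_req_failed: ["rate<0.0005"], // error rate below 0.05 %
  },
};
```

With thresholds declared this way, k6 exits with a non-zero status when a target is missed, which lets the nightly job fail without extra scripting.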

Run nightly against staging; weekly against a perf-staging clone with prod-like data volumes.


8. Chaos Tests

Quarterly drills:

  • Kill Pub/Sub publisher → outbox grows; restore → drained within RTO.
  • Drop one Postgres replica → reads degrade gracefully via primary; alert fires.
  • Inject 500 ms latency on iam-service /users → invite acceptance falls back to pre-registration path.
  • Suspend AI orchestrator → circuit opens; invites still send; AI signals recorded as "unavailable".

9. CI Quality Gates

Every PR (must all pass):

  • Lint, typecheck, format
  • Unit + integration suites
  • Two-tenant isolation simulator
  • Outbox + inbox contract tests
  • Coverage thresholds
  • Pact provider verification
  • Event schema backward-compat
  • SAST (Semgrep + CodeQL)
  • Secret scan (gitleaks)
  • Migration dry-run on a fresh Postgres
  • License audit

Nightly:

  • E2E suite against staging
  • DAST (ZAP)
  • Performance smoke
  • Synthetic-monitor history report

Release candidate:

  • Full performance load test
  • Chaos drill rerun
  • Pen-test sign-off (per release notes)

10. Test Data & Environments

  • Local dev: docker compose up; seed via pnpm seed:local (1 super admin, 1 chain tenant + 3 properties, 1 single-property tenant). See LOCAL_DEV_SETUP.
  • CI: ephemeral containers; data wiped per test file.
  • Staging: two persistent seed tenants; reset weekly; mirror of prod schema.
  • Prod canary: two long-lived seed tenants reserved for synthetic monitors and isolation probes; they never expose customer data.

11. Flake Policy

A test that fails non-deterministically is quarantined within 24 h (skipped + logged in flake-quarantine.md) and must be either fixed within one sprint or deleted. The quarantine list is reviewed weekly; more than 5 entries blocks the next release until the list is reduced.