api-gateway (Kong) — Testing Strategy

Status: populated
Owner: TBD (Platform / SRE)
Last updated: 2026-04-17
Companion: SERVICE_OVERVIEW · API_CONTRACTS · Service Template

1. Purpose

Define how the Kong configuration and the custom plugin are tested. Because Kong is configuration rather than NestJS code, the usual "80% unit coverage" target does not apply verbatim. Instead, the targets are 100 % route coverage through config lint and integration smoke tests, plus unit tests for the custom plugin.

2. Configuration contract tests

Run in CI on every PR that touches `ops/kong/` or any upstream service's OpenAPI spec.

Check	Tool	Fail condition
decK YAML syntactic validity	`deck file validate`	Invalid YAML / schema
Every Route has an auth plugin (or `public:true` tag)	Custom script	Route without auth
Every Route path corresponds to a documented upstream OpenAPI path	Custom script comparing decK → service OpenAPI	Mismatch
Every Route/Service is tagged with `env`, `owner`	Custom script	Missing tag
No Route points to an external host	Custom script	External host
No plaintext secrets in YAML	`gitleaks` + custom regex	Secret-shaped string
Upstream host resolves in the target cluster	`nslookup` in CI (optional)	Unresolvable

These checks are the merge gate: no Kong change merges unless every row in this matrix passes.
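The "custom script" rows above could be sketched roughly as follows. This is a minimal illustration, not the actual script: it works on a decK state that has already been parsed into a Python dict, and the set of auth plugin names, the `env:` / `owner:` tag format, the `.svc.cluster.local` suffix, and the rule that a Service-level auth plugin also covers its Routes are all assumptions made for the example.

```python
# Hypothetical lint over a parsed decK state (e.g. the result of loading kong.yaml).

# Assumption: these plugin names count as "auth" for the gate.
AUTH_PLUGINS = {"jwt", "key-auth", "ghasi-api-key-lookup"}
# Assumption: tags are written as "env:<value>" and "owner:<value>".
REQUIRED_TAG_PREFIXES = ("env:", "owner:")
# Assumption: in-cluster upstreams use the Kubernetes service DNS suffix.
INTERNAL_SUFFIX = ".svc.cluster.local"


def _missing_tags(kind: str, name: str, tags: list) -> list:
    """Report each required tag prefix that is absent from `tags`."""
    return [
        f"{kind} {name}: missing {prefix[:-1]} tag"
        for prefix in REQUIRED_TAG_PREFIXES
        if not any(tag.startswith(prefix) for tag in tags)
    ]


def lint_state(state: dict) -> list:
    """Return human-readable violations; an empty list means the gate passes."""
    problems = []
    for service in state.get("services") or []:
        s_name = service.get("name", "<unnamed service>")
        problems += _missing_tags("service", s_name, service.get("tags") or [])
        if not str(service.get("host", "")).endswith(INTERNAL_SUFFIX):
            problems.append(f"service {s_name}: external host")
        service_plugins = {p.get("name") for p in service.get("plugins") or []}
        for route in service.get("routes") or []:
            r_name = route.get("name", "<unnamed route>")
            r_tags = route.get("tags") or []
            problems += _missing_tags("route", r_name, r_tags)
            route_plugins = {p.get("name") for p in route.get("plugins") or []}
            has_auth = bool((route_plugins | service_plugins) & AUTH_PLUGINS)
            if not has_auth and "public:true" not in r_tags:
                problems.append(f"route {r_name}: no auth plugin and no public:true tag")
    return problems
```

In CI, a wrapper would load the YAML, call `lint_state`, print the violations, and exit non-zero if the list is not empty.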

3. Drift detection (periodic)

A nightly CI job runs `deck gateway diff` against each live Kong (staging, prod):

- No diff → pass.
- Diff detected → raise the `KongConfigDrift` alert and block the next deploy until the drift is resolved.
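One way the nightly job could turn the diff into a pass/drift signal is sketched below. It relies on the `--non-zero-exit-code` flag of `deck gateway diff`, which, to the best of our understanding of the decK docs, exits with 0 when there is no diff, 2 when a diff is present, and 1 on error. The function names, the state file path, and the Kong admin address are hypothetical; raising the alert and blocking the deploy are left to the caller.

```python
import subprocess


def classify_diff_exit(code: int) -> str:
    """Map the exit code of `deck gateway diff --non-zero-exit-code` to a job status."""
    if code == 0:
        return "pass"    # live Kong matches the declared state
    if code == 2:
        return "drift"   # caller should raise KongConfigDrift and block deploys
    return "error"       # the diff itself failed; treat as a job failure, not as drift


def check_drift(state_file: str, kong_addr: str) -> str:
    """Run the diff against one live Kong and classify the result."""
    result = subprocess.run(
        ["deck", "gateway", "diff", state_file,
         "--kong-addr", kong_addr, "--non-zero-exit-code"],
        capture_output=True,
        text=True,
    )
    return classify_diff_exit(result.returncode)
```

Keeping "error" separate from "drift" avoids paging on a drift alert when the admin API was simply unreachable.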

4. Integration tests (staging)

End-to-end tests hitting Kong in staging, covering each route's plugin policies.

Scenario	Expected
`POST /v1/sms/send` with valid JWT	202 from `sms-orchestrator`; `X-Account-Id` forwarded
`POST /v1/sms/send` with valid API key	202; `X-Api-Key-Id` forwarded; `X-Api-Key` stripped
`POST /v1/sms/send` without auth	401 problem+json
`POST /v1/sms/send` with expired JWT	401 problem+json
Burst > per-key limit on `/v1/sms/send`	429 + `Retry-After` + rate-limit headers
`POST /v1/sms/send` with 70 KB body	413
`POST /v1/auth/login` × 6 from same IP within 1 min	First 5 requests pass; the 6th returns 429
`/admin/*` from non-allow-listed IP	403
`/unknown/path`	404 problem+json
`POST /v1/sms/send` with upstream down	503
Redis down: read-only `/v1/sms/{id}` and `POST /v1/sms/send`	`/v1/sms/{id}` returns 200 (fail-open); `/v1/sms/send` returns 503 (fail-closed)
W3C `traceparent` end-to-end	Kong + upstream spans linked in trace backend
Body logging on `/v1/sms/send`	Not present in Loki

Tests run on every PR merge and nightly.
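Several rows in the scenario table reduce to the same few assertions on a response. The helpers below are a minimal sketch of those assertions, independent of any HTTP client; the function names are hypothetical. The `application/problem+json` media type and the `RateLimit-*` / `X-RateLimit-*` header prefixes are assumptions about how the gateway is configured, and `api_key_stripped` assumes the test can observe the headers the upstream received (for example through an echo upstream).

```python
def _lower(headers: dict) -> dict:
    """HTTP header names are case-insensitive, so compare on lower-cased keys."""
    return {name.lower(): value for name, value in headers.items()}


def is_problem_response(status: int, headers: dict, expected_status: int) -> bool:
    """True if the response has the expected status and a problem+json body type."""
    content_type = _lower(headers).get("content-type", "")
    media_type = content_type.split(";")[0].strip().lower()
    return status == expected_status and media_type == "application/problem+json"


def is_rate_limited(status: int, headers: dict) -> bool:
    """True for 429 with Retry-After and at least one rate-limit header."""
    lowered = _lower(headers)
    has_limit_header = any(
        name.startswith(("ratelimit-", "x-ratelimit-")) for name in lowered
    )
    return status == 429 and "retry-after" in lowered and has_limit_header


def api_key_stripped(upstream_headers: dict) -> bool:
    """True if the upstream saw X-Api-Key-Id but not the raw X-Api-Key."""
    lowered = _lower(upstream_headers)
    return "x-api-key-id" in lowered and "x-api-key" not in lowered
```

A test for the "without auth" row would then send the request with whatever client the suite uses and assert `is_problem_response(status, headers, 401)`.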

5. Smoke tests in CI

Test	Cadence
`curl https://api.staging.ghasi.io/health` returns 200	Every deploy
Synthetic SMS send with internal key	Every 5 min (prod + staging)
JWKS fetch from Kong pod	Every deploy

6. Custom plugin unit tests

For `ghasi-api-key-lookup`:

80 %+ line coverage required (per platform testing standard).
Test cases:
- Cache hit → no auth-service call.
- Cache miss → upstream call, success path.
- Upstream 404 → reject with 401.
- Upstream 5xx → reject with 503 (do not fail-open).
- Upstream timeout → serve cached result if present; else 503.
- Header injection correctness.
- Metric emission correctness.

Tool: the standard test runner for the plugin's language (`busted` for Lua, `go test` for Go).
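The test cases above amount to a small decision table, which can be modelled independently of the plugin's language. The sketch below is such a model in Python, not the plugin itself (which would be Lua or Go); all names are hypothetical. It assumes that "cached result" in the timeout case means an expired entry still held in the cache, since a fresh entry never triggers an upstream call. Header injection and metric emission are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


class UpstreamTimeout(Exception):
    """Raised by the fetch callable when auth-service does not answer in time."""


@dataclass
class CacheEntry:
    key_id: str
    fresh: bool  # False means the TTL has expired but the entry is still held


@dataclass
class Decision:
    status: int                      # 200 means "let the request through"
    key_id: Optional[str] = None
    called_upstream: bool = False


def lookup(
    api_key: str,
    cache: Dict[str, CacheEntry],
    fetch: Callable[[str], Tuple[int, Optional[str]]],
) -> Decision:
    entry = cache.get(api_key)
    if entry is not None and entry.fresh:
        # Cache hit: no auth-service call.
        return Decision(200, entry.key_id, called_upstream=False)
    try:
        status, key_id = fetch(api_key)
    except UpstreamTimeout:
        # Timeout: serve the (stale) cached result if present, else 503.
        if entry is not None:
            return Decision(200, entry.key_id, called_upstream=True)
        return Decision(503, called_upstream=True)
    if status == 200 and key_id is not None:
        cache[api_key] = CacheEntry(key_id, fresh=True)
        return Decision(200, key_id, called_upstream=True)
    if status == 404:
        return Decision(401, called_upstream=True)  # unknown key
    return Decision(503, called_upstream=True)      # 5xx or anything else: fail closed
```

Each bullet in the list maps to one call of `lookup` with a stubbed `fetch`, which is the same shape a `busted` or `go test` suite would take.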

7. Load / soak tests

- Run k6 or vegeta scripts against staging at 2× expected peak TPS for 30 min.
- Verify that the Kong pod HPA scales out, rate-limit counters stay correct, and memory does not leak.
- Run before any major Kong version upgrade and quarterly.

8. Security tests

OWASP ZAP scan against the public edge.
HTTP request smuggling and header injection tests, driven with curl.
TLS cipher scan via testssl.sh.
Rate-limit bypass attempts (vary IP, fresh keys, rotate UA).

9. Coverage targets

Area	Target
Config lint	100 % of routes
Integration scenarios	100 % of plugin combinations in use
Custom plugin unit	≥ 80 % line coverage
Drift detection	Runs nightly, < 24 h detection

10. Open questions

Should the integration suite run against a Kong instance spun up per PR (ephemeral namespace) or against shared staging?
For contract tests against each upstream OpenAPI spec, should route stubs be generated automatically or hand-maintained?
