api-gateway (Kong) — Event Schemas

Status: populated | Owner: TBD (Platform / SRE) | Last updated: 2026-04-17 | Companion: SERVICE_OVERVIEW · OBSERVABILITY · Service Template

1. NATS events

Kong neither produces nor consumes NATS events.

All domain events remain the responsibility of upstream services:

  • sms.outbound.request — produced by sms-orchestrator
  • sms.dlr.inbound — produced by smpp-connector
  • billing.events, webhook.dispatch — produced by dlr-processor
  • auth.events — produced by auth-service

See each service's EVENT_SCHEMAS.md.

2. What Kong does emit

Kong emits three telemetry streams, none of which are domain events:

  1. Access logs → Loki (via http-log plugin).
  2. Metrics → Prometheus (via prometheus plugin).
  3. Traces → OpenTelemetry collector (via opentelemetry plugin).

3. Access log schema (Loki)

One JSON object per request, written to Loki with labels {service="kong", env="<env>", route="<route-name>"}.

| Field | Type | Notes |
|---|---|---|
| timestamp | RFC3339 | Request received |
| request_id | string | X-Request-Id (UUIDv7) |
| trace_id | string | W3C trace ID (low-cardinality label; high-cardinality in body) |
| client_ip | string | First entry in X-Forwarded-For (Cloudflare-stripped) |
| method | string | POST, GET, ... |
| path | string | Request path, query string stripped (PII risk) |
| route_name | string | Kong Route name (low-cardinality) |
| service_name | string | Kong Service name |
| account_id | string or null | From X-Account-Id (injected post-auth) |
| api_key_id | string or null | From X-Api-Key-Id; never the key itself |
| tier | string or null | `free`, … |
| status | number | HTTP status returned to client |
| latency_ms | number | Total request latency |
| upstream_latency_ms | number | Time at upstream |
| kong_latency_ms | number | Kong processing time |
| request_size_bytes | number | Request body size |
| response_size_bytes | number | Response body size |
| user_agent | string | Truncated to 256 chars |
| referer | string | Optional |
| rate_limit_remaining | number or null | From X-RateLimit-Remaining-* |
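
To make the field notes concrete, here is a minimal Python sketch of how a subset of the record could be assembled from a request. It is illustrative only: the function name `build_access_log_entry` and its parameters are hypothetical, and the real record is produced by Kong's http-log plugin, not by application code like this.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit

MAX_USER_AGENT_CHARS = 256  # "Truncated to 256 chars" in the schema


def build_access_log_entry(method, url, headers, status, latency_ms,
                           upstream_latency_ms, kong_latency_ms):
    """Build a partial access-log record following the schema notes (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise them once.
    h = {k.lower(): v for k, v in headers.items()}
    forwarded = h.get("x-forwarded-for", "")
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": h.get("x-request-id"),
        # Only the first X-Forwarded-For entry is kept as the client IP.
        # Assumption: None when the header is absent.
        "client_ip": forwarded.split(",")[0].strip() or None,
        "method": method,
        # The query string is dropped because it may carry PII.
        "path": urlsplit(url).path,
        "account_id": h.get("x-account-id"),
        # Identifier only; the API key itself is never read here.
        "api_key_id": h.get("x-api-key-id"),
        "status": status,
        "latency_ms": latency_ms,
        "upstream_latency_ms": upstream_latency_ms,
        "kong_latency_ms": kong_latency_ms,
        "user_agent": h.get("user-agent", "")[:MAX_USER_AGENT_CHARS],
    }
```

Unauthenticated requests carry no X-Account-Id or X-Api-Key-Id, so those fields come out as null, matching the "string or null" types in the table.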

Never logged:

  • Request or response bodies (PII, SMS message content).
  • Authorization header contents (at most a presence flag, if needed).
  • Full API key (only the api_key_id identifier).
  • Phone numbers, customer message text.
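
One way to guard this list in a test or log-pipeline check is an allow-list: any key outside the schema in section 3 is treated as a violation. The sketch below assumes log records are available as Python dicts; the helper name `unexpected_fields` is hypothetical.

```python
# Field names copied from the access log schema in section 3.
ALLOWED_FIELDS = frozenset({
    "timestamp", "request_id", "trace_id", "client_ip", "method", "path",
    "route_name", "service_name", "account_id", "api_key_id", "tier",
    "status", "latency_ms", "upstream_latency_ms", "kong_latency_ms",
    "request_size_bytes", "response_size_bytes", "user_agent", "referer",
    "rate_limit_remaining",
})


def unexpected_fields(entry):
    """Return, sorted, the keys of `entry` that are not in the documented schema."""
    return sorted(set(entry) - ALLOWED_FIELDS)
```

For example, `unexpected_fields({"status": 202, "authorization": "Bearer x"})` returns `["authorization"]`. An allow-list catches fields nobody thought to forbid, which a deny-list of known-sensitive names would miss. It does not inspect values, so sensitive data placed inside an allowed field would still need a separate check.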

Retention: 14 days hot in Loki, 90 days cold archive. Longer retention requires explicit business justification (audit).

4. Metric names (Prometheus)

Exposed at /metrics on the Kong admin port by the standard prometheus plugin and scraped by Prometheus.

| Metric | Type | Labels | Notes |
|---|---|---|---|
| kong_http_requests_total | counter | service, route, code, source | Request count |
| kong_http_latency_ms | histogram | service, route | End-to-end latency |
| kong_upstream_latency_ms | histogram | service, route | Upstream-only latency |
| kong_kong_latency_ms | histogram | service, route | Kong processing latency |
| kong_bandwidth_bytes | counter | service, route, direction | Ingress/egress bytes |
| kong_nginx_http_current_connections | gauge | state | Active connections |
| kong_memory_lua_shared_dict_bytes | gauge | shared_dict | Lua SHM usage |
| kong_rate_limiting_rejected_total | counter | service, route, limit_by | Ghasi-specific; emitted by custom plugin or derived from status=429 |
| ghasi_api_key_lookup_total | counter | result (hit or miss) | |
| ghasi_api_key_lookup_latency_seconds | histogram | | Custom plugin |
| kong_jwks_refresh_total | counter | issuer, result | JWT plugin JWKS refresh |

Budget: keep label cardinality low. No per-account_id labels on metrics (high cardinality); that lives in logs. Per-route and per-service labels are fine.
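
The budget can be checked mechanically against a scrape of /metrics. The following Python sketch scans Prometheus text-exposition output for forbidden label names. The function name `forbidden_labels_in` is hypothetical, and the regex is a simplification rather than a full parser: it can misread label values that contain escaped quotes.

```python
import re

# Labels the budget rules out on metrics (per-account data belongs in logs).
FORBIDDEN_LABELS = frozenset({"account_id"})

# Matches a label name immediately followed by ="
_LABEL_NAME = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="')


def forbidden_labels_in(exposition_text):
    """Map metric name -> set of forbidden label names found in the scrape text."""
    offenders = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and samples without labels.
        if not line or line.startswith("#") or "{" not in line:
            continue
        name, _, rest = line.partition("{")
        labels = set(_LABEL_NAME.findall(rest.rpartition("}")[0]))
        bad = labels & FORBIDDEN_LABELS
        if bad:
            offenders.setdefault(name, set()).update(bad)
    return offenders
```

An empty result means no sample in the scrape used a forbidden label. Run in CI against a staging scrape, a check like this could catch a per-account label before it multiplies the series count.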

5. OpenTelemetry spans

Kong's opentelemetry plugin emits one server span per request.

| Attribute | Example | Notes |
|---|---|---|
| http.method | POST | |
| http.route | /v1/sms/send | Template, not concrete path |
| http.status_code | 202 | |
| http.user_agent | curl/8.4 | Truncated |
| net.peer.ip | 203.0.113.4 | Client IP (post-Cloudflare) |
| kong.service | svc-sms-orchestrator | Kong Service name |
| kong.route | rt-sms-v1-send | Kong Route name |
| kong.consumer | csm-acct_01HW9X... | When resolved |
| ghasi.account_id | acct_01HW9X... | Custom attribute |
| ghasi.tier | pro | Custom attribute |
| traceparent | W3C | Propagated to upstream |

Kong's span is the parent of the upstream service's server span; the link is carried by W3C trace context propagation. Upstream services add their own application spans as children.
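
The parent/child link travels in the W3C `traceparent` header, whose format is `version-traceid-parentid-flags`. The Python sketch below shows the propagation step for one hop: keep the trace ID, mint a new span ID, and send that span ID upstream as the parent. The function name `next_traceparent` is hypothetical, and marking a freshly started trace as sampled (`01`) is an assumption. Kong's opentelemetry plugin does this internally; nothing here is its actual code.

```python
import re
import secrets

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def next_traceparent(incoming=None):
    """Return (header, span_id): the header to send upstream and this hop's span ID."""
    match = _TRACEPARENT.match(incoming or "")
    valid = (
        match is not None
        and match.group(1) != "ff"          # version ff is invalid per the spec
        and set(match.group(2)) != {"0"}    # all-zero trace-id is invalid
        and set(match.group(3)) != {"0"}    # all-zero parent-id is invalid
    )
    if valid:
        # Continue the caller's trace and keep its flags.
        trace_id, flags = match.group(2), match.group(4)
    else:
        # Start a new trace. Assumption: new traces are flagged as sampled.
        trace_id, flags = secrets.token_hex(16), "01"
    span_id = secrets.token_hex(8)  # 16 hex chars identifying this hop's span
    return f"00-{trace_id}-{span_id}-{flags}", span_id
```

The upstream service reads the parent-id field of the header it receives and records it as the parent of its own server span, which is what produces the hierarchy described above.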

Sampling: head-based, 10% default; 100% for 5xx responses; 100% for /v1/auth/login (security-sensitive).
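
The sampling rules can be expressed as a single decision function. The Python sketch below is an illustration, not the collector or plugin configuration: the name `should_sample` is hypothetical, and deriving the 10% decision deterministically from the trace ID is an assumption chosen so that every hop reaches the same answer for a given trace.

```python
ALWAYS_SAMPLE_PATHS = frozenset({"/v1/auth/login"})  # security-sensitive
DEFAULT_RATE = 0.10                                  # 10% default


def should_sample(trace_id, path, status=None, rate=DEFAULT_RATE):
    """Decide whether to keep a trace, following the rules stated above."""
    if path in ALWAYS_SAMPLE_PATHS:
        return True
    # 5xx responses are always kept when the status is known.
    if status is not None and 500 <= status <= 599:
        return True
    # Deterministic ratio: use the low 56 bits of the trace ID as a bucket.
    bucket = int(trace_id[-14:], 16)
    return bucket < rate * (1 << 56)
```

The status is only known once the response exists, so the 5xx rule assumes the decision can be revisited at the end of the request rather than fixed when the span starts.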

6. Retention class (per platform taxonomy)

| Stream | Class | Retention |
|---|---|---|
| Access logs | telemetry (PII-scrubbed) | 14 d hot / 90 d cold |
| Metrics | telemetry | 30 d hot / 13 mo rolled-up |
| Traces | telemetry | 7 d |

No events are persisted to PostgreSQL or NATS JetStream by Kong itself.

7. Open questions

  • Do we export access logs to a SIEM in addition to Loki (SOC/compliance)?
  • Span sampling: consider tail-based sampling via OTel collector for error-biased retention.