Operator Management Service — Service Risk Register

Status: populated
Owner: Platform Engineering
Last updated: 2026-04-18

| ID | Risk | Likelihood | Impact | Mitigation | Owner |
|---|---|---|---|---|---|
| R-OPS-01 | Vault outage during operator create leaves PG row without credentials | Low | High | Compensating delete of PG row on Vault failure; alert on OpsVaultErrors | Engineering |
| R-OPS-02 | SMPP password exposed in logs during error handling | Medium | Critical | Pino redaction on password field; CI log scanner; code review rule | Security |
| R-OPS-03 | Stale health cache causes routing to degraded operator | Medium | High | Redis TTL 60 s; smpp-connector publishes heartbeat every 10 s; worst-case 70 s stale | Engineering + SRE |
| R-OPS-04 | Legacy migration introduces duplicate operators (same host/port/systemId under different names) | Medium | Medium | Migration script dry-run mode; duplicate guard rejects; ops team reviews report | Engineering |
| R-OPS-05 | Routing rule prefix conflict not caught (concurrent create) | Low | High | Serializable PG transaction + unique index; conflict checker unit tested | Engineering |
| R-OPS-06 | Vault K8s SA token expires and is not renewed | Low | High | Vault Agent sidecar renews at 50% TTL; alert on auth failure | SRE |
| R-OPS-07 | mTLS cert expiry blocks smpp-connector credential refresh | Low | High | cert-manager auto-renews 30 days before; alert at 14 days | SRE |
| R-OPS-08 | NATS config event published but routing-engine misses it (consumer restart gap) | Low | Medium | Durable NATS consumer resumes from offset; routing-engine bootstraps from REST API on cold start | Engineering |
| R-OPS-09 | Admin with ops:admin scope creates malicious routing rule (insider threat) | Low | High | All admin actions audit-logged; anomaly detection on routing rule changes (future) | Security |
| R-OPS-10 | Vault path policy too broad (lateral access to other services' secrets) | Low | Critical | Policy scoped to secret/ops/operators/* only; Vault policy test in CI | Security |
| R-OPS-11 | Soft-delete bypassed by direct PG write | Low | High | No service has PG write access to ops schema except OMS; DB user policy enforced | Security + DBA |
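
The duplicate guard named under R-OPS-04 could look roughly like the sketch below: a dry-run pass that groups legacy operators by host, port, and systemId and reports any group with more than one name, without writing anything. The `LegacyOperator` and `DuplicateGroup` types, the field names, and the `findDuplicates` function are hypothetical and chosen for illustration; the actual migration script may key or normalize records differently.

```typescript
// Hypothetical shape of a legacy operator record (field names are assumptions).
interface LegacyOperator {
  name: string;
  host: string;
  port: number;
  systemId: string;
}

// One entry in the dry-run report: an identity key and every name that maps to it.
interface DuplicateGroup {
  key: string;
  names: string[];
}

// Identity per R-OPS-04: operators with the same host/port/systemId are the
// same operator, whatever they are called. JSON encoding avoids delimiter
// collisions between the three fields.
function identityKey(op: LegacyOperator): string {
  return JSON.stringify([op.host, op.port, op.systemId]);
}

// Dry-run duplicate check: reads the input, returns a report, mutates nothing.
function findDuplicates(ops: LegacyOperator[]): DuplicateGroup[] {
  const byKey: Record<string, string[]> = {};
  for (const op of ops) {
    const key = identityKey(op);
    const names = byKey[key] ?? [];
    names.push(op.name);
    byKey[key] = names;
  }
  return Object.keys(byKey)
    .filter((key) => byKey[key].length > 1)
    .map((key) => ({ key, names: byKey[key] }));
}

// Example input (made-up data): two names point at the same endpoint.
const sampleReport = findDuplicates([
  { name: "carrier-a", host: "smpp.example.net", port: 2775, systemId: "acct1" },
  { name: "carrier-a-old", host: "smpp.example.net", port: 2775, systemId: "acct1" },
  { name: "carrier-b", host: "smpp.example.net", port: 2776, systemId: "acct1" },
]);
console.log(JSON.stringify(sampleReport, null, 2));
```

A non-empty report would be the artifact the ops team reviews before the migration runs for real; the same key function could back the guard that rejects duplicates at write time.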