cdr-mediation-service — Migration Plan
Version: 1.0 Status: Draft Owner: Commerce + Regulator Liaison + Platform Engineering Last Updated: 2026-04-21 References: SERVICE_OVERVIEW.md, _report.md, SERVICE_READINESS.md
The service is greenfield. Migration focuses on (1) ATRA regulator handshake, (2) schema validation dry-run, (3) bootstrap of archive pipeline with object-lock S3 buckets, and (4) enabling daily exports only after confidence is established.
1. What Is Migrating
| Input | Source | Volume | Notes |
|---|---|---|---|
| DLR events (ongoing) | sms.dlr.inbound NATS stream | ~100 M events/month at steady state | Primary ingest |
| Compliance audit events | compliance.audit.v1 | ~10 M/month | Compliance-context on CDR |
| Billing events | billing.events.v1 | Per-message basis | Billing cross-reference |
| ATRA schema | Regulator Liaison engagement | ~1 schema doc | TAP 3.12 + any Afghan variants |
| SFTP credentials for ATRA drop-box | MoU exchange | 1 keypair per destination | Vault-stored |
| Initial 30-d retrospective | Platform logs | Bootstrap dataset | For seed archive |
2. Migration Phases
Phase 0 — Pre-migration engagement (30 days)
| Step | Owner | Output |
|---|---|---|
| ATRA MoU for CDR submission | Regulator Liaison + Legal | Signed MoU |
| ATRA schema-dry-run: exchange 7 days of sample CDRs in proposed format | Regulator Liaison + Engineering | Schema approval / feedback |
| SFTP credentials exchanged | SRE + Regulator Liaison | Keys in Vault |
| HSM provisioning + key generation for export signing | Security + SRE | Key in HSM; dual-control backup |
| S3 buckets (hot + cold) created with object-lock + cross-region replication | SRE | Buckets operational |
| ClickHouse cluster provisioned | Data Eng | CDR schema deployed |
| 3 Deployments (ingest + batch + exporter) deployed to staging | SRE | Staging healthy |
| Adapter implementations for ATRA schema variants complete | Engineering | Tests pass against ATRA staging |
Phase 1 — Shadow (30 days)
| Step | Owner | Output |
|---|---|---|
| Ingest NATS streams; generate CDRs; retain in hot tier | Service | CDR volume growing |
| Hourly rollups active | Service | Aggregates populated |
| Daily ClickHouse sync active | Service | Analytics working |
| Hash-chain verifier running daily | Service | Clean-run log |
| cdr.* events published (except cdr.exported.v1) | Service | Regulator-portal can query status |
| Feature flag CDR_EXPORT_ENABLED=false | SRE | No ATRA delivery yet |
Exit criteria. CDR ingest lag P99 ≤ 30 s for 14 consecutive days; chain verifier 100% clean; ClickHouse lag < 10 min; volume matches forecast.
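The Phase 1 exit criteria lend themselves to a mechanical check. The following is a minimal sketch, not the service's implementation: `DailySample` and `phase1_exit_ready` are hypothetical names, the metrics are assumed to arrive as one record per day (for example, exported from Prometheus), all four criteria are evaluated over the same 14-day window, and "volume matches forecast" is modelled as a ±10% tolerance, which the plan does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailySample:
    """One day of Phase 1 shadow metrics (hypothetical record shape)."""
    ingest_lag_p99_s: float    # CDR ingest lag P99, in seconds
    chain_breaks: int          # breaks reported by the hash-chain verifier
    clickhouse_lag_min: float  # ClickHouse sync lag, in minutes
    volume: int                # CDRs generated that day
    forecast_volume: int       # forecast CDR count for that day


def day_passes(s: DailySample, volume_tolerance: float = 0.10) -> bool:
    """Apply the four exit criteria to a single day.

    The 10% volume tolerance is an assumption, not a value from the plan.
    """
    within_forecast = (
        abs(s.volume - s.forecast_volume) <= volume_tolerance * s.forecast_volume
    )
    return (
        s.ingest_lag_p99_s <= 30.0
        and s.chain_breaks == 0
        and s.clickhouse_lag_min < 10.0
        and within_forecast
    )


def phase1_exit_ready(samples: list[DailySample], required_days: int = 14) -> bool:
    """True when the most recent `required_days` samples all pass."""
    if len(samples) < required_days:
        return False
    return all(day_passes(s) for s in samples[-required_days:])
```

A single failing day inside the window resets the count, which matches the "consecutive days" wording of the criterion.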
Phase 2 — Export Live (30 days)
| Step | Owner | Output |
|---|---|---|
| CDR_EXPORT_ENABLED=true for ATRA SFTP destination | SRE | Daily exports begin |
| First live export delivered + ATRA ACK received | Service + Regulator Liaison | Confirmation logged |
| Monitoring: export ACK SLA (100% within 36 h) | SRE | SLO attainment |
| Weekly regulator call to verify data quality | Regulator Liaison | Issue tracker |
| Chain verifier continues daily | Service | Continued clean |
Exit criteria. 14 consecutive days of ATRA ACKs within 36 h; zero rejections; data quality sign-off from ATRA.
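The ACK criterion can be tracked from the per-export log. The sketch below assumes a hypothetical `ExportRecord` shape with a delivery timestamp, an optional ACK timestamp, and a rejection marker; none of these field names come from the service itself. It counts the trailing run of exports that were acknowledged within 36 h and not rejected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

ACK_SLA = timedelta(hours=36)


@dataclass(frozen=True)
class ExportRecord:
    """One daily export as it might appear in the per-export log (assumed shape)."""
    delivered_at: datetime
    acked_at: Optional[datetime]  # None while no ACK has arrived
    rejected: bool = False


def ack_within_sla(rec: ExportRecord) -> bool:
    """An export counts only if it was ACKed, not rejected, and within 36 h."""
    return (
        rec.acked_at is not None
        and not rec.rejected
        and rec.acked_at - rec.delivered_at <= ACK_SLA
    )


def consecutive_sla_days(records: list[ExportRecord]) -> int:
    """Count the trailing run of exports (oldest first) that met the ACK SLA."""
    streak = 0
    for rec in reversed(records):
        if not ack_within_sla(rec):
            break
        streak += 1
    return streak
```

With one export per day, Phase 2 is ready to exit on this criterion once `consecutive_sla_days(...)` reaches 14; the ATRA data-quality sign-off remains a manual step.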
Phase 3 — Full Production (ongoing)
| Step | Owner | Output |
|---|---|---|
| Adjustment (VOID/CORRECT) enabled | Commerce + Finance | Admin workflow live |
| Tenant-facing CDR queries (via analytics-service) | Product | Self-serve analytics |
| Revenue-assurance reconciliation with billing (EP-BILL-09) live | Commerce + Finance | Leakage alerts |
| Cold-tier restore drill quarterly | SRE | Verified recovery |
| ATRA partnership ongoing | Regulator Liaison | Quarterly review |
Rollback flags.
- CDR_EXPORT_ENABLED: daily export on/off.
- CDR_ADJUSTMENT_ENABLED: adjustment workflow on/off.
- CDR_CHAIN_VERIFY_FAIL_FAST: verifier halts on first break (prod: continue + report).
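This plan does not say how the flags are delivered to the service. As an illustration only, the sketch below assumes they are read from environment variables; `RollbackFlags` and `load_flags` are hypothetical names. Every flag defaults to off: export is off by default per the Phase 1 rollback note, fail-fast is off to match the production behaviour (continue + report), and the off default for adjustments is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Parse a boolean flag; unset means `default`, unknown values mean False."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE


@dataclass(frozen=True)
class RollbackFlags:
    export_enabled: bool          # CDR_EXPORT_ENABLED
    adjustment_enabled: bool      # CDR_ADJUSTMENT_ENABLED
    chain_verify_fail_fast: bool  # CDR_CHAIN_VERIFY_FAIL_FAST


def load_flags(env: Mapping[str, str] = os.environ) -> RollbackFlags:
    """Read the three rollback flags, defaulting each one to off."""
    return RollbackFlags(
        export_enabled=_flag(env, "CDR_EXPORT_ENABLED", False),
        adjustment_enabled=_flag(env, "CDR_ADJUSTMENT_ENABLED", False),
        chain_verify_fail_fast=_flag(env, "CDR_CHAIN_VERIFY_FAIL_FAST", False),
    )
```

Treating unrecognised values as off is a deliberate choice in this sketch: a typo in a flag value then fails closed rather than starting regulator submissions by accident.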
3. ATRA Handshake (Phase 0 detail)
3.1 Schema dry-run exchange
- Ghasi generates 7 days of retrospective CDRs in proposed TAP 3.12 format.
- Ghasi signs with HSM; delivers via SFTP to ATRA staging.
- ATRA team parses + validates; returns feedback within 14 d.
- Ghasi addresses any schema issues + re-submits if needed.
- ATRA formally approves schema — this becomes the contracted schema in MoU.
3.2 SFTP exchange
- Ghasi generates SSH keypair (Ed25519); private key in Vault.
- Ghasi public key shared with ATRA.
- ATRA SFTP drop-box created; Ghasi gets upload path.
- Test upload + retrieval confirmed by both sides.
- Rotation policy: annual; 30-day overlap during rotation.
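The rotation policy implies a window in which both the outgoing and the successor key are accepted. The sketch below works through the dates under one possible reading, namely that the 30-day overlap ends on the anniversary of key issuance; the plan does not say whether the overlap precedes or follows the anniversary. `RotationWindow`, `next_rotation`, and `in_overlap` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta

ROTATION_PERIOD = timedelta(days=365)  # "annual", approximated as 365 days
OVERLAP = timedelta(days=30)


@dataclass(frozen=True)
class RotationWindow:
    new_key_valid_from: date  # successor public key shared with ATRA and accepted
    old_key_retired_on: date  # last day the outgoing key is accepted


def next_rotation(issued_on: date) -> RotationWindow:
    """Compute the overlap window for a key issued on `issued_on`.

    Assumes the overlap ends at the anniversary (an interpretation,
    not something the plan states).
    """
    cutover = issued_on + ROTATION_PERIOD
    return RotationWindow(
        new_key_valid_from=cutover - OVERLAP,
        old_key_retired_on=cutover,
    )


def in_overlap(today: date, issued_on: date) -> bool:
    """True while both the outgoing and the successor key should be accepted."""
    window = next_rotation(issued_on)
    return window.new_key_valid_from <= today <= window.old_key_retired_on
```

For a key issued on 2026-04-21, this reading puts the successor key in service on 2027-03-22 and retires the old key on 2027-04-21.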
3.3 HTTPS alternative (future)
- If ATRA offers HTTPS endpoint, same flow with mTLS client cert.
- Adapter supports either per destination.
- Fall-back to SFTP if HTTPS fails.
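The fall-back behaviour can be expressed as an ordered list of transports per destination. This is a sketch under assumptions: `Transport`, `DeliveryError`, and `deliver` are hypothetical names, and each transport is modelled as a callable that uploads one signed export file and raises on failure. Real SFTP or mTLS client code is deliberately left out.

```python
from typing import Callable, Sequence

# A transport uploads one signed export file and raises on failure (assumed contract).
Transport = Callable[[bytes], None]


class DeliveryError(Exception):
    """Raised when every configured transport for a destination fails."""


def deliver(payload: bytes, transports: Sequence[tuple[str, Transport]]) -> str:
    """Try transports in order; return the name of the first that succeeds."""
    errors = []
    for name, send in transports:
        try:
            send(payload)
            return name
        except Exception as exc:  # a real adapter would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise DeliveryError("; ".join(errors))
```

A destination configured as `[("https", https_send), ("sftp", sftp_send)]` would then prefer HTTPS and fall back to SFTP, and the returned name can be recorded in the per-export log.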
4. Bootstrap Retrospective CDR
Pre-launch one-shot:
- Extract 30 days of sms.dlr.inbound from NATS archive.
- Run through CDR encoder in batch mode (not real-time).
- Hash-chain the bootstrap rows into a genesis partition.
- Archive to S3 cold tier.
- Notify ATRA that the 30 days of historical CDRs are available on request (a request is not typically expected, since ATRA expects forward-only submission).
Bootstrap is audit-tagged source=BOOTSTRAP_RETROSPECTIVE_30D.
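To illustrate the hash-chaining step, the sketch below links each bootstrap row to its predecessor and applies the audit tag. The hashing scheme is an assumption made for illustration (SHA-256 over the previous hash plus a canonical JSON encoding of the row, with an all-zero sentinel before the first row); the service's actual chain format is not described in this plan, and `chain_rows` is a hypothetical name.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed sentinel preceding the first row of the partition
SOURCE_TAG = "BOOTSTRAP_RETROSPECTIVE_30D"


def chain_rows(rows: list[dict]) -> list[dict]:
    """Tag each row with the bootstrap source and link it to its predecessor."""
    chained = []
    prev = GENESIS_PREV
    for row in rows:
        body = dict(row, source=SOURCE_TAG)
        # Canonical encoding so the same row always hashes to the same value.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
        digest = hashlib.sha256(prev.encode() + canonical).hexdigest()
        chained.append({**body, "prev_hash": prev, "hash": digest})
        prev = digest
    return chained
```

Because every hash covers the previous one, changing any bootstrap row changes the hash of every row after it, which is what lets the daily verifier detect tampering.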
5. Downstream Consumer Migration
| Consumer | Change | Timing |
|---|---|---|
| regulator-portal-service | Consume cdr.exported.v1 for submission-status panel | Phase 1 |
| billing-service | Revenue-assurance reconciliation via cdr.generated.v1 | Phase 2 |
| analytics-service | ClickHouse CDR mirror for long-range queries | Phase 1 |
| admin-dashboard | CDR admin UI (list, rollup status, export status, adjustments) | Phase 1 |
6. Success Metrics for Migration
| Metric | Target | Measurement |
|---|---|---|
| Phase 0 ATRA MoU signed | Yes | Contract |
| Phase 0 schema dry-run approved | Yes | ATRA feedback |
| Phase 1 ingest lag P99 | ≤ 30 s | Prometheus |
| Phase 1 chain-verifier breaks | 0 | Daily |
| Phase 2 export ACK rate | 100% within 36 h | Per-export log |
| Phase 3 adjustment rate | < 2% of original CDRs | Monthly |
| Phase 3 rev-assurance discrepancy | < 0.1% | Daily reconciliation |
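The two Phase 3 targets are simple ratios. The sketch below shows one way to compute them; the discrepancy definition (absolute difference between CDR count and billed count, relative to the billed count) is an assumption, since the plan does not define it, and the function names are hypothetical.

```python
def adjustment_rate(adjusted: int, original: int) -> float:
    """Share of original CDRs that received a VOID/CORRECT adjustment."""
    return adjusted / original if original else 0.0


def rev_assurance_discrepancy(cdr_count: int, billed_count: int) -> float:
    """Relative gap between CDR and billing counts (assumed definition)."""
    if billed_count == 0:
        return 0.0 if cdr_count == 0 else 1.0
    return abs(cdr_count - billed_count) / billed_count


def phase3_targets_met(
    adjusted: int, original: int, cdr_count: int, billed_count: int
) -> bool:
    """Check the < 2% adjustment rate and < 0.1% discrepancy targets."""
    return (
        adjustment_rate(adjusted, original) < 0.02
        and rev_assurance_discrepancy(cdr_count, billed_count) < 0.001
    )
```

At the stated steady-state volume of about 100 M events per month, the adjustment target allows fewer than 2 M adjusted CDRs per month.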
7. Rollback Plan
7.1 During Phase 1 (Shadow)
- No rollback needed. Export stays off by default.
7.2 During Phase 2 (Export Live)
- CDR_EXPORT_ENABLED=false stops new ATRA submissions.
- In-flight exports complete their retry cycle.
- Regulator Liaison notified immediately.
7.3 During Phase 3 (Full Production)
- CDR_ADJUSTMENT_ENABLED=false stops new adjustments; existing persist.
- Export continues or pauses per Phase 2 rollback.
7.4 Catastrophic (chain break detected)
- Quarantine affected partition.
- Notify regulator within 24 h.
- Investigate root cause.
- Resume exports with new chain partition + audit row documenting the incident.
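Detecting which rows to quarantine requires re-walking the chain. The following sketch assumes the same illustrative scheme as a simple SHA-256 hash chain (each row stores `prev_hash` and `hash`, where `hash` covers the previous hash plus a canonical JSON encoding of the row body); the real chain format is not specified in this plan. The `fail_fast` parameter mirrors the intent of CDR_CHAIN_VERIFY_FAIL_FAST, and `row_hash` and `find_breaks` are hypothetical names.

```python
import hashlib
import json


def row_hash(prev_hash: str, body: dict) -> str:
    """Hash of a row body linked to its predecessor (assumed scheme)."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(prev_hash.encode() + canonical).hexdigest()


def find_breaks(
    rows: list[dict], genesis_prev: str = "0" * 64, fail_fast: bool = False
) -> list[int]:
    """Return the indexes of rows whose link or stored hash does not verify.

    With fail_fast=True the walk stops at the first break; otherwise it
    continues and reports every break it finds.
    """
    breaks = []
    expected_prev = genesis_prev
    for i, row in enumerate(rows):
        body = {k: v for k, v in row.items() if k not in ("prev_hash", "hash")}
        link_ok = row["prev_hash"] == expected_prev
        hash_ok = row["hash"] == row_hash(row["prev_hash"], body)
        if not (link_ok and hash_ok):
            breaks.append(i)
            if fail_fast:
                break
        # Follow the stored hash so later rows are judged against what was archived.
        expected_prev = row["hash"]
    return breaks
```

The returned indexes identify the rows to quarantine and to cite in the audit row that documents the incident.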
8. Dependencies
- ATRA MoU (blocker for Phase 2).
- HSM operational (blocker for export).
- S3 with object-lock (blocker for archive).
- ClickHouse cluster (blocker for analytics mirror).
- sms.dlr.inbound NATS stream operational (blocker for ingest).
- billing-service EP-BILL-09 (blocker for revenue-assurance, Phase 3).
- regulator-portal-service EP-REG-01 (blocker for regulator view, Phase 1).
9. Post-Launch Refinement
Within 90 days of Phase 3:
- Regulator feedback loop: quarterly data-quality review with ATRA.
- Tune rollup windows based on observed query patterns.
- Optimise S3 archive granularity (hourly files) based on restore frequency.
- ClickHouse query SLA refinement per regulator-portal long-range queries.
- Adjustment playbook refinement based on real-world cases.