Directory Modernization
Zach Cardoza
Tulare, CA
At Optum I led the team that rebuilt the provider data pipeline. The old one was a weekly file-based batch. The new one is a continuous Kafka stream. It processes 3.2 million provider updates a day, cut update latency from two weeks down to four hours, and serves provider search to about 50 million members across 1.7 million providers. We landed it a week ahead of schedule, with zero downtime, and it saves the company roughly $1.2 million a month in processing costs.
- Role: Engineering Manager
- Employer: Optum (UnitedHealth Group)
- Dates: 2023 - 2024
- Team size: 7 to 20 engineers (core team plus offshore for phase 2)
- Scale:
  - 3.2M provider record updates per day
  - 1.7M providers searchable
  - 50M+ members served
- Outcomes:
  - Latency dropped from two weeks to four hours
  - Delivered one week ahead of schedule, zero downtime
  - $1.2M per month in processing savings
- Tech: Scala, Kafka, AWS, Elasticsearch, PostgreSQL, Kubernetes
The legacy pipeline was a weekly FTP-and-flat-file batch that could leave provider data up to two weeks stale. The new architecture is a continuous Kafka-driven event stream that processes updates as they arrive.
We ran the migration in two phases. Phase one was the ENI population (Individual and Family, Employer-Individual), roughly a third of the provider universe. It ran with zero downtime: the new pipeline executed in parallel with the legacy one in production, and we diffed the outputs to confirm equal-or-better data quality. Phase two covered the remaining 70% of the population in three months, accommodating per-population credential listing rules, Medicare- and Medicaid-specific data requirements, and the long tail of provider data normalization. We negotiated 13 additional offshore engineers to make the phase-two timeline achievable.
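To illustrate the output diffing used in phase one, here is a minimal sketch in Python. It is not the production code, which was written in Scala. The record shape, the `npi` key, the sample data, and the function name `diff_outputs` are all assumptions made for illustration.

```python
from typing import Dict, List

Record = Dict[str, str]


def diff_outputs(legacy: List[Record], streamed: List[Record], key: str = "npi") -> Dict[str, list]:
    """Compare two pipeline outputs, matching records on a provider identifier.

    `key` is a hypothetical field name; any stable unique identifier would do.
    """
    legacy_by_key = {r[key]: r for r in legacy}
    streamed_by_key = {r[key]: r for r in streamed}

    # Records present in one output but not the other.
    missing = sorted(legacy_by_key.keys() - streamed_by_key.keys())
    extra = sorted(streamed_by_key.keys() - legacy_by_key.keys())

    # Records present in both outputs whose field values differ.
    changed = []
    for k in sorted(legacy_by_key.keys() & streamed_by_key.keys()):
        old, new = legacy_by_key[k], streamed_by_key[k]
        fields = sorted(f for f in old.keys() | new.keys() if old.get(f) != new.get(f))
        if fields:
            changed.append((k, fields))

    return {"missing": missing, "extra": extra, "changed": changed}


# Made-up sample records, for illustration only.
legacy_out = [
    {"npi": "1000000001", "name": "A. Rivera", "phone": "555-0100"},
    {"npi": "1000000002", "name": "B. Chen", "phone": "555-0101"},
]
streamed_out = [
    {"npi": "1000000001", "name": "A. Rivera", "phone": "555-0199"},
    {"npi": "1000000003", "name": "C. Okafor", "phone": "555-0102"},
]

report = diff_outputs(legacy_out, streamed_out)
print(report)
```

A diff like this only surfaces differences. Deciding whether a changed field counts as "equal or better" still needs a per-field rule or a human review.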
The hardest part was not the architecture; it was the data. We spent six months negotiating the long tail of edge cases in provider records across multiple acquired companies. The result is the highest-quality provider directory in the network.