Position: Principal Associate, Machine Learning (Credit Risk & Management)
- Selected for the Discover integration effort, supporting migration of 33M existing accounts onto Capital One’s platform and reporting status directly to senior leadership.
- Owned the Execution Monitoring write-up for the Customer Management Risk Model whitepaper and delivered an anomaly-feature analysis platform that shortened abnormal-pattern detection from a manual review process to minutes, enabling launch readiness for the Proactive Credit Line Increase program for already-onboarded Discover customers.
- Built tri-bureau feature datasets to enable historical scoring and low-volume model tests across multiple acquisition segments (Upmarket, Mainstreet, Student/No Credit), accelerating data science (DS) iteration cycles.
- Owned upgrades and maintenance of Kubeflow pipelines supporting 70M+ accounts for a transaction underwriting credit-risk model (Thunder), improving reliability and scalability.
- Optimized Apache Spark configurations to expand ETL backfill capacity from 1 day to 6 months per run across 3 years of retro-scoring, supporting a v2 market launch.
- Stabilized multiple pipeline stages (root-caused failure modes, improved retries/validation), saving ~10 hours/week for DS partners and improving run success rate.
- Supported in-market model operations for No Preset Spending Limit (NPSL) through Kubeflow + Spark performance optimizations.
- Served as first-line owner for daily underwriting ETL/pipeline failures: triaged incidents, restored runs, and partnered with platform/DS teams to unblock experiments quickly.
- Co-led Testing Education & Culture (Code Excellence Initiative): reviewed/piloted curriculum for pytest + property-based testing, increasing testing adoption across teams.
- Mentored 3 incoming employees through CODA, improving onboarding and early productivity.
- Recognized for rapid triage of unannounced upstream breaking changes, preventing significant production downtime.
Tech: Snowflake, Databricks, SQL, AWS, New Relic, Python, Jenkins, Kubeflow, Splunk, Apache Spark