Aug 2018 – Jun 2019

Data Engineer

Highlights

Reduced Spark compute costs by 60% by migrating data and predictive analytics workloads from managed EC2 to EMR.
Integrated three new data sources (SQL Server, Splunk, and internal APIs) into the existing Spark pipeline.
Migrated version control from Perforce to Git-based Bitbucket.

Python, Spark, AWS EMR, SQL Server, Splunk