Sean Zhang
Senior Data Engineer & Analytics Manager
Specializing in cloud data platforms (GCP, AWS), security analytics, large-scale ETL, and GenAI-driven automation. Built multi-domain security analytics, engineered pipelines that reduce manual effort, and delivered ML-driven detection improvements that enhance enterprise risk visibility.
// Certifications
// Technical Expertise
Cloud Platforms
Data Engineering & Databases
Machine Learning & Security Analytics
BI & Visualization
GenAI & Automation
Leadership & Collaboration
// Professional Experience
Business Analytics Manager
PNC Bank Feb 2025 - Present
- Lead an 8-member engineering and analytics team supporting multiple security domains (WIAM, DLP, PGA, ASM, Digital Identity, ITH), delivering automated workflows and real-time reporting that increased operational visibility and reduced reporting cycles from days to minutes
- Owned automated budgeting workflows for the Information Security organization, managing $200M+ across cost centers and reducing manual deployment and audit rework by ~20% through automated variance tracking and forecasting
- Integrated Microsoft Copilot into Python ETL and BI workflows (Tableau, Power BI), automating code generation and anomaly detection while improving engineering velocity 2× and reducing manual coding effort by 40% for security analytics deliverables
Business Analytics Lead
PNC Bank Apr 2023 - Jan 2025
- Owned end-to-end workflow from data engineering to Tableau dashboards, delivering 30+ real-time KRI metrics, automated controls, and issue management for CIO and board-level reporting
- Engineered ETL pipelines across 10+ security and IT systems (AD/OUD, ServiceNow, Tableau, Tenable, Archer) using Python/PySpark/SQL, processing 50M+ monthly records at 99.9% reliability to support real-time operations
- Led automation initiatives that saved $1M+ and 10,000+ hours annually, increasing operational efficiency 3–10× on key programs
- Supported ML models for individual risk scoring, insider fraud, and application risk, reducing false positives by 40%+ and shortening investigation cycles by 60%, improving analyst throughput and reducing investigative workload
- Maintained data platform infrastructure including Jenkins CI/CD, server management, and sensitive data management (GLBA, PII, PCI, HIPAA)
Advisor, Data Management and Governance
Cardinal Health Apr 2022 - Feb 2023
- Built and optimized ETL pipelines integrating SQL Server, GCP, Workday, ServiceNow, and third-party APIs using SQL, Alteryx, Python, Dataflow, Cloud Storage, and BigQuery
- Led migration of 50+ ETL workflows from SQL Server to BigQuery, reducing query latency by ~70% and supporting 2–3× larger HR data volumes without performance degradation
- Administered Tableau Server for 900+ HR stakeholders and 10+ C-level leaders, optimizing dashboard performance, strengthening governance, and reducing reporting backlog by 35%
- Developed Python/R predictive models for turnover risk, internal movement, and workforce planning, deployed into 10+ C-level reporting workflows
Data Specialist
St. Luke's University Health Network Jul 2021 - Mar 2022
- Built and maintained 40+ SQL/Tableau assets for ED, OR, and Administration from hybrid clinical warehouse sources, supporting daily clinical operations and utilization analytics
- Integrated EPIC Cogito/Caboodle/Clarity and SAP into unified Tableau pipelines, improving data consistency and reducing refresh failures by ~40%
- Engineered Twilio-based call-center data pipelines and reporting for ~20K calls/day with 100+ configurations, improving QA workflows and operational oversight
// Featured Projects
Developed a comprehensive analytics platform providing 30+ real-time KRI metrics for information security, integrating data from 10+ APIs with automated ETL pipelines and executive dashboards.
Led migration of 50+ ETL workflows from GCP SQL instances to BigQuery, doubling read/write speed and enabling scalability for 300+ stakeholders across HR, Finance, and executive leadership.
// Education & Certifications
Master of Science in Business Analytics
Washington University in St. Louis January 2021
GPA: 3.97/4.0
Honors: Knight Scholar (Top 1%), Beta Gamma Sigma
Coursework: Machine Learning and Statistical Modeling
Bachelor of Marketing
Xiamen University June 2019
GPA: 3.61/4.0
Coursework: Software Engineering, Marketing
Exchange: McGill University, Montreal, Canada with full scholarship (Top 1%)