Large-Scale Pipelines, Canonical Data Modeling, ML Risk Signals
Sean Zhang
Security Data Architect & Analytics Lead
Security data architect specializing in enterprise security decision platforms, governed data architectures, and large-scale ingestion pipelines. Experienced in building canonical security data models, audit-ready decision logic, and automation frameworks that translate metrics into controlled actions.
// Certifications
// Technical Expertise
Cloud Platforms
Data Engineering & Databases
Machine Learning & Security Analytics
BI & Visualization
GenAI & Automation
Leadership & Collaboration
// Professional Experience
Data Architect (Security Decision Platform)
PNC Bank Apr 2023 - Present
- Architected and owned an enterprise Security Decision Platform (SPOG) across 14 security and risk domains within Enterprise Information Security (WIAM, AppSec, ASM, Certificates, Data Protection, Third-Party Risk, Physical Security)
- Unified dozens of heterogeneous upstream systems into a governed security data lake and canonical decision layer, enabling end-to-end workforce and application risk scoring through cross-domain analytics
- Defined canonical security entities and cross-domain linkage (identity, assets, vulnerabilities, access, controls) enabling analytics not achievable inside individual systems
- Authored platform standards: data contracts, schema/versioning rules, audit-grade retention, deterministic recomputation, and controlled backfills to support regulatory and audit requirements
- Designed ingestion and serving patterns (batch + incremental + event-driven) balancing reliability, change control, and downstream compatibility
- Led platform evolution through migration initiatives introducing S3 landing/curation patterns and governed consumption in Redshift while maintaining continuity for mission-critical downstream consumers
Advisor, Data Management and Governance
Cardinal Health Apr 2022 - Feb 2023
- Built and optimized ETL pipelines integrating SQL Server, GCP, Workday, ServiceNow, and third-party APIs using SQL, Alteryx, Python, Dataflow, Cloud Storage, and BigQuery
- Led migration of 50+ ETL workflows from SQL Server to BigQuery, reducing query latency by ~70% and supporting 2–3× larger HR data volumes without performance degradation
- Administered Tableau Server for 900+ HR stakeholders and 10+ C-level leaders, optimizing dashboard performance, strengthening governance, and reducing reporting backlog by 35%
- Developed Python/R predictive models for turnover risk, internal movement, and workforce planning, deployed into 10+ C-level reporting workflows
Data Specialist
St. Luke's University Health Network Jul 2021 - Mar 2022
- Built and maintained 40+ SQL/Tableau assets for ED, OR, and Administration from hybrid clinical warehouse sources, supporting daily clinical operations and utilization analytics
- Integrated EPIC Cogito/Caboodle/Clarity and SAP into unified Tableau pipelines, improving data consistency and reducing refresh failures by ~40%
- Engineered Twilio-based call-center data pipelines and reporting for ~20K calls/day with 100+ configurations, improving QA workflows and operational oversight
// Featured Projects
Architected enterprise Security Decision Platform (SPOG) across 14 security and risk domains, unifying heterogeneous upstream systems into a governed security data lake with canonical data models enabling cross-domain workforce and application risk scoring.
Led migration of 50+ ETL workflows from GCP SQL instances to BigQuery, doubling read/write speed and enabling scalability for 300+ stakeholders across HR, Finance and Executives.
// Education & Certifications
Master of Science in Business Analytics
Washington University in St. Louis January 2021
GPA: 3.97/4.0
Honors: Knight Scholar (Top 1%), Beta Gamma Sigma
Coursework: Machine Learning and Statistical Modeling
Bachelor of Marketing
Xiamen University June 2019
GPA: 3.61/4.0
Coursework: Software Engineering, Marketing
Exchange: McGill University, Montreal, Canada, on a full scholarship (Top 1%)