Collective Health is transforming how employers and their people engage with their health benefits by seamlessly integrating cutting-edge technology, compassionate service, and world-class user experience design.
The Lead Data Engineer will drive the development of robust, scalable, and efficient data solutions. The role involves collaborating closely with cross-functional teams, providing thought leadership on data architecture, mentoring junior engineers, and optimizing the data ecosystem for performance and reliability.
Requirements
- Expertise in building scalable ETL pipelines with Spark (PySpark or Scala) and SQL.
- Deep understanding of data architecture, schema design, and dimensional modeling for analytics and machine learning.
- Proficiency with distributed data platforms such as Spark, Databricks, or Snowflake.
- Experience with event-driven architectures and streaming platforms like Kafka or Kinesis.
- Security-first mindset – familiarity with data privacy, encryption, and compliance in healthcare or other regulated industries is a plus.
Responsibilities
- Architect Scalable Data Solutions: Design, develop, and optimize large-scale data pipelines using Spark (PySpark, Scala), Databricks, and distributed data processing frameworks.
- Advance Data Modeling & Architecture: Lead the design and evolution of data models to support analytical, operational, and machine-learning requirements.
- Enhance Data Performance & Reliability: Improve data processing performance, scalability, and reliability while ensuring data quality and governance.
- Drive Cross-Functional Collaboration: Partner with Product, Engineering, Data Science, and Analytics teams to deliver high-impact data solutions that generate actionable business and clinical insights.
- Mentor & Provide Technical Leadership: Guide junior and mid-level engineers, conduct code reviews, and establish best practices in data engineering.
- Ensure Data Governance & Security: Implement robust security, privacy, and compliance measures for sensitive healthcare data, ensuring adherence to industry regulations.
- Influence Data Strategy: Provide input on data infrastructure decisions, emerging technologies, and process improvements.
Other
- 8+ years of data engineering experience in fast-paced, data-driven environments.
- Excellent communication skills – ability to collaborate cross-functionally and translate complex technical concepts into business impact.
- Mentorship experience – a track record of guiding engineers and fostering a collaborative, inclusive team culture.
- This is a hybrid position based out of one of our offices: San Francisco, CA; Plano, TX; or Lehi, UT. Hybrid employees are expected to be in the office two days per week.