CVS Health is working to transform healthcare by building a world of health around every consumer, which requires scalable data pipelines to support its operations.
Requirements
- Strong proficiency in Google Cloud Platform (GCP) services, particularly BigQuery, Cloud Storage, Cloud Composer (Airflow), and Dataproc (a minimal Composer pipeline is sketched after this list).
- Solid experience with ETL/ELT development using Python and SQL.
- Familiarity with distributed data processing frameworks such as Apache Spark or Hadoop.
- Strong understanding of data modeling, data warehousing, and query performance optimization.
- Experience with CI/CD practices, infrastructure as code (e.g., Terraform), and version control systems like Git.
- Exposure to streaming data architectures using Pub/Sub, Dataflow, or similar technologies (a streaming sketch also follows this list).
- Familiarity with data quality, monitoring, and governance best practices.
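To make the Composer and ETL/ELT items concrete, below is a minimal sketch of the kind of Airflow DAG this role would build: it loads daily files from Cloud Storage into a BigQuery staging table and then runs a SQL transform. All project, bucket, dataset, and table names are hypothetical, and the operators assume the Airflow 2.x Google provider package.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="claims_daily_load",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow 2.4+ scheduling argument
    catchup=False,
) as dag:
    # Land raw CSV files from Cloud Storage into a BigQuery staging table.
    load_raw = GCSToBigQueryOperator(
        task_id="load_raw_claims",
        bucket="example-raw-bucket",   # hypothetical bucket
        source_objects=["claims/{{ ds }}/*.csv"],
        destination_project_dataset_table="example-project.staging.claims_raw",
        source_format="CSV",
        write_disposition="WRITE_TRUNCATE",
        autodetect=True,
    )

    # ELT step: transform the staged rows into a curated table with SQL.
    transform = BigQueryInsertJobOperator(
        task_id="transform_claims",
        configuration={
            "query": {
                "query": (
                    "CREATE OR REPLACE TABLE `example-project.curated.claims` AS "
                    "SELECT claim_id, member_id, DATE(service_date) AS service_date, amount "
                    "FROM `example-project.staging.claims_raw`"
                ),
                "useLegacySql": False,
            }
        },
    )

    load_raw >> transform
```

On Cloud Composer, a DAG file like this is simply dropped into the environment's dags/ bucket; no additional scheduling infrastructure is needed.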
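Similarly, for the streaming item, a bare-bones Apache Beam pipeline (runnable on Dataflow) that reads JSON events from a Pub/Sub subscription and appends them to BigQuery might look like the sketch below; the subscription and table names are placeholders.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    # streaming=True is required for unbounded Pub/Sub sources; pass
    # --runner=DataflowRunner (plus project/region/temp_location) to run on Dataflow.
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                subscription="projects/example-project/subscriptions/member-events-sub"
            )
            | "ParseJson" >> beam.Map(json.loads)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="example-project:events.raw_events",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```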
Responsibilities
- Designing, building, and maintaining scalable data pipelines.
- Developing ETL/ELT workflows using Python and SQL.
- Data modeling.
- Data warehousing.
- Query performance optimization (a partitioned-table sketch follows this list).
- Applying CI/CD practices.
- Managing infrastructure as code (e.g., Terraform).
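As one concrete example of the data modeling and query performance responsibilities, the sketch below uses the google-cloud-bigquery Python client to create a date-partitioned, clustered table so that typical date-bounded queries scan only a fraction of the data. The project, dataset, and schema are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project

# Partitioning by service_date and clustering by member_id keeps daily queries
# from scanning the full table -- usually the first lever for BigQuery cost and
# query performance optimization.
table = bigquery.Table(
    "example-project.curated.claims",
    schema=[
        bigquery.SchemaField("claim_id", "STRING"),
        bigquery.SchemaField("member_id", "STRING"),
        bigquery.SchemaField("service_date", "DATE"),
        bigquery.SchemaField("amount", "NUMERIC"),
    ],
)
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="service_date",
)
table.clustering_fields = ["member_id"]

client.create_table(table, exists_ok=True)
```

The same table definition could equally be expressed as a Terraform google_bigquery_table resource and applied through a CI/CD pipeline, tying into the infrastructure-as-code responsibility.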
Other
- Ability to work independently and collaboratively in cross-functional teams, including data scientists, analysts, and product engineers.
- Effective communication skills and a proactive approach to problem-solving.
- Experience mentoring junior engineers or contributing to team-wide technical improvements.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering.