Consumer Reports is looking to build the data infrastructure that powers advanced AI applications, machine learning models, and analytics systems across CR to transform how they serve consumers.
Requirements
- 3+ years of experience designing and developing data pipelines, data models/schemas, APIs, or services for analytics or ML workloads.
- Skilled in Python and SQL, with experience using PySpark on large-scale datasets.
- Experience with data orchestration tools such as Airflow, dbt, or Prefect, plus CI/CD pipelines for data delivery.
- Experience with data and AI/ML platforms such as Databricks, AWS SageMaker, or similar.
- Experience working with Kubernetes on cloud platforms such as AWS, GCP, or Azure.
Responsibilities
- Design, develop, and maintain ETL/ELT pipelines for structured and unstructured data to support AI/ML model and application development, evaluation, and monitoring.
- Build and optimize data processing workflows in Databricks, AWS SageMaker, or similar cloud platforms.
- Collaborate with AI/ML engineers to deliver clean, reliable datasets for model training and inference.
- Implement data quality, observability, and lineage tracking within the ML lifecycle.
- Develop data APIs and microservices to power AI applications and reporting/analytics dashboards.
- Support the deployment of AI/ML applications by building and maintaining feature stores and data pipelines optimized for production workloads.
- Ensure adherence to CR's data governance, security, and compliance standards across all AI and data workflows.
Other
- This is a hybrid position.
- This position is not eligible for sponsorship or relocation assistance.
- You are passionate about automation and continuous improvement.
- You have excellent documentation and technical communication skills.
- You are an analytical thinker with troubleshooting abilities.