ICE Data Services is defining and implementing its next-generation data platform. The platform centers on scalable, self-service data pipelines with strong data quality and governance, and it will underpin mission-critical ML and AI data workflows.
Requirements
- Deep expertise with Apache Airflow, including DAG design, performance optimization, and operational management
- Strong understanding of dbt for data transformation, including experience with testing frameworks and deployment strategies
- Proficiency with SQL and Python for data transformation and pipeline development
- Familiarity with Kubernetes for containerized application deployment
- Experience implementing data quality frameworks and automated testing for data pipelines
- Knowledge of Git-based workflows and CI/CD pipelines for data applications
- Experience with stream processing frameworks such as Apache Flink
Responsibilities
- Design, build, and maintain our on-premises data orchestration platform using Apache Airflow, dbt, and Apache Flink
- Create self-service capabilities that empower teams across the organization to build and deploy data pipelines without extensive engineering support
- Implement robust data quality testing frameworks that ensure data integrity throughout the entire data lifecycle
- Establish data engineering best practices, including version control, CI/CD for data pipelines, and automated testing
- Collaborate with ML/AI teams to build scalable feature engineering pipelines that support both batch and real-time data processing
- Develop reusable patterns for common data integration scenarios that teams across the organization can adopt
- Work closely with infrastructure teams to optimize our Kubernetes-based data platform for performance and reliability
Other
- 2+ years of professional experience in data engineering
- Ability to work cross-functionally with data scientists, ML engineers, and business stakeholders
- Prior experience in a technical leadership role
- Background in implementing data contracts or schema governance
- Experience with real-time data processing and streaming architectures