Intercontinental Exchange is seeking a Senior Data Engineer to define and implement a next-generation, on-premises data platform focused on scalable, self-service data pipelines with strong data quality and governance, in support of mission-critical ML and AI data workflows.
Requirements
- Deep expertise with Apache Airflow, including DAG design, performance optimization, and operational management
- Strong understanding of dbt for data transformation, including experience with testing frameworks and deployment strategies
- Experience with stream processing frameworks such as Apache Flink
- Proficiency with SQL and Python for data transformation and pipeline development
- Familiarity with Kubernetes for containerized application deployment
- Experience implementing data quality frameworks and automated testing for data pipelines
- Knowledge of Git-based workflows and CI/CD pipelines for data applications
Responsibilities
- Design, build, and maintain our on-premises data orchestration platform using Apache Airflow, dbt, and Apache Flink
- Create self-service capabilities that empower teams across the organization to build and deploy data pipelines without extensive engineering support
- Implement robust data quality testing frameworks that ensure data integrity throughout the data lifecycle
- Establish data engineering best practices, including version control, CI/CD for data pipelines, and automated testing
- Collaborate with ML/AI teams to build scalable feature engineering pipelines that support both batch and real-time data processing
- Develop reusable patterns for common data integration scenarios that can be leveraged across the organization
- Work closely with infrastructure teams to optimize our Kubernetes-based data platform for performance and reliability
Other
- 5+ years of professional experience in data engineering, with at least 2 years working on enterprise-scale data platforms
- Willingness to mentor junior engineers and advocate for engineering excellence in data practices
- Ability to work cross-functionally with data scientists, ML engineers, and business stakeholders
- Experience with self-hosted data orchestration platforms (rather than managed services)
- Prior experience in a technical leadership role