Apple's AIML Data Engineering team needs to build scalable and reliable data platforms to power Siri, Search, and Machine Learning across Apple, handling petabytes of data and ensuring data quality for hundreds of millions of users.
Requirements
- 7+ years of experience designing, building, and maintaining distributed data processing systems at scale.
- 5+ years of hands-on experience with stream and/or batch processing technologies such as Flink, Spark, Kafka, Airflow, Iceberg, and Trino.
- Strong foundation in algorithms, data structures, data modeling, and SQL, with experience working on large-scale, complex, and high-dimensional datasets.
- Proficient in at least one modern programming language (e.g., Java, Scala, or Python).
- Experience with machine learning algorithms or pipelines, particularly in the context of data engineering.
- Experience supporting ML engineers or data scientists with feature engineering or model data pipelines is a plus.
- Familiarity with testing tools and methodologies for validating large-scale, distributed data systems (e.g., data quality checks, pipeline testing frameworks, fault tolerance testing).
Responsibilities
- build the scalable and reliable data platform that powers Siri, Search, and Machine Learning across Apple
- build large-scale stream and batch processing data pipelines that power Analytics, Experimentation, and Machine Learning
- design a unified and groundbreaking data processing framework using Flink and/or Spark
- optimize performance, ensure data quality, and contribute to a long-term vision that extends the framework’s capabilities to new user scenarios and groundbreaking machine learning applications
- collaborate closely with Siri, Search, and other teams to design solutions that transform raw data into datasets that drive innovation
- automate dataset lifecycles with strong quality standards
- help partners confidently use the data for product insights
Other
- collaborative and mission-driven software engineers
- care deeply about data quality, user impact, and building at scale
- passionate about tackling complex data challenges
- eager to work with petabytes of data
- inspired by Apple’s commitment to privacy and innovation