The Apple Ads group aims to help users discover new content and to support publishers and developers in promoting and monetizing their work. The Ads Machine Learning Platform team's mission is to help Ads teams develop, deploy, and operate innovative AI/ML applications efficiently and at scale by guiding the development of data foundations that power AI/ML initiatives.
Requirements
- Strong hands-on expertise with Java, Python, or Scala, and with data architecture, modeling, and SQL.
- Deep technical proficiency in data processing frameworks (Spark, Flink), streaming systems (Kafka), data lakes/warehouses (Iceberg, Delta Lake), databases (Cassandra, Redis), and workflow orchestration tools.
- Experience with both batch and real-time data processing, as well as with CI/CD environments and cloud-native data systems.
- Demonstrated experience contributing to ML platforms that support data pipelines, model training, serving, and monitoring.
- Strong understanding of AI/ML data management, including handling unstructured data, dataset versioning, and training data quality at scale.
- Hands-on experience building model monitoring and observability systems for drift detection, model degradation, and real-time prediction quality.
- Familiarity with annotation and labeling workflows, as well as generative AI techniques such as transformer architectures, diffusion models, and multimodal learning.
Responsibilities
- Build and scale data management systems using technologies such as Spark, Iceberg, and Kafka to support AI/ML workloads.
- Develop data quality frameworks for automated validation, drift detection, and anomaly monitoring across training and production.
- Design production model monitoring systems to track data drift, model performance, and prediction quality in real time.
- Build training data services for LLMs, multimodal models, and classical ML use cases.
- Implement feature engineering and data processing tools that keep training and serving pipelines consistent.
- Build and support A/B testing and experimentation platforms to measure model and feature performance.
- Develop annotation, labeling, and data augmentation pipelines to support model development and fine-tuning.
Other
- 6+ years leading engineering teams that build large-scale data infrastructure or ML platforms for enterprise environments.
- Proven experience designing multi-use platform services and influencing cross-team technical roadmaps.
- Proven ability to lead teams delivering mission-critical production services with high reliability and operational excellence.
- Experience working closely with operations teams on deployment, monitoring, and system reliability.
- Strong analytical and problem-solving skills with a track record of data-driven architectural decisions.