At Apple, we create products that enrich people's lives. The Ads Machine Learning Platform team's mission is to help Ads teams develop, deploy, and operate innovative AI/ML applications efficiently and at scale.
Requirements
- Experience collaborating with ML researchers, data scientists, and product engineers on ML solutions.
- Expertise in data synthesis, fine-tuning, and data management for foundation models and LLMs, including multimodal workflows.
- Knowledge of data privacy and differential privacy in AI/ML systems.
- Practical experience developing or partnering on production ML models.
- Strong hands-on expertise with Java, Python, or Scala, and with data architecture, modeling, and SQL.
- Deep technical proficiency in data processing frameworks (Spark, Flink), streaming systems (Kafka), data lakes/warehouses (Iceberg, Delta Lake), databases (Cassandra, Redis), and workflow orchestration tools.
- Experience with both batch and real-time data processing, in CI/CD environments and on cloud-native data systems.
Responsibilities
- Build and scale data management systems using technologies such as Spark, Iceberg, and Kafka to support AI/ML workloads.
- Develop data quality frameworks for automated validation, drift detection, and anomaly monitoring across training and production.
- Design production model monitoring systems to track data drift, model performance, and prediction quality in real time.
- Build training data services for LLMs, multimodal models, and classical ML use cases.
- Implement feature engineering and data processing tools that keep training and serving pipelines consistent.
- Build and support A/B testing and experimentation platforms to measure model and feature performance.
- Develop annotation, labeling, and data augmentation pipelines to support model development and fine-tuning.
Other
- 6+ years leading engineering teams that build large-scale data infrastructure or ML platforms for enterprise environments.
- Proven ability to influence and foster collaboration across large, cross-functional teams.
- Demonstrated experience contributing to ML platforms supporting data pipelines, model training, serving, and monitoring.
- Strong analytical and problem-solving skills with a track record of data-driven architectural decisions.
- BS or equivalent experience in Computer Science, Data Engineering, Machine Learning, or a related field.