Apple Intelligence needs to innovate and apply state-of-the-art research in ML to tackle complex data problems, impacting future Apple products and the broader ML development ecosystem.
Requirements
- Demonstrated expertise in computer vision, natural language processing, and machine learning with a passion for data-centric machine learning.
- Deep understanding of multi-modal foundation models.
- Up-to-date knowledge of emerging trends in generative AI and multi-modal LLMs.
- Strong programming skills and hands-on experience with Python and deep learning frameworks such as PyTorch or JAX.
- 3+ years of experience with developing and evaluating ML applications, and demonstrated experience in understanding and improving data quality.
- Strong publication record in relevant conferences (e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR).
Responsibilities
- Develop pipelines and tools to automate synthetic data generation for large-scale AI experiments.
- Innovate and apply state-of-the-art research in ML to tackle complex data problems.
- Actively participate in Apple Intelligence’s data-model co-design and co-development practice.
- Design and develop a comprehensive data generation and curation framework for Apple Intelligence foundation models.
- Build robust model evaluation pipelines, integral to the continuous improvement and assessment of Apple Intelligence foundation models.
- Develop and implement techniques for creating high-quality synthetic datasets across a variety of domains, including vision, text, and audio data.
- Innovate and experiment with new approaches for synthetic data generation to improve the diversity, realism, and representativeness of datasets.
- Craft and implement semi-supervised and self-supervised representation learning techniques that leverage both limited labeled data and large-scale unlabeled data.
Other
- Collaborate with our machine learning researchers, engineers, and data scientists.
- Collaborate with multi-functional teams to understand data requirements and ensure that synthetic datasets are optimized for training foundation models.
- Showcase your groundbreaking research work by publishing and presenting at premier academic venues.
- Strong problem-solving and communication skills.
- Ph.D. or M.S. degree in Machine Learning, Natural Language Processing, Computer Vision, Data Science, Statistics, or a related area.