Help Apple transform how machine learning and artificial intelligence systems are evaluated for quality and safety by leading teams building next-generation evaluation systems.
Requirements
- 3+ years of experience developing ML evaluation systems
- 2+ years leading technical teams in ML/AI
- Expertise in full-stack LLM development and deployment
- Deep experience with LLM evaluation and automated assessments
- Track record of scaling ML systems in production
Responsibilities
- Lead R&D in automated AI evaluation, including development of LLM-based assessment systems that can reliably evaluate model outputs
- Drive research and implementation of novel approaches to measure and improve AI system quality, safety, and alignment
- Build and scale evaluation infrastructure that combines human expertise with ML-powered automation
- Work with cross-functional partners to integrate evaluation systems into production workflows
- Pioneer new methods for scalable, high-quality AI assessment, including automated evaluation systems
Other
- Master’s degree with 4+ years of industry experience, PhD with 3+ years of industry experience, or equivalent work experience
- Strong technical leadership and communication skills
- Strong product intuition and ability to identify high-impact opportunities
- Excellence in building and leading high-performing technical teams
- Hands-on manager who will thrive in a fast-paced environment