Apple is looking for a Machine Learning Engineer with a strong background in Large Language Models (LLMs) to build next-generation ML evaluation frameworks and tools. The role involves automating large-scale data generation and evaluation job execution, building LLM judges, detecting anomalies, and streamlining ML evaluation workflows.
Requirements
- 3+ years of proven experience in machine learning, including hands-on work with LLMs
- Strong programming skills in Python and experience with ML/NLP libraries
- Experience building or fine-tuning LLMs for software engineering tasks
- Understanding of prompt engineering and retrieval-augmented generation (RAG)
- Experience developing LLM-based automated evaluation frameworks
- Excellent knowledge of software testing methodologies and practices
- Experience in Swift/XCTest/XCUITest is preferred
Responsibilities
- Design and develop machine learning and LLM-based solutions for ML model and system evaluation use cases such as:
  - Automatic large-scale data generation
  - Automatic UI and non-UI test evaluation
  - Running evaluation jobs at scale
  - Building and optimizing LLM judges
  - Intelligent log summarization and anomaly detection
- Fine-tune or prompt-engineer foundation models (e.g., Apple, GPT, Claude) for evaluation-specific applications
- Continuously evaluate and improve model performance through A/B testing, human feedback loops, and retraining
Other
- Collaborate with QA teams to integrate models into testing frameworks
- Monitor advances in LLMs and NLP and propose innovative applications within the ML evaluation domain
- Ability to thrive in a collaborative working environment within your team and beyond
- Ability to triage problems, prioritize accordingly, and propose resolutions