Apple's Video Engineering Data Analytics and Quality (DAQ) group is looking to solve problems related to evaluating machine learning (ML) models and multi-modal large language models (MM-LLMs) to ensure they meet accuracy, robustness, and usability standards.
Requirements
- Proven background in data science, machine learning, computer vision, and statistical data analysis.
- Advanced programming skills in data manipulation & processing (SQL & Python preferred).
- Demonstrated experience in in-depth analysis of machine learning model failures.
- Experience crafting, conducting, analyzing, and interpreting experiments and investigations.
- Expertise in data wrangling and in developing data visualizations & reporting with tools such as Tableau, Superset, and AWS.
- Experience working with multi-modal foundation models such as GPT-4o, Gemini 2.5, Claude 3/4, LLaVA, Flamingo, etc.
- Familiarity with machine learning interpretability methods and standard processes.
Responsibilities
- Evaluate ML & MM-LLM Models: Analyze and validate computer vision, multi-modal, and large language models to ensure they meet accuracy, robustness, and usability standards.
- Develop Metrics: Design and implement metrics to measure the efficiency and accuracy of models.
- Failure Analysis: Conduct in-depth analysis on model failures across CV and MM-LLM pipelines to surface root causes and improvement areas.
- Data Processing: Clean, transform, and curate large-scale datasets for model evaluation and benchmarking.
- Model Optimization: Apply innovative techniques to optimize models for scalability and real-world deployment.
- Collaborate Cross-Functionally: Work closely with software engineers, product managers, and other data scientists to integrate models into production.
- Communicate Results: Present findings clearly and effectively to collaborators across levels of technical understanding.
Other
- BS and a minimum of 3 years of relevant industry experience.
- Curious, self-motivated, and able to drive improvements to model evaluation pipelines and annotation programs.
- Outstanding communication skills, both written and verbal, with experience presenting to leadership.
- Detail-oriented, with the ability to keep track of and understand the workings of sophisticated algorithms.
- Strong attention to detail in working with large datasets and complex ML systems.