Advance the latest ML techniques, shape the datasets that power Siri, and dive into the details to craft experiences for millions of users worldwide, enriching conversation understanding through LLMs and multimodal models.
Requirements
- 5-7+ years of experience in Machine Learning applied to Speech and NLP.
- Hands-on experience training LLMs, adapting pre-trained LLMs for downstream tasks, and leveraging curated datasets or human-feedback data.
- Familiarity with large-scale data collection/processing (e.g., annotation guidance, data quality checks).
- Experience delivering ML capabilities into great products.
- Proficiency with ML toolkits such as PyTorch.
- Strong programming skills in Python and either C or C++.
- Familiarity with JAX is a plus.
Responsibilities
- Working with existing data pipelines and annotation teams to collect, clean, and analyze speech and multimodal datasets.
- Training large language and multimodal models on distributed back-ends, and adapting them for on-device deployment.
- Ensuring quality with an emphasis on data and model robustness.
- Enriching conversation understanding through LLM and multimodal models, now with added attention to the data that makes these models excel.
Other
- Interact closely with other ML researchers, software engineers, and design teams.