Apple's Siri team is looking to improve its speech recognition models by leveraging distributed training and large datasets to build an intelligent assistant.
Requirements
- Experience processing large, complex, unstructured data
- Knowledge of distributed data processing frameworks (Beam, Spark, Dask, Ray)
- Strong software engineering skills
- Machine Learning experience a plus
- Speech understanding or generation experience a plus
- Strong data engineering background in speech, language, text, or dialogue processing
Responsibilities
- Work with open source tools such as PySpark, JAX, and Ray
- Optimize the movement of multi-modal data from various sources into complex model-training pipelines
- Use open source models to extract signals from large volumes of speech data to drive modeling improvements
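To give a concrete flavor of the kind of pipeline work described above, here is a minimal sketch of fanning signal extraction out over a corpus of utterances. It uses only the Python standard library for illustration; in practice a framework like Ray or PySpark would distribute the same map across machines, and the `rms_energy` feature and synthetic corpus are hypothetical stand-ins for real extraction models and audio data.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def rms_energy(chunk):
    """Hypothetical per-utterance signal: root-mean-square energy
    of a decoded audio chunk (a list of float samples)."""
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))

def extract_signals(utterances, max_workers=4):
    """Map feature extraction over a corpus with a worker pool.
    Threads here keep the sketch self-contained; Ray or Spark
    would shard the corpus across a cluster instead."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(rms_energy, utterances))

if __name__ == "__main__":
    # Tiny synthetic "utterances" standing in for real speech data.
    corpus = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5], [1.0, -1.0, 1.0]]
    print(extract_signals(corpus))
```

The same shape — a pure per-item function mapped over a partitioned dataset — is what carries over directly to `ray.remote` tasks or a Spark `map` when the corpus no longer fits on one machine.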
Other
- M.S. or Ph.D. degree in Computer Science, or equivalent experience
- Strong interpersonal skills to work well with engineering teams
- Excellent problem-solving and critical-thinking skills
- Ability to work in a fast-paced environment with rapidly changing priorities
- Passionate about building extraordinary products and experiences for our users