Developing novel audio/speech/language processing techniques and large multimodal models.
Requirements
- research experience in natural language processing, speech processing, dialog systems, or machine learning
- strong programming skills in Python and/or C++
- experience using at least one of the leading deep learning toolkits
Responsibilities
- developing novel audio/speech/language processing techniques and large multimodal models
- working with researchers on a research project that attacks one of the core problems by inventing cutting-edge techniques
- establishing more effective post-training recipes
- finding more efficient large model architectures
- improving large model reasoning abilities
- building safe self-improving models/agents
- developing novel audio/speech processing techniques, focusing on both text-only and multimodal scenarios
Other
- Ph.D. students in computer science, electrical engineering, mathematics or a related field
- self-motivated and excited about developing novel techniques
- a good publication track record and a history of creativity and intellectual flexibility
- can start any time in 2025
- duration: 3 months (with the possibility of extension)