Microsoft Research aims to solve complex challenges in diverse fields by developing and implementing cutting-edge multimodal technology, with a specific focus on physics-informed multimodal learning, generation, and multimodal foundation model training across modalities such as vision, text, audio, and code.
Requirements
- Solid programming skills in Python and PyTorch
- Experience in LLM pre-training, post-training, and inference
- Experience with reinforcement learning and its frameworks (such as VeRL or rLLM)
- Relevant publications at venues such as CVPR, ACL, ICML, ICCV, ECCV, NeurIPS, ICLR, EMNLP, etc.
Responsibilities
- Develop and implement cutting-edge multimodal technology
- Work on physics-informed multimodal learning
- Work on generation and multimodal foundation model training
- Cover typical modalities such as vision, text, audio, code, and system
- Explore the next generation of multimodal learning and generation paradigms
- Unlock the new intelligence capabilities that reside in multimodality
- Conduct experiments and write papers
Other
- Currently enrolled in a master's or PhD program in Computer Science, Electrical Engineering, Mathematics, or a related field.
- Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship.
- Submit a minimum of two reference letters for this position.
- Submit a cover letter and any relevant work or research samples.
- Ability to work independently and collaboratively in a dynamic and vibrant research environment.
- Willingness to embrace knowledge and techniques outside your field of research.