Microsoft Applied Sciences Group is looking to solve the problem of bringing applications of machine learning to millions of users, including developing and implementing state-of-the-art AI algorithms for specific and general-purpose silicon on next-generation devices and operating systems.
Requirements
- 2+ years of experience in deep learning, with familiarity with language models and transformers such as BERT, GPT-2/GPT-3, Llama, and OPT.
- 2+ years' experience optimizing large language models, including model compression and network distillation into small language models.
- 2+ years' experience with Python.
- Experience with model quantization and optimization techniques such as GPTQ and LoRA.
- Experience with model conversion and deployment frameworks like ONNX is a plus.
- Experience with other model frameworks such as PyTorch or TensorFlow is a plus.
Responsibilities
- Optimize, fine-tune, and transform models for edge-device inference.
- Contribute to the technical design, architecture, development, and evaluation of DNN models.
- Collaborate with engineering and product development teams.
- Contribute to a real-time system involving multiple components.
- Assist in identifying and addressing issues in ML frameworks and associated hardware.
- Mentor junior engineers and contribute to team knowledge-sharing sessions.
Other
- Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 2+ years of related experience with Machine Learning (ML), Natural Language Processing (NLP), or Large Language Models (LLMs).
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Demonstrated ability and passion for incubating new ideas, solving problems, and building working systems.
- Ability to work in a team environment and collaborate with others.