Microsoft's Artificial Intelligence Frameworks team works on running AI models efficiently across devices and platforms, including servers, desktops, mobile phones, IoT devices, and web browsers, with a focus on high-performance inference systems.
Requirements
- Coding experience in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- Experience with Large Language Models (LLMs) and large-scale execution of AI workloads
- Technical design, problem solving, and debugging skills
- Experience with software development for specialized accelerators and host systems
- Knowledge of HW/SW interfaces
- Experience with cloud-scale AI workloads
- Familiarity with Microsoft's custom AI hardware
Responsibilities
- Software development in C++, Python, and other languages for specialized accelerators and host systems
- Software design, development, and optimization to execute AI workloads at cloud scale
- Co-design with hardware partner teams on HW/SW interfaces
- Design and code review of peer work
- Lead benchmarking and optimization of state-of-the-art LLMs across GPUs and Microsoft’s custom AI hardware
- Architect improvements to large-scale serving pipelines
- Diagnose complex performance bottlenecks across distributed systems
Other
- Bachelor's Degree in Computer Science or related technical field
- 4+ years of technical engineering experience
- Ability to meet Microsoft, customer and/or government security screening requirements
- Microsoft Cloud Background Check
- Master's Degree in Computer Science or related technical field (preferred)
- PhD in Computer Science or related technical field (preferred)