Senior ML Research Engineer - LLM Quantization & Model Optimization

Microsoft

$119,800 - $258,000
Sep 4, 2025
Redmond, WA, USA • Mountain View, CA, USA

Microsoft's Azure Hardware Systems and Infrastructure (AHSI) organization is looking for a Senior ML Research Engineer to innovate hardware designs and optimize LLMs for Microsoft's expanding cloud infrastructure and "Intelligent Cloud" mission.

Requirements

  • 2+ years of industry experience in low-precision model optimization and quantization for LLM workloads.
  • Experience publishing academic papers as a lead author or essential contributor.
  • Experience participating in a top conference in a relevant research domain.
  • Proven track record in developing production-scale software for model compression and performance optimization.
  • Proficiency with deep learning frameworks and inference runtimes such as PyTorch, TensorFlow, TensorRT, and ONNX Runtime.
  • In-depth understanding of Transformer and LLM architecture, including various model optimization techniques such as quantization, pruning, neural architecture search (NAS), knowledge distillation, sharding/parallelism, KV cache optimization, and FlashAttention.
  • Hands-on experience setting up large-scale evaluation frameworks for state-of-the-art (SOTA) LLMs and fine-tuning large models.

Responsibilities

  • Design and develop novel quantization techniques to enable efficient deployment of LLM inference and training in Microsoft’s Azure production environments.
  • Drive software development and proof-of-concept efforts for model optimization tooling to streamline deployment of quantized models.
  • Analyze performance bottlenecks in state-of-the-art LLM architectures and drive performance improvements.
  • Prototype and evaluate emerging low-precision data formats through proof-of-concept implementations.
  • Co-design model architectures optimized for low-precision deployment in close collaboration with company-wide AI teams.
  • Work cross-functionally with data scientists and ML researchers/engineers to align on model accuracy and performance goals.
  • Partner with hardware architecture and AI software framework teams to ensure end-to-end system efficiency.

Other

  • Doctorate in a relevant field OR equivalent experience.
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role.
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
  • Excellent communication skills and a team-oriented mindset.
  • Microsoft will accept applications for the role until September 19th, 2025.