Microsoft is redefining the PC with Copilot+ AI-powered capabilities, building next-generation visual and screen-understanding models that are intelligent, secure, and efficient, in pursuit of a faster, more intuitive, and human-centric computing future.
Requirements
- Solid experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and model compression techniques.
- Experience with model evaluation metrics and benchmarking across platforms.
- Prior experience training models with custom architectures and tuning those architectures to optimize inference time.
- Proven track record in quantization, performance tuning, and deploying models on edge devices.
- Familiarity with silicon-specific optimization strategies.
- Hands-on experience with Qualcomm QNN, Intel OpenVINO, or similar toolchains.
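The quantization and edge-deployment experience called for above can be illustrated with a minimal sketch. This is an assumed, simplified example using PyTorch's dynamic quantization API, with a small stand-in model rather than any actual Copilot+ model; in practice, silicon-specific toolchains such as Qualcomm QNN or Intel OpenVINO would handle the final conversion.

```python
import torch
import torch.nn as nn

# A small placeholder model standing in for a vision model's projection head
# (hypothetical architecture, chosen only for illustration).
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Dynamic quantization: weights are stored as int8 and activations are
# quantized on the fly at inference time. Linear layers are the usual targets.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    fp32_out = model(x)   # full-precision reference output
    int8_out = quantized(x)  # output from the int8-weight model
```

Comparing `fp32_out` and `int8_out` is a first, coarse accuracy check; real benchmarking would measure task metrics and latency on the target NPU or accelerator.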
Responsibilities
- Be part of a team contributing to pre- and post-training efforts for visual models pertaining to screen understanding.
- Explore and implement Transformer-based architectures and suggest architectural improvements for efficiency and scalability.
- Use techniques like distillation, adapters, and low-rank adaptation (LoRA) to build upon existing models.
- Optimize models for on-device inference using techniques such as quantization, and debug models when accuracy or latency regresses.
- Conduct performance tuning and model evaluation across diverse silicon platforms (e.g., neural processing units (NPUs), GPUs, custom accelerators).
- Collaborate with cross-functional teams to integrate models into production pipelines and validate performance in real-world scenarios.
- Contribute to internal model libraries and tooling for deployment across multiple hardware toolchains.
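Of the techniques listed in the responsibilities, LoRA is perhaps the most compact to sketch. The following is a minimal, assumed illustration (the `LoRALinear` class name and hyperparameters are hypothetical, not from any Microsoft codebase): a frozen pretrained linear layer is augmented with a trainable low-rank update, so only a small fraction of parameters receive gradients.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update.

    The adapted forward pass computes W x + (alpha / r) * B A x, where
    A (r x in) and B (out x r) are low-rank factors and only A and B train.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        # A is small-random, B is zero, so the adapter starts as a no-op.
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

base = nn.Linear(64, 64)
layer = LoRALinear(base, r=4)
out = layer(torch.randn(2, 64))
# Only the two low-rank factors are trainable: r*in + out*r parameters.
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```

Zero-initializing `lora_b` means the adapted model starts out identical to the base model, a common design choice that keeps fine-tuning stable from the first step.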
Other
- Work with a cross-functional team to bring vision and UI-centric models to life.
- Embody our Culture and Values.
- Microsoft will accept applications for the role until September 14, 2025.