Microsoft is looking to reinvent productivity with AI, specifically through the Copilot for Calendar initiative, aiming to help users organize and manage time more effectively.
Requirements
- Experience developing and deploying large language models (LLMs), including agentic systems, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).
- Experience designing, implementing, and optimizing Retrieval-Augmented Generation (RAG) pipelines and advanced context engineering.
- Experience with modern LLM evaluation techniques, including LLM-as-a-Judge, agentic evaluations, and RAG assessments.
- Experience with MLOps practices, including model versioning, automated testing, monitoring, and CI/CD for machine learning.
- Expertise in building and scaling relevance and ranking systems, including experience with retrieval, embeddings, and evaluation methodologies tailored to LLM-powered applications.
- Demonstrated leadership in developing AI/ML solutions for productivity or assistant-like experiences, with a strong track record of managing cross-functional collaborations and driving measurable impact through data-driven product iteration.
- Experience with top-tier scientific venues (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, KDD).
Responsibilities
- Keep abreast of the latest breakthroughs in generative AI and large language models, and translate them into practical, high-impact calendar Copilot features that use the M365 graph to deliver personalized, context-aware time management.
- Define and iterate on relevance metrics that measure how Copilot features truly serve user intent in M365 Calendar.
- Determine where fine-tuned LLMs (large language models), small language models, or other specialized approaches are required, and own their design, training, and deployment.
- Build high-fidelity synthetic and manufactured datasets, along with rigorous evaluation sets and benchmarks that mirror the workflows of enterprise information workers.
- Drive the applied-science strategy for industry-leading calendar agents, partnering closely with engineering and product teams to ship at scale.
- Stay at the cutting edge of NLP (natural language processing) and large language models. Apply techniques such as prompt engineering, fine-tuning, and retrieval-augmented generation (RAG) to enhance Copilot’s capabilities.
- Lead the design and development of advanced ML/NLP models to power Copilot features in Outlook Calendar and Microsoft Teams.
Other
- Build a clear, inspiring vision that aligns every team member around ambitious, measurable goals.
- Recruit and nurture diverse talent, fostering a culture of ownership, psychological safety, and relentless learning.
- Empower the team with resources and trust, then remove obstacles so they can execute rapidly and deliver outsized, sustainable impact.
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Ability to translate complex ML concepts into business value and communicate technical insights to non-technical stakeholders.