Microsoft is looking to shape the future of the autonomous enterprise and lead the development of intelligent, agent-first experiences that transform how businesses operate. We do this by driving innovation at the intersection of AI, experimentation, and enterprise systems.
Requirements
- Prior expertise in natural language processing (NLP), with a foundation in large language model (LLM) development, evaluation, and fine-tuning.
- Experience in applying advanced fine-tuning techniques—including instruction tuning, reinforcement learning from human feedback (RLHF), and tool-augmented generation—to build agents capable of multi-step reasoning and decision-making.
- Familiarity with prompt/context engineering, context-aware orchestration, and integrating LLMs with external tools and APIs is essential.
- Comfortable working in an experimentation-driven environment, leveraging both offline and online evaluation methods to iterate rapidly and optimize agent behavior.
- An understanding of the challenges and opportunities in building AI-native enterprise applications will be key to success in this role.
- 2+ years developing and evaluating AI systems.
Responsibilities
- Design and evaluate autonomous agents that deliver measurable improvements in accuracy, latency, and cost-efficiency.
- Lead rapid experimentation cycles, develop robust evaluation frameworks, and apply advanced techniques like reinforcement learning to enable multi-step reasoning and decision-making.
- Collaborate across engineering, product, and partner teams to ensure agents are performant, secure, reliable, and extensible—empowering customers and partners to build on our platform.
- Deliver impactful solutions by executing high‑leverage data science and analytics initiatives within a product area or feature team, ensuring measurable improvements to user and business outcomes.
- Lead the design and implementation of advanced model fine‑tuning pipelines, including Reinforcement Learning from Human Feedback (RLHF), to align AI system behavior with user intent and improve performance in real‑world scenarios.
- Own end‑to‑end projects that combine technical depth with cross‑functional collaboration, influencing feature direction and prioritization rather than broad organizational investment decisions.
- Develop and maintain robust measurement systems, experimentation frameworks, and causal inference methodologies tailored to dynamic AI systems and enterprise‑scale environments.
Other
- This role requires you to be onsite 3 days a week in Microsoft's offices in Redmond, WA.
- Ability to meet Microsoft, customer and/or government security screening requirements is required for this role.
- This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- 1+ year(s) experience creating publications (e.g., patents, peer-reviewed academic papers).
- Microsoft is an equal opportunity employer.