The Microsoft Azure AI Inference platform aims to address the growing AI market by providing a fully managed AI inference platform that accelerates the research, development, and operation of AI-powered intelligent solutions at scale. The CoreAI Inferencing team is responsible for hosting, optimizing, and scaling the inference stack for all Azure AI Foundry models, including those from OpenAI, Grok, DeepSeek, and other OSS models, serving billions of inferences per day.
Requirements
- 6+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, or Golang
- 4+ years of practical experience working on high-scale, reliable online systems
- Technical background and foundation in software engineering principles, distributed computing and architecture
- Experience in real-time online services with low latency and high throughput
- Experience working with L7 network proxies and gateways
- Knowledge of network architecture and concepts (HTTP and TCP protocols, authentication, sessions, etc.)
- Knowledge of and experience with OSS, Docker, Kubernetes, and C++, Golang, or equivalent programming languages
Responsibilities
- Lead the design and implementation of core inference infrastructure for serving frontier AI models in production.
- Identify and drive improvements to end-to-end inference performance and efficiency of OpenAI and other state-of-the-art LLMs.
- Lead the design and implementation of efficient load scheduling and balancing strategies by leveraging key insights and features of the model and workload.
- Scale the platform to support the growing inferencing demand and maintain high availability.
- Deliver critical capabilities required to serve the latest and greatest Gen AI models, such as GPT-5, Realtime audio, and Sora, and enable fast time to market for them.
- Drive generic features to cater to the needs of customers such as GitHub, M365, Microsoft AI and third-party companies.
- Mentor engineers on distributed inference best practices.
Other
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
- Ability to independently lead projects
- Embody Microsoft's Culture and Values