The Microsoft Azure CXP team is looking to transform Microsoft Cloud customers into fans by analyzing and amplifying customer needs and driving the vision to improve Cloud quality, security, and reliability.
Requirements
- Familiarity with modern distributed software design patterns and cloud systems architecture, including microservices, containers, load balancing, queuing, caching.
- Experience in building, shipping, and operating reliable solutions.
- Experience with data technologies (SQL, NoSQL, etc.).
- Experience with Azure.
- Experience in AI adoption with tools like GitHub Copilot, Azure OpenAI, and custom Copilots to streamline development and reduce toil.
- Proficiency in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.
- Bachelor's Degree in Computer Science or a related technical discipline, with proven coding experience.
Responsibilities
- Contributes to defining system reliability goals through Service Level Objectives (SLOs) and enhancing production posture with targeted improvements in observability and operability (telemetry, alerting, incident/change management, safe deployment practices).
- Builds reusable automation and processes that help multiple teams meet their reliability goals. With guidance, influences product architecture and roadmaps to ensure customer-experienced reliability is a core design principle.
- Works directly on product code to achieve reliability outcomes. Leverages AI to proactively detect anomalies, predict incidents, and automate operational workflows - scaling reliability efforts across complex systems.
- With guidance, supports the design and development of large-scale distributed software services and solutions. Delivers “best-in-class” engineering by ensuring services are modular, secure, reliable, testable, diagnosable, observable, and reusable.
- Collaborates with internal and external partners to support team goals. Balances pragmatism with vision - driving continuous improvements in process and codebase. Builds automation to prevent or remediate service issues before they impact users.
- Applies cutting-edge AI tools and techniques to reduce operational toil and scale reliability engineering across complex systems.
- Gains a working understanding of Microsoft businesses and contributes to cohesive, end-to-end user experiences.
Other
- Ability to meet Microsoft, customer and/or government security screening requirements
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
- Travel requirements not specified
- Clearance requirements not specified