Azure Core Compute Node Services is responsible for managing customer-facing compute resources as virtual machines, containers, and bare metal servers. The team is exploring forward-looking technologies, evolving the host OS, utilizing hardware offloads, and leveraging disaggregation. The goal is to ensure industry-leading availability, reliability, performance, compliance, and security for this mission-critical cloud platform.
Requirements
- coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- distributed systems and managing cloud-based services
- Linux experience
Responsibilities
- Identifies and applies best practices, and shares them with other engineers, for building code on well-established methods and secure design principles, including new code development and formal validation of security invariants.
- Leads product development and scaling to meet customer requirements, applying best practices to meet scaling needs, performance expectations, and security promises.
- Leads efforts on using debugging tools, tests, logs, telemetry, and other methods, and proactively leads verification of assumptions while developing code before issues occur across products in production.
- Leverages minimal telemetry data to triangulate issues and resolve them with minimal iterations.
- Leads incident retrospectives to identify root causes of problems, the implementation of repair actions, and the identification of mechanisms to prevent incident recurrence.
- Reviews product code and test code to ensure it meets team standards, contains the correct test coverage, and is appropriate for the product or solution area.
- Brings insight to code reviews to help improve code quality, coaching and providing feedback to develop other engineers' skills.
Other
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
- Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions.