The Azure Compute platform needs to build the world's platform for running critical workloads with uninterrupted availability, reliability, and scalability. The Azure Holmes team delivers dynamic resource management capabilities that enhance customer availability and platform efficiency through intelligent algorithms for optimal performance.
Requirements
- Coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- 1+ years of experience with distributed systems
- 2 years of experience in distributed systems
- 2 years of programming experience
Responsibilities
- Design and build highly available, event-driven microservices that elevate customer experience
- Collaborate with Microsoft Research to integrate cutting-edge ML/AI models
- Contribute to the evolution of a platform that powers mission-critical workloads at global scale
- Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI)
- Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate
- Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale
Other
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
- Collaborates with appropriate stakeholders to determine user requirements for a scenario.
- Drives identification of dependencies and the development of design documents for a product, application, service, or platform.
- Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.