Microsoft Azure Specialized is looking to design and deliver the next generation of High Performance Computing (HPC) to enable a wide variety of customer workloads, ensuring the Azure platform delivers consistent performance, can scale on demand, and is engineered to withstand unparalleled computing demand.
Requirements
- Experience coding in languages including, but not limited to, C, C++, PowerShell, or Python.
- 2+ years of experience in telemetry and observability, monitoring and improving the quality of a service or cloud infrastructure, or measuring and driving improvements in a system.
- 2+ years of experience in high performance computing, including familiarity with accelerators and co-designing hardware and software.
- 1+ year of experience working with distributed systems and cloud infrastructure, using profiling and performance analysis tools, and debugging low-level system code such as drivers.
- 1+ year of experience in data science and telemetry, with exposure to machine learning middleware and performance optimization techniques.
Responsibilities
- Designing and delivering the next generation of High Performance Computing (HPC)
- Defining, deploying and sustaining hardware and software Azure infrastructure for HPC workloads
- Focuses on hardware/software interaction, coding and playing with next-gen hardware
- End-to-end systems engineering anywhere in the infrastructure: CPU differentiation, networking, switches, rack design, cluster design
- Evaluate and make recommendations that advance Azure infrastructure for HPC and AI-based workloads
- Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
- Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate.
Other
- Ability to meet Microsoft, customer and/or government security screening requirements is required for this role.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
- Is passionate about quality, wants the customer to succeed, and gets things done.
- Maintains communication with key partners across the Microsoft ecosystem of engineers.
- Ensures alignment with partners' expectations.