Senior Software Engineer

Microsoft

$119,800 - $234,700

Oct 31, 2025

Remote, US

The Microsoft Azure High Performance Computing & AI Engineering (HPC & AI Eng) team is looking for an engineer to help manage the core platform and fleet of AI High Performance Computing products that customers use to run their most performant and demanding workloads.

Requirements

Coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
Experience in operating AI/HPC systems, developing and running AI/HPC applications on clusters, or operating Cloud Infrastructure
Specialized experience in one of the following: AI/HPC system management, high-speed networks, HPC storage, or cloud infrastructure management
Familiarity with the HPC software stack
Experience with cloud computing, virtualization, and container technologies
Experience running and troubleshooting machine learning workloads on Graphics Processing Unit (GPU)-based High Performance Computing (HPC) systems

Responsibilities

Collaborates with appropriate stakeholders to determine user requirements for a scenario.
Drives identification of dependencies and the development of design documents for a product, application, service, or platform.
Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI).
Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.
Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate.
Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.
Diagnoses and troubleshoots the largest-scale supercomputing systems across the infrastructure stack (GPU hardware, networking, datacenter, and core software).

Other

Bachelor's Degree in Computer Science or related technical field
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
4+ years of technical engineering experience
Ability to work with a growth mindset, innovate to empower others, and collaborate to realize shared goals