Senior System Software Engineer, Enterprise MODS

NVIDIA

$224,000 - $425,500
Sep 9, 2025
Santa Clara, CA, US

NVIDIA's data center platforms like GB200 NVL72 are redefining AI, HPC, and cloud computing. To support leading workloads globally, their diagnostic systems need to evolve across diverse hardware technologies. NVIDIA is looking for a technical leader to engineer and drive innovation in diagnostics for their partner ecosystem. The role is essential in shaping how they validate, debug, and optimize complex server platforms across ODM factories, Cloud Service Provider (CSP) deployments, and field operations.

Requirements

  • Proven experience architecting diagnostics for complex server systems, especially at the SW/HW interface.
  • Deep systems knowledge: x86/ARM architectures, Linux/Windows OS internals, firmware (UEFI/BIOS), BMC, and platform security.
  • Expertise in programming languages like C, C++, and Python for tool development and automation.
  • Familiarity with high-speed interconnects such as PCIe, InfiniBand, NVLink, and Ethernet.
  • Experience driving diagnostics across rack-level or cluster-level deployments.
  • Background in cloud-scale infrastructure and partner engagement.
  • 12+ years of engineering experience in diagnostics, embedded systems, or cloud platforms.

Responsibilities

  • Develop diagnostic systems for NVIDIA data center platforms, including hardware and software tools that generate worst-case stress workloads for CPUs, GPUs, memory, storage, and interconnects.
  • Lead platform bring-up and integration, ensuring diagnostics are embedded early and effectively across the server lifecycle.
  • Drive hardware validation strategy in collaboration with architecture and hardware teams, crafting robust validation plans for new server generations.
  • Analyze root causes of complex failures, acting as a Level 2 engineering contact for critical issues and offering scalable solutions across the stack.
  • Develop diagnostics software to ensure quality and performance at scale across ODM and partner production lines.
  • Mentor and grow engineering teams, providing technical leadership and encouraging a culture of innovation and excellence.
  • Influence the long-term strategy by developing diagnostic architecture and roadmaps for the upcoming products of NVIDIA and its partners.

Other

  • Ability to weigh tradeoffs in system development and drive optimal solutions with customers and multi-disciplinary teams.
  • Strong communication skills to engage with technical and executive teams.
  • BS/MS or equivalent experience in Computer Science, Electrical Engineering, or related field.
  • Demonstrated success in influencing product direction and vendor roadmaps.
  • Passion for mentoring and building high-performing teams.