Senior Systems Software Engineer, AI Factory Operations

NVIDIA

$184,000 - $356,500
Oct 15, 2025
Santa Clara, CA, US

NVIDIA is working to automate the diagnosis and repair of GPU and CPU clusters across public clouds, private clouds, and virtual and physical hardware. The role also applies AI capabilities to improve the user experience and to speed up automation, diagnosis, and remediation of issues.

Requirements

  • 10+ years of experience (or equivalent), ideally including at least 5 years with a systems programming language (Go, Rust, Java, or C++)
  • Demonstrated ability in building scalable, agile, and robust distributed systems
  • Technical leadership and ownership of projects across the organization
  • Hands-on approach, passion for continuous improvement, and willingness to get involved in all aspects of development
  • Experience working with ambiguity and driving clarity in complex technical decisions
  • Skilled in using AI to scale team productivity and agility
  • Experience with SRE, DevOps, CI/CD, and a variety of platforms

Responsibilities

  • Making the existing cluster automation platform more fault-tolerant, agile, hardware/networking aware, and resource-efficient
  • Enabling AI capabilities in the platform to enhance user experience and accelerate automation, diagnosis, and remediation of issues
  • Integrating with ecosystem tools to enable a rich, unified user experience with full end-to-end capabilities
  • Operating critical software services with high availability and reliability
  • Driving engineering best practices, mentoring engineers, and fostering an inclusive team culture
  • Collaborating with various stakeholders across NVIDIA to understand business context, influence the product roadmap, help with adoption of the automation platform, and reduce toil for managing clusters

Other

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • Keen interest in driving Agent AI projects
  • Track record of successful product rollouts and collaboration with early adopters
  • Willingness to work with a diverse team and contribute to an inclusive team culture
  • Competitive salaries and a generous benefits package