Senior Software Technical Program Manager - GPU Communication Libraries

NVIDIA

$192,000 - $304,750
Aug 20, 2025
Santa Clara, CA, US

NVIDIA is looking for a Senior Software Technical Program Manager to help develop pioneering compute software solutions for critically important environments, including leading academic institutions, start-ups, and industry, by leading and managing communication libraries such as NCCL, NVSHMEM, and UCX for Deep Learning and HPC.

Requirements

  • Hands-on experience with software development for hardware platforms, communication runtimes, or high-performance networking, with demonstrated success in delivering these complex products to customers.
  • Proficiency in Agile software development methodologies.
  • Comprehensive understanding of software engineering principles, including experience with widely-adopted configuration management tools and productivity-enhancing tools and automation processes.
  • Background with parallel programming models (MPI, SHMEM) and at least one communication runtime (MPI, NCCL, NVSHMEM, OpenSHMEM, UCX, UCC).
  • Knowledge of a modern programming language is desired, as well as depth in HPC and ML/DL fundamentals.
  • Background with RDMA, high-performance networking technologies (InfiniBand, RoCE, Ethernet, EFA), network architecture, and network topologies.
  • Solid understanding of the Deep Learning framework ecosystem for training and inference.

Responsibilities

  • Lead status meetings, proactively address challenges and customer concerns, and serve as the primary point of contact (POC) for building and upholding prioritized release schedules and plans.
  • Strategically plan and partner across NVIDIA teams to drive software objectives while maintaining schedules and formulating mitigation strategies for risks identified across multiple parallel work streams.
  • Lead enhancements to existing products and software release processes, while collaborating with engineering management to optimize development workflow and efficiency.
  • Translate customer requirements into actionable milestones and tasks internally, ensuring customers are continually informed of issue status.
  • Drive virtual reviews and establish continuous feedback loops by communicating benchmarking results and customer insights to product and engineering leadership.
  • Track and report large-scale performance benchmarking across all clusters. Build performance dashboards and reporting processes to monitor KPIs and surface performance trends.
  • Collaborate across internal teams and third-party partners across time zones, as necessary, to resolve customer issues and oversee customer releases.

Other

  • BS, MS, or Ph.D. in CS, CE, EE (related technical field) or equivalent experience.
  • 12+ years of overall experience in the software industry, with specialization in HPC networking or system software.
  • 6+ years of program management experience in a similar or related role.
  • Strong communication and technical presentation skills, and the ability to work independently and actively with minimal guidance.
  • Previous experience coordinating activities between HW and SW organizations.