NVIDIA is hiring engineers to optimize deep learning operations on NVIDIA GPUs by developing high-performance kernel code for the cuDNN, cuBLAS, and TensorRT libraries, which accelerate deep learning models.
Requirements
- Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design
- Experience with performance-oriented parallel programming, even if not on GPUs (e.g., with OpenMP or pthreads)
- Solid understanding of computer architecture and some experience with assembly programming
- Experience tuning BLAS or deep learning library kernel code
- Experience with CUDA or OpenCL GPU programming
- Familiarity with numerical methods and linear algebra
Responsibilities
- Writing highly tuned compute kernels, mostly in CUDA C++, to perform core deep learning operations (e.g., matrix multiplications, convolutions, normalizations)
- Following general software engineering best practices including support for regression testing and CI/CD flows
- Collaborating with teams across NVIDIA:
  - The CUDA compiler team on generating optimal assembly code
  - Deep learning training and inference performance teams on which layers require optimization
  - Hardware and architecture teams on the programming model for new deep learning hardware features
Other
- Master's or PhD degree, or equivalent experience, in Computer Science, Computer Engineering, Applied Math, or a related field
- 6+ years of relevant industry experience