NVIDIA's nvFuser team is hiring engineers to build the next-generation fusion compiler that automatically optimizes deep learning models for workloads scaling to thousands of GPUs, shaping the future of AI compilation.
Requirements
- CUDA kernel optimization
- Compiler infrastructure
- Systems-level performance work
- Advanced C++ systems programming, including large-codebase development, template meta-programming, and performance-critical code
- Strong parallel programming experience with multi-threading, OpenMP, CUDA, MPI, NCCL, NVSHMEM, or other parallel computing technologies
Responsibilities
- Design algorithms that generate highly optimized code from deep learning programs
- Build GPU-aware CPU runtime systems that coordinate kernel execution for maximum performance
- Master the latest GPU architectures
- Develop innovative techniques for emerging AI workloads
- Debug performance bottlenecks in thousand-GPU distributed systems
- Influence next-generation hardware design
- Push the boundaries of what's possible in AI compilation
Qualifications
- MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
- Demonstrated experience with low-level performance optimization and systematic bottleneck identification that goes beyond basic profiling.
- Performance analysis skills: experience analyzing high-level programs to identify performance bottlenecks and develop optimization strategies.
- A collaborative problem-solving approach, with adaptability in ambiguous situations, first-principles thinking, and a sense of ownership.
- Excellent verbal and written communication skills.