Senior Software Engineer - Parallel Computing Systems

NVIDIA

$148,000 - $235,750
Aug 25, 2025
Santa Clara, CA, US

NVIDIA's nvFuser team is looking to build the next-generation fusion compiler that automatically optimizes deep learning models for workloads scaling to thousands of GPUs, impacting the future of AI compilation.

Requirements

  • CUDA kernel optimization
  • C++ systems programming
  • Compiler infrastructure
  • Parallel programming
  • Systems-level performance work
  • Advanced C++ programming with large codebase development, template meta-programming, and performance-critical code
  • Strong parallel programming experience with multi-threading, OpenMP, CUDA, MPI, NCCL, NVSHMEM, or other parallel computing technologies

Responsibilities

  • Design algorithms that generate highly optimized code from deep learning programs
  • Build GPU-aware CPU runtime systems that coordinate kernel execution for maximum performance
  • Master the latest GPU architectures
  • Develop innovative techniques for emerging AI workloads
  • Debug performance bottlenecks in thousand-GPU distributed systems
  • Influence next-generation hardware design
  • Push the boundaries of what's possible in AI compilation

Other Qualifications

  • MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
  • Demonstrated experience with low-level performance optimization and systematic bottleneck identification beyond basic profiling.
  • Performance analysis skills: experience analyzing high-level programs to identify performance bottlenecks and develop optimization strategies.
  • Collaborative problem-solving approach, with adaptability in ambiguous situations, first-principles thinking, and a sense of ownership.
  • Excellent verbal and written communication skills.