Job Board

Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1,000+ jobs and find postings that match your resume.

Senior Deep Learning Software Engineer, Inference

NVIDIA

$148,000 - $287,500
Sep 5, 2025
Santa Clara, CA, US

NVIDIA seeks an engineer to design, build, and optimize GPU-accelerated software for AI applications, focusing on high-performance deep learning frameworks such as SGLang and vLLM for efficient large-scale model serving and inference.

Requirements

  • Excellent C/C++ programming and software design skills.
  • GPU programming experience (CUDA, OpenAI Triton, or CUTLASS) is a plus.
  • Background in performance modeling, profiling, debugging, and code optimization, or architectural knowledge of CPUs and GPUs, is a plus.
  • Prior experience training, deploying, or optimizing the inference of DL models in production is a plus.
  • Experience with multi-GPU communications (NCCL, NVSHMEM).

Responsibilities

  • Performance optimization, analysis, and tuning of DL models in domains such as LLMs, multimodal, and generative AI.
  • Scale performance of DL models across different architectures and types of NVIDIA accelerators.
  • Contribute features and code to NVIDIA’s inference libraries, vLLM, SGLang, FlashInfer, and LLM software solutions.
  • Collaborate with cross-functional teams across frameworks, NVIDIA libraries, and innovative inference optimization solutions.

Other

  • Master's or PhD, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, EECS, AI).
  • 5+ years of relevant software development experience.
  • Agile software development skills are helpful, and Python experience is a plus.
  • Creative and autonomous engineer with a genuine passion for technology.