NVIDIA is looking to develop groundbreaking technologies in the inference systems software stack to accelerate AI inference and define the next era of computing.
Requirements
- Strong experience developing or using deep learning frameworks (e.g., PyTorch, JAX, TensorFlow, ONNX)
- Strong Python and C/C++ programming skills
- Background in domain-specific compiler and library solutions for LLM inference and training (e.g., FlashInfer, FlashAttention)
- Expertise in inference engines such as vLLM and SGLang
- Expertise in machine learning compilers (e.g., Apache TVM, MLIR)
- Strong experience in GPU kernel development and performance optimization (especially using CUDA C/C++, cuTile, Triton, or similar)
Responsibilities
- Innovating and developing new AI systems technologies for efficient inference
- Designing, implementing, and optimizing kernels for high impact AI workloads
- Designing and implementing extensible abstractions for LLM serving engines
- Building efficient just-in-time, domain-specific compilers and runtimes
- Collaborating closely with other engineers at NVIDIA across deep learning framework, library, kernel, and GPU architecture teams
- Contributing to open source communities like FlashInfer, vLLM, and SGLang
Other
- Bachelor's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience); a PhD is preferred
- Travel requirements not specified
- Must be eligible to work in the US
- NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer
- Base salary will be determined based on location, experience, and the pay of employees in similar positions