NVIDIA is looking for a TensorRT-LLM Software Development Engineer to build foundational inferencing software for deep learning-powered AI, including LLMs, ChatGPT, and Generative AI, which is crucial for NVIDIA's product lines and the industry.
Requirements
- Excellent Python programming and software design skills, including debugging, performance analysis, and test design.
- Experience working with deep learning frameworks like TensorFlow and PyTorch
- Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation
- Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
- Architectural knowledge of CPU and GPU
- GPU programming experience (CUDA or OpenCL)
Responsibilities
- Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance
- Perform benchmarking, profiling, and system-level programming for GPU applications.
- Closely follow academic developments in the field of artificial intelligence and update TensorRT with new features accordingly
- Provide code reviews, design docs, and tutorials to facilitate collaboration among the team.
- Conduct unit tests and performance tests for different stages of the inference pipeline.
- Write safe, scalable, modular, and high-quality (Python) code for our core backend software for LLM inference.
- Improve the usability of the TensorRT-LLM library and build systems (CMake)
Other
- The ability to work on a fast-paced, delivery-focused team is required
- Excellent interpersonal skills are a must.
- Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
- 4+ years of relevant software development experience.
- Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning, such as LLMs and generative and recommender models
- Self-starter who consistently takes initiative to drive projects forward
- Excellent written and oral communication skills in English