NVIDIA is hiring for a role that drives innovation in deep learning and enables TensorRT support for evolving hardware capabilities.
Requirements
- Strong C++ skills, including knowledge and application of C++11 and C++14 best practices.
- Familiarity with deep learning concepts and frameworks.
- Proficiency with Python and/or CUDA, ideally with experience in a professional environment.
- Background in systems programming, embedded systems, and/or compiler development.
- Experience in software performance benchmarking, profiling, and optimizations.
- Experience with state-of-the-art deep learning models (such as Large Language Models) and inference frameworks.
- Experience with C++17.
Responsibilities
- Orchestrate the integration of new hardware functionalities into TensorRT's compiler and runtime.
- Work closely with teams and stakeholders across the whole hardware and software stack to understand and leverage new features to improve TensorRT’s functionality and performance.
- Guide the design and implementation of robust, high-quality C++ code in alignment with Modern C++ standards.
- Contribute to the continuous improvement of software practices and processes within the team.
Other
- Bachelor's, Master's, or PhD degree, or equivalent experience, in a relevant field (Computer Engineering, Computer Science, Electrical Engineering, AI).
- At least 8 years of relevant software development experience.
- A track record of taking initiative and driving projects to completion.
- Excellent interpersonal skills and a collaborative, pragmatic approach to solving problems.
- Travel requirements are not specified. The position is tagged LI-Hybrid, which suggests a hybrid arrangement with some potential for remote work.