NVIDIA's Deep Learning Libraries Group is seeking an engineer to design and develop scalable, modular infrastructure that streamlines development, builds, and testing across NVIDIA's diverse set of platforms. This work enables the next wave of NVIDIA's highest-performing deep learning libraries and helps keep them ahead of the competition.
Requirements
- Strong familiarity with Python (or similar) and experience with building C/C++ codebases
- System administration experience maintaining both Linux and Windows systems
- Experience setting up, maintaining, and automating continuous integration systems
- Experience designing and developing automation in Jenkins with Groovy (or similar)
- Background with distributed systems and cluster/cloud computing, especially with Kubernetes
- Knowledge of GPU computing systems
- Experience with mobile/embedded platforms and multiple operating systems (Ubuntu, Red Hat, Windows, QNX, or similar)
Responsibilities
- Designing and developing software for testing and analysis of our codebases
- Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
- Developing throughout the software stack, from the user experience down to the cluster and database layers
- Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitLab)
- Enabling new platforms, which may include preparing hardware for testing and bringing up automated testing on it
Other
- BS or higher degree in Computer Science or Computer Engineering, or equivalent experience
- 5+ years of relevant experience
- A pragmatic approach to problem solving and collaboration
- Are you creative, driven, and autonomous?
- Do you love a challenge?