NVIDIA's Enterprise SWQA team is looking for a Senior Software Development Engineer in Test to join their Confidential Computing team. The role involves developing new features, automation, and test infrastructure to ensure the quality and reliability of NVIDIA's compute software releases on various platforms, including GPUs and integrated systems. The team aims to leverage AI tools to enhance testing capabilities and streamline operations for more efficient and accurate results.
Requirements
- Solid understanding of embedded systems, Linux, Python, C and C++.
- Experience with hypervisors is a big plus, along with a focus on cloud infrastructure, platform security, or highly regulated deployment environments.
- Proven experience with AI tools for automation and test plan development directly applied to daily tasks. This expertise is crucial for enhancing performance, developing robust frameworks, and increasing test coverage.
- Strong technical skills, with a deep understanding of orchestration and automation systems, data centers, and cloud architecture.
- Knowledge of clusters and cluster management.
- Experience developing test strategies and high-quality test plans, and executing tests.
- Proficient in building and fine-tuning test setups in HW and SW.
Responsibilities
- Develop test plans and orchestrate testing for compute software releases on all new compute architecture platforms, including Tesla GPUs, NVIDIA turnkey systems, and OEM systems.
- Develop a robust test infrastructure incorporating advanced AI tools to significantly enhance our testing capabilities and streamline operations for more efficient and accurate results.
- Improve code coverage, elevating the overall quality of our codebase and the reliability of our testing processes.
- Develop roadmaps that prioritize the software development schedule across the full life cycle of tool development, test, and deployment.
- Build and operate key pieces of a complete infrastructure for automation framework development, lead and develop automation support, and participate in automating manual test cases, working closely with automation infrastructure.
- Test both software functionality and internal code structure, and run regression tests for existing CUDA/Driver features.
- Apply AI-powered tools to improve efficiency and quality, including test case/plan/script generation, defect detection, CBTP, bug fixing, and day-to-day assistance.
- Experience with configuration and deployment management (Ansible), containers (Docker), and virtualization infrastructure software (Xen, KVM, Hyper-V).
Other
- 7+ years of testing experience across the SW development cycle.
- Work in a dynamic agile software development team with very high production quality standards.
- Good understanding of the C/C++ toolchain on Linux, including cross-compilation (C, C++, automake/autoconf, CMake, Meson).
- Background in parallel programming, ideally CUDA C/C++ and OpenACC.
- If you're creative and autonomous, we want to hear from you!