NVIDIA is looking for an engineer to support its GPU-accelerated platforms in AI Factories. You will work directly with customers to resolve issues with hardware platforms and AI/ML workloads on rack-scale systems, and contribute to products and software tooling.
Requirements
- At least 5 years of engineering experience with multi-GPU platforms
- Strong system software (firmware, BIOS, kernel, driver, operating system) expertise
- Solid understanding of Linux and the ability to analyze, optimize, and customize Linux environments for AI/ML workloads
- Experience with containerized solutions and workload orchestration (Docker, Kubernetes, Slurm)
- Proficient in C/C++ programming for platform software (OS, firmware, BIOS, kernel, drivers)
- Proficient in Python programming with the ability to build custom tools
- Background with parallel programming or GPU acceleration (e.g., CUDA)
Responsibilities
- Provide direct support to NVIDIA Enterprise customers: answer questions and reproduce, resolve, or advance customer issues.
- Work with engineering teams on customer issues, providing logs, reproduction information, and other triage information.
- Create/update product and/or support tools.
- Take ownership and drive customer issues from inception to resolution.
- Document customer interactions and enhance our knowledge base.
- Develop features and tools as part of solution engineering efforts to support NVIDIA technologies.
Other
- Occasional work on weekends and holidays to support customers
- Professional-level communication skills, including adjusting communication to the technical level of the audience and staying calm and focused in difficult situations.
- Excellent follow-up and organizational skills, with a passion for solving problems.
- Minimum of a BS in Computer Engineering, Electrical Engineering, or equivalent experience.
- Applications for this job will be accepted at least until September 19, 2025.