NVIDIA DGX Cloud engineering provides a serverless generative AI infrastructure, and our mission is to ensure that customers receive timely, quality-assured releases.
Requirements
- 5+ years of experience developing DevOps tooling, with a profound passion for automation
- Solid background in modern source control platforms (GitHub/GitLab)
- Strong experience in modern CI/CD technologies (GitLab/testing frameworks/ArgoCD)
- Proficient in container-based infrastructure (Docker, Kubernetes, Helm)
- Comprehensive experience with Linux distributions (Ubuntu)
- Solid background in scripting languages (Bash, Python)
- Working background in higher-level languages (Go)
Responsibilities
- Provide both development and operational tooling critical to DGX Cloud services
- Implement and operate services used by engineering, including first-level on-call and support
- Assist engineering by maintaining a well-optimized, well-supported paved-road SDLC, which includes working across engineering, testing, and SRE to ensure tool alignment
- Ensure test coverage from unit testing to CI to smoke testing to full end-to-end testing
- Provide developer environments that are easily updated with a low barrier to entry
- Develop and maintain continuous integration pipeline templates and testing frameworks
- Provide and operate end-to-end integration environments for continuous testing
Other
- If you excel in problem-solving, can think creatively on your feet, and enjoy working in a distributed team setting, we would love to have you join us!
- Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field (or equivalent experience)
- Excellent written and verbal communication skills
- Experience scaling DevOps practices across cross-functional teams
- Demonstrated ability to handle sophisticated technical environments while meeting or exceeding all security, reliability, scalability, and availability metrics