NVIDIA is looking for a Senior Software Engineer to build foundational systems for its DGX Cloud team, focusing on scalable cloud services that integrate with datacenter GPU telemetry and enable operational automation across global cloud operations.
Requirements
- Expertise in building scalable REST APIs backed by PostgreSQL-compatible data stores.
- Proficiency in programming languages such as Go, Java, or Python.
- Familiarity with modern JavaScript frameworks (e.g., React, Angular, Next.js).
- Strong understanding of cloud infrastructure (AWS, GCP, Azure, etc.) and container technologies like Docker and Kubernetes.
- Experience with high-scale distributed systems, including architectural patterns for APIs and data pipelines.
- Familiarity with Linux operating systems.
- A track record of delivering and managing high-performance cloud services at Internet scale.
Responsibilities
- Design and develop RESTful APIs to ingest telemetry from AI datacenters.
- Build scalable cloud services for high-volume ingestion, processing, and storage of large datasets.
- Build and manage data pipelines for online and offline data storage.
- Collaborate across teams to codify business processes into scalable, self-measuring systems.
- Optimize the reliability and efficiency of cloud services and operations.
- Lead and ship impactful technical projects, ensuring quality and scalability at every stage.
Other
- At least 6 years of industry experience with a Bachelor’s degree (or equivalent experience); Master’s degree preferred.
- Outstanding communication and collaboration skills, with a focus on solving complex operational challenges.
- A passion for delivering scalable and efficient cloud services.
- Experience operating NVIDIA datacenter GPUs.
- Strong debugging and problem-solving skills in distributed environments.