NVIDIA needs to build the next generation of observability for a diverse set of sophisticated workloads, providing real-time insight into its distributed infrastructure and improving the overall efficiency of its chip development process.
Requirements
- Familiarity with EDA (Electronic Design Automation) workflows and tools used in the semiconductor industry.
- Proficiency in programming and scripting in Python and Perl.
- Familiarity with databases, containerized applications and observability stack components.
- Experience building data pipelines for a compute cluster using open-source technologies, and building custom components as needed.
- Experience with C++ is a plus.
- Solid grasp of software engineering principles and methodologies such as OOP and CI/CD.
Responsibilities
- Collaborate closely with internal chip design teams to understand their workflows and determine observability needs to help improve the overall efficiency of our chip development process.
- Design, build and maintain robust and scalable platforms and infrastructure for capturing, storing, processing and visualizing the data collected from chip build workflows.
- Maintain and update the observability tools and systems to meet the needs of new/evolving chip design workflows.
- Keep up to date with recent developments in observability tools, frameworks and strategies, and advocate for their integration within the organization.
Other
- Candidates must hold a BS or higher degree in Computer Science, or have equivalent experience.
- At least 4 years of professional experience developing and managing observability infrastructure.
- Excellent communication and collaboration skills.
- Ability to adapt to a fast-paced environment with evolving requirements.
- Ability to translate ambiguous problems into concrete solvable pieces.