Senior Engineer, Performance - Cloud Software

NVIDIA

$144,000 - $270,250
Sep 10, 2025
Santa Clara, CA, US

The mission of NVIDIA DGX Cloud engineering is to ensure our customers receive timely, quality-assured releases. We are seeking a Performance Engineer who is proficient in performance and scalability testing and can identify limitations across the Kubernetes (K8s) and application stack using industry-standard tools and telemetry.

Requirements

  • 5+ years in software engineering with a strong track record in performance or scalability of high-scale distributed systems
  • Deep familiarity with performance profiling tools and tracing systems
  • Ability to identify performance issues, find their root causes, and propose potential solutions
  • Experience optimizing performance across one or more layers of the stack (e.g., database, networking, storage, application runtime, GC tuning, Golang internals, GPU utilization)
  • Experience contributing to observability, benchmarking, or performance-focused infrastructure at scale
  • Strong understanding of OS internals, scheduling, memory management, and IO patterns
  • Proficiency with container-based infrastructure (Docker, Kubernetes, Helm)

Responsibilities

  • Analyze and optimize performance across application, middleware, runtime, and infrastructure layers—networking, storage, GPU utilization, and beyond
  • Develop tooling and metrics that provide deep observability into system performance
  • Collaborate closely with infra, platform, runtime, and product teams to identify key performance goals and drive systemic improvements
  • Lead investigations into high-impact performance regressions or scalability issues in production
  • Influence architecture and design decisions to prioritize latency, throughput, and efficiency at scale
  • Drive performance testing strategies and help define SLAs/SLOs around latency and throughput for critical systems

Other

  • If you excel in problem-solving, can think creatively on your feet, and enjoy working in a distributed team setting, we would love to have you join us!
  • Demonstrated success navigating ambiguity and aligning stakeholders around performance goals
  • Demonstrated ability to handle sophisticated technical environments while meeting or exceeding all security, reliability, scalability, and availability metrics
  • Strong, proven knowledge of modern architectures at scale
  • If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.