Senior GPU Infrastructure Engineer 2 - GPU Infrastructure

DigitalOcean

$168,000 - $190,000
Sep 26, 2025
Denver, CO, USA

DigitalOcean is looking for an engineer to help provide and support its rapidly growing Bare Metal GPU product by ensuring security, operational best practices, and reliable API capabilities for upstack service teams.

Requirements

  • Proven ability to orchestrate bare metal Linux systems at scale, including building automation for firmware updates, BIOS configuration management, and PXE environment configuration.
  • Deep Linux systems experience, including low-level troubleshooting, developing and applying configuration management, security best practices, and monitoring and alerting.
  • Expert knowledge of one or more orchestration tools, such as MaaS, Salt, Chef, Ansible, or Puppet.
  • Hands-on experience in High Performance Computing (HPC) clustered environments from Nvidia or AMD.
  • Experience performing automated, wide-scale testing with NCCL or other frameworks.
  • Network engineering experience with VyOS platforms.

Responsibilities

  • Contribute to a rapidly growing Bare Metal GPU product within DO by providing security and operational best practices to a fleet of infrastructure servers across multiple regions.
  • Help design and implement further self-service capabilities for our customers by providing reliable and predictable API capabilities for upstack service teams.
  • Engage in support escalations when necessary.
  • Capture trends and lead internal projects to improve the overall product experience.
  • Continuously test our hardware platforms to identify performance regressions related to firmware, software or hardware issues.
  • Build automation for firmware updates, BIOS configuration management, and PXE environment configuration.
  • Perform low-level troubleshooting, develop and apply configuration management, and maintain security best practices, monitoring, and alerting.

Other

  • Strong communication skills. Your job will involve writing detailed documentation for others to pick up or leading knowledge sharing sessions with operations teams.
  • This is a remote role.
  • We value winning together—while learning, having fun, and making a profound difference for the dreamers and builders in the world.
  • If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you’ll find your place here.
  • As a member of the team, you will be a Shark who thinks big, bold, and scrappy, like an owner with a bias for action and a powerful sense of responsibility for customers, products, employees, and decisions.