
Senior/Staff/Senior Staff Software Engineer

CRUSOE

$204,000 - $247,000
Sep 24, 2025
San Francisco, CA, USA • Sunnyvale, CA, USA

Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.

Requirements

  • 7+ years in infrastructure engineering, including 3+ years operating Kubernetes in production
  • Strong experience running Kubernetes on bare metal (not just managed services)
  • Expert-level knowledge of Linux internals (cgroups, namespaces, kernel networking)
  • Deep experience with CNIs (Cilium, Calico), load balancers (Envoy, HAProxy, F5), and L3 networking (BGP, ECMP)
  • Proven track record provisioning and operating physical servers at scale (PXE/iPXE, Tinkerbell, MAAS, BMC/IPMI automation)
  • Strong programming skills in Go for building operators, controllers, and automation tooling
  • Hands-on experience with distributed storage systems (Ceph, MinIO, Rook, CSI drivers)

Responsibilities

  • Designing, building, and operating Kubernetes clusters on bare metal at scale
  • Engineering full cluster lifecycle management (Talos bootstrapping, upgrades, node reprovisioning, HA control planes, recovery workflows)
  • Architecting networking, load balancing, and service mesh solutions optimized for bare metal
  • Implementing performant CNIs (Calico, Cilium), integrating L2/L3 networking, routing (BGP/ECMP), and optimizing traffic across racks and datacenters
  • Automating provisioning via PXE/iPXE, Tinkerbell, and MAAS; managing BMCs through IPMI/Redfish; and standardizing BIOS/firmware across heterogeneous hardware fleets
  • Designing and operating persistent storage (local disks, block, object), including Ceph, Rook, and OpenEBS
  • Building automation and tooling (Go, Python, Bash) for provisioning, drift detection, upgrades, and incident response

Other

  • Mentoring engineers and shaping technical direction for Crusoe’s Kubernetes platform
  • Strong communication and collaboration across cross-functional teams
  • Experience with hardware fleet management across multiple datacenters
  • Contributions to open source Kubernetes or related ecosystem projects
  • Experience implementing disaster recovery strategies at scale