Reliability Engineer

Super Micro Computer

$80,000 - $120,000
Aug 28, 2025
San Jose, CA, USA

Supermicro is seeking a Reliability Engineer to ensure the system-level robustness, long-term stability, and thermal reliability of high-performance server platforms through reliability validation, with a specific focus on CPU validation and environmental stress testing.

Requirements

  • Strong hands-on experience with server hardware (e.g., CPU sockets, heatsinks, VRMs, DIMMs) and system-level validation.
  • Proficiency with thermal chambers, power cycling tools, and monitoring utilities (IPMI, sensors, thermal cameras, etc.).
  • Familiarity with industry-standard reliability methodologies.
  • Experience with BIOS configuration, firmware tools, and OS-based stress testing (e.g., Prime95, BurnInTest, LINPACK).
  • Experience with automated test environments and scripting (Python, Bash, or PowerShell).
  • Background in validation of high-end server CPU platforms (Intel, AMD, or ARM-based).
  • Prior experience maintaining or creating reliability SOPs and validation dashboards.

Responsibilities

  • Develop and execute reliability test plans, including thermal, voltage, and long-duration stress testing.
  • Monitor system health (e.g., error logs, temperature sensors) and analyze failures to determine root cause.
  • Conduct CPU validation on a variety of motherboard and system configurations.
  • Maintain and calibrate thermal chambers, power cycling equipment, and automated stress platforms to ensure consistent test results.
  • Coordinate closely with platform engineering, BIOS, hardware design, and quality teams to align on test coverage and resolve cross-functional issues.
  • Document and maintain SOPs for test setups, execution, and reporting; ensure compliance with internal and industry test standards.
  • Manage test schedules and resources (e.g., CPU samples, chambers, power equipment) to ensure validation milestones are met.

Other

  • Bachelor’s or Master’s degree in EE, CE, or a related technical field.
  • 1-2 years of experience in hardware validation, with a focus on CPU, system reliability, or stress testing.
  • Effective communication skills for reporting results and collaborating across teams.
  • Travel requirements are not specified.
  • Must be eligible to work in the US; visa requirements are not specified.