


Production Systems Engineer

Meta

$132,000 - $191,000
Aug 31, 2025
Seattle, WA, US

Meta is seeking a Production Systems Engineer to join its Hardware Design and Release to Production - Sustaining (HDRTP) team. The team ensures the smooth operation of servers and data centers globally, focusing on hardware efficiency, performance, and reliability for AI platforms and large language models.

Requirements

  • Experience with troubleshooting and data tooling, including data analysis, building analytical models, and creating visualizations
  • Knowledge of server architecture and components across Compute/Storage/AI Systems/Networking
  • Experience integrating lab tools for automated workflows
  • Proficiency in SQL, Python, or C/C++ (data structures, algorithms, and OOP)
  • Experience with Linux systems and server systems management
  • Experience with some of the following modules/domains: PCIe, Networking, Flash, Memory, CPU, GPU, DRAM (DDR4/5 or HBM)

Responsibilities

  • Drive innovation in hardware efficiency by applying expertise in hardware utilization and performance, and translating insights into actionable strategies for hardware, power, performance and data center optimization
  • Contribute to industry-leading research in hardware characterization and fleet/DC efficiency studies across AI platforms, leveraging data-driven and machine learning analytical techniques
  • Conduct in-depth, hardware-parameter-based research and comparative analyses using advanced data analytics and machine learning techniques for failure analysis and diagnosis in production
  • Interface with internal hardware and software engineers and operations teams to understand system architectures and failure modes
  • Proactively design experiments, data analyses, and data visualizations to detect and diagnose hardware health issues, focusing on systemic solutions
  • Collaborate on evolving AI platforms, silicon products, thermal and cooling solutions to support the growth of large language models, with a focus on optimizing performance, scalability, and efficiency
  • Develop data frameworks and discover insights that explain the relationship between hardware, data center parameters, and server failures

Other

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 6+ years of hands-on software, firmware, or hardware engineering experience building systems/products for the IT industry
  • Master’s degree or PhD in Computer Engineering, Electrical Engineering, or related field