Staff Software Engineer, Emerging On-prem AI Infrastructure

Google

$197,000 - $291,000
Dec 18, 2025
Kirkland, WA, US

Google needs software engineers to develop next-generation technologies that can handle information at massive scale and extend beyond web search, with a focus on integrated AI infrastructure systems and large-scale AI clusters.

Requirements

  • 8 years of experience programming in C++.
  • 5 years of experience testing and launching software products.
  • 5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware architecture.
  • 3 years of experience with software design and architecture.
  • Experience building cloud or systems level infrastructure spanning the entire hardware and software stack.
  • Experience in end-to-end diagnostics, troubleshooting, and supportability, including leading SWAT team efforts on complex issues and developing long-term, sustainable solutions.
  • Familiarity with measuring Service Level Objectives (SLOs) and metrics, and with integrating logs, telemetry, and metrics into tools that improve the operator experience.

Responsibilities

  • Drive project success by setting the technical goal and roadmap.
  • Set priorities and projects for a team that delivers features in a fast-moving environment for both internal customers (other engineering teams) and external customers.
  • Take central responsibility for diagnosing and troubleshooting end-to-end supportability issues, uncovering and addressing complex technical problems, and building repair automation systems.
  • Implement and govern the success metrics for the team, spanning Operational Plane metrics (e.g., Support case metrics, GSO case handling), and RMA/Spares metrics (e.g., swap and repair rate).
  • Build large AI clusters using the latest technologies for AI acceleration, cluster interconnects, and networking.
  • Develop large-scale training and inference workloads and optimize their performance.
  • Work with integrated AI infrastructure systems (GPU or TPU), from software to hardware design and workload management.

Other

  • Bachelor's degree or equivalent practical experience.
  • Ability to work in a changing environment and navigate ambiguity, and a track record of delivering solutions for subtle or complex technical problems.
  • Must be willing to work in Kirkland, WA, USA or Sunnyvale, CA, USA.
  • Google is an equal opportunity employer and welcomes applicants with diverse backgrounds and experiences.
  • Accommodations for applicants with disabilities are available upon request.