
Senior Software Engineering Manager, TPU Performance

Google

$248,000 - $349,000
Nov 14, 2025
Sunnyvale, CA, US

Google is looking to address the increasing complexity of ML models and ML accelerator architectures by enabling cost-effective performance and power efficiency in future ML systems through comprehensive analysis and HW/SW co-design. This involves fast iteration and innovation in ML system co-design and improvement, automated hardware-friendly model improvement and enablement at scale, an understanding of business-critical production ML models, and full-stack ML hardware/software co-design delivered with significant engineering velocity and results.

Requirements

  • 7 years of experience leading technical project strategy, ML design, and optimizing industry-scale ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning).
  • 5 years of experience with one or more of the following: speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
  • Experience focused on ML performance modeling and improvements.
  • Experience with large language models (LLMs), ML frameworks, and compilers.
  • Knowledge of performance analysis and experience in performance modeling of High-Performance Computing (HPC) interconnect topologies.
  • Knowledge of computer architecture (Tensor Processing Unit (TPU) or other accelerators).

Responsibilities

  • Explore and define future Machine Learning (ML) accelerator system and chip architectures using objective, data-driven ground truth.
  • Enable cost-effective peak performance of future ML systems with full-stack ML Hardware/Software (HW/SW) co-design.
  • Establish an understanding of the latest business-critical production ML models (large language models, large embedding models, etc.) to inform improvements to model architecture, software systems, and hardware architecture.
  • Develop simulator technologies that keep pace with evolving system architecture choices and new ML workloads, and that support simulation at different abstraction levels.
  • Optimize your own code and make sure engineers are able to optimize theirs.
  • Manage your project goals, contribute to product strategy and help develop your team.
  • Oversee the deployment of large-scale projects across multiple sites internationally.

Other

  • 8 years of experience with software development.
  • 5 years of experience in a technical leadership role overseeing projects.
  • 5 years of experience in a people management or supervision/team leadership role.
  • Manage a team of 7 (eventually growing to 15) people.
  • Manage a large product budget.