Google is looking to address the increasing complexity of ML models and ML accelerator architectures by enabling cost-effective performance and power for future ML systems through comprehensive analysis and hardware/software (HW/SW) co-design. This involves fast iteration and innovation in ML system co-design, automated hardware-friendly model improvement and enablement at scale, an understanding of business-critical production ML models, and full-stack ML HW/SW co-design with significant engineering velocity and results.
Requirements
- 7 years of experience leading technical project strategy and ML design, and optimizing industry-scale ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning).
- 5 years of experience with one or more of the following: speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
- Experience focused on ML performance modeling and improvements.
- Experience with large language models (LLMs), ML frameworks, and compilers.
- Knowledge of performance analysis and experience in performance modeling of High-Performance Computing (HPC) interconnect topologies.
- Knowledge of computer architecture (e.g., Tensor Processing Units (TPUs) or other accelerators).
Responsibilities
- Explore and define future machine learning (ML) accelerator system and chip architectures using objective, data-driven ground truth.
- Enable cost-effective peak performance of future ML systems through full-stack ML hardware/software (HW/SW) co-design.
- Establish an understanding of the latest business-critical production ML models (large language models, large embedding models, etc.) to inform improvements to model architecture, software systems, and hardware architecture.
- Develop simulator technologies that keep pace with evolving system architecture choices and new ML workloads, and that support simulation at different abstraction levels.
- Optimize your own code and make sure engineers are able to optimize theirs.
- Manage your project goals, contribute to product strategy, and help develop your team.
- Oversee the deployment of large-scale projects across multiple sites internationally.
Other
- 8 years of experience with software development.
- 5 years of experience in a technical leadership role, overseeing projects.
- 5 years of experience in a people management or supervision/team leadership role.
- Manage a team of 7 (eventually growing to 15) people.
- Manage a large product budget.