Google's ML Strategy and Allocation Committee (MLSA) Technical Core Team works to shape the future of Google's AI by optimizing vast ML compute resources. As AI investment accelerates, efficient allocation, strategic alignment, and technical governance of the ML fleet are paramount. This role will develop frameworks, policies, and technical strategies to ensure that ML compute capacity aligns with strategic priorities, is used efficiently, and operates as a consistent, exceptional system. The work addresses challenges in infrastructure efficiency, technical governance, and resource allocation at an unprecedented scale.
Requirements
- 8 years of experience in software development.
- 7 years of experience leading technical project strategy and ML design, and working with industry-scale ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine-tuning).
- 5 years of experience with one or more of the following: Speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
- 5 years of experience with design and architecture, and testing/launching software products.
- Experience with Machine Learning, Distributed Computing, Build Infrastructure, and Cluster Management.
- Experience with C++, Python, Machine Learning Infrastructure, Compilers, Computer Architecture, Debugging, etc.
Responsibilities
- Lead technical analysis across Google’s ML infrastructure (including training, serving, and scheduling) to identify opportunities for efficiency gains, cost optimization, and improved resource utilization.
- Develop data-driven proposals and recommendations for executive leadership.
- Partner with executive technical leads across serving, training, scheduling, and fleet management to establish and drive technical governance.
- Collaborate with MLSA leadership to translate Google's strategic AI priorities into a concrete technical roadmap for ML compute resources, ensuring that capacity planning and allocation strategies support critical initiatives.
- Serve as a key technical consultant and guide for Product Areas (PAs) and engineering organizations, helping them navigate MLSA policies, optimize their capacity consumption, and align their roadmaps with Google’s overall ML strategy.
- Develop and advocate technical proposals for new frameworks, tools, and systems that enable more efficient and dynamic allocation of ML resources.
Other
- Bachelor’s degree or equivalent practical experience.
- Experience in a technical leadership role, with the ability to lead complex, cross-functional engineering projects from conception to completion.
- Versatility, leadership qualities, and enthusiasm for taking on new problems across the full stack.