Google Cloud is searching for a highly skilled and motivated engineer to help our customers achieve maximum machine learning model performance for large-scale training and inference through tuning and optimization at both the software and hardware levels.
Requirements
- 5 years of experience in C++, Python, and modern deep learning toolkits like PyTorch or JAX.
- 3 years of experience in software development for machine learning model inference or machine learning model training, and 1 year of experience with ML model inference and training optimization on modern GPU/TPU architectures.
- Experience in kernel development for TPU.
- Experience in low-level ML model optimization and willingness to learn new architectures and tools.
- Experience in developing and optimizing large-scale foundation models, including Mixture of Experts (MoE), Diffusion, and Multi-modal architectures.
- Familiarity with ML models and the issues that arise in their development.
- Understanding of latency, memory, compute, and quality tradeoffs as they apply to ML model architectures, and practical experience in making these tradeoffs.
Responsibilities
- Optimize ML model architectures and systems for high performance across multiple TPU platforms, including onboard hardware and simulation environments.
- Enhance model and system performance for both low-latency inference and large-scale distributed training workloads.
- Develop post-training techniques such as quantization, along with low-level kernel optimizations, to increase inference speed and reduce memory consumption on modern GPU and TPU architectures.
- Engineer custom kernels to maximize training efficiency for memory-bound large models and I/O-bound fine-tuning processes.
- Collaborate with ML infrastructure teams, hardware and simulation departments, and Alphabet’s research teams to integrate cross-functional optimizations.
Other
- Locations: Sunnyvale, CA, USA; Kirkland, WA, USA
- Ability to maintain agility and deliver results in a changing environment.
- Excellent communication abilities.
- A passion for mentoring junior engineers.
- Leadership qualities and enthusiasm for taking on new problems across the full stack.