Waymo is seeking an ML Engineer to optimize neural network inference and training for its autonomous driving systems, improving scalability and multi-platform support across diverse hardware such as GPUs and TPUs while meeting real-time constraints.
Requirements
- 3+ years of experience in software development for neural model inference or training
- 1+ years of experience optimizing neural model inference and training on modern GPU/TPU architectures
- 5+ years of experience in software development for real-time systems, preferably on embedded or onboard systems
- Proficiency in C++, Python, and deep learning frameworks such as PyTorch or JAX
- Strong passion for low-level neural network optimization and continuous learning of new architectures and tools
- Deep understanding of latency-quality tradeoffs and practical experience managing them in neural network architectures
Responsibilities
- Optimize neural network architectures and systems for high performance on multiple GPU and TPU platforms, including onboard and simulation environments
- Enhance neural model performance and overall system efficiency for systems with real-time constraints
- Develop post-training algorithms such as quantization and low-level kernel optimizations to improve inference speed and reduce memory consumption
- Create new neural model architectures and decoding strategies, including sparse architectures and speculative decoding, to boost inference performance
- Improve model training speed and efficiency, especially for large, memory-bound models and for fine-tuning tasks
- Collaborate with ML infrastructure teams, onboard hardware teams, simulation teams, and research units to implement and deploy optimized models
- Stay updated with the latest developments in efficient ML techniques and incorporate these innovations into production systems
Other
- Master’s degree or PhD in Computer Science, Engineering, or a related technical field
- Hybrid role
- Competitive salary range of $238,000 to $302,000 USD, commensurate with experience and location