MLE, Post-Training & Reinforcement Learning Frameworks

Waymo

$204,000 - $259,000

Oct 14, 2025

Mountain View, CA, US

Waymo is looking to train and improve pre-trained models for their autonomous driving software, specifically for Perception and Planning, to be deployed into the Waymo Driver and potential future products. This involves tackling challenges in large-scale reinforcement learning (RL) and building scalable systems for compute, data, and environments to enhance model intelligence and alignment with human drivers.

Requirements

Proficient in distributed systems design with an understanding of ML efficiency.
Experience with ML frameworks, including TensorFlow, JAX, and XLA.
Solid programming skills in Python and C++.
Practical familiarity with profiling tools to uncover performance bottlenecks.
Familiarity with post-training frameworks like TS/REX, Tunix, TorchRL, TRL, etc.

Responsibilities

Develop the core training system for adapting RL techniques to unprecedented scales and heterogeneous environments (e.g., CPU/GPU/TPU).
Collaborate with cross-functional teams to integrate cutting-edge rollout strategies, policies, and RL algorithms (e.g., REINFORCE, DPO, PPO) into the system.
Optimize the end-to-end RL training pipeline for efficient and scalable learners/actors and low-latency distributed replay buffers that persist data produced by the rollouts.
Build robust evaluations, analyze experimental results, and iterate quickly to improve model performance and training workflows.
Stay current with the latest research in RL, Vision-Language-Action (VLA) models, and world models to inform and inspire new initiatives.

Other

B.S. in Computer Science or Math, or 8+ years of equivalent real-world experience.
M.S. in Computer Science or Math.