The Driver Understanding and Evaluation (DUE) team at Waymo is focused on deeply understanding and assessing the performance of the Waymo Driver. They develop and utilize advanced evaluation methodologies, including machine learning-based metrics and reward functions, to analyze driving behavior and capabilities. Collaborating broadly with Simulation, System Engineering, Research, and Onboard Software teams, DUE plays a critical role in the validation and improvement of the Waymo Driver, including the evaluation of foundational AI models. Their efforts support the scalable and rigorous assessment necessary to ensure the safety and efficacy of Waymo's autonomous technology.
Requirements
- Strong programming proficiency in Python and hands-on experience with deep learning frameworks (e.g., TensorFlow)
- Solid theoretical understanding of machine learning and reinforcement learning fundamentals, including Markov Decision Processes, value functions, and policy gradients
- Practical experience applying modern deep RL algorithms (e.g., PPO, SAC, TD3) to continuous control problems
- Familiarity with software development best practices, including version control with Git
- Knowledge of advanced reward learning paradigms such as Reinforcement Learning from Human Feedback (RLHF) or Inverse Reinforcement Learning (IRL)
- Experience with large-scale or distributed training of machine learning models
Responsibilities
- Implement reinforcement learning algorithms and novel reward functions within a high-fidelity autonomous driving simulator
- Design and run experiments to train the driving agent, systematically collecting and analyzing performance data to identify behavioral patterns and potential instances of reward hacking
- Iteratively refine the driving policy and the reward model based on experiment results, implementing and testing strategies to improve safety and alignment with intended driving behavior
- Collaborate with the research team to document experimental setups, present key findings, and contribute to the project's technical direction
Other
- This will be a hybrid onsite internship position.
- We will accept resumes on a rolling basis until the role is filled.
- To be considered for multiple roles, you will need to apply to each one individually. Please apply to the top 3 roles you are interested in.