Waymo is looking to improve compute performance, both on its vehicles and in simulation, by optimizing the Hardware Abstraction Layer (HAL) to process sensor data as fast as possible and feed it to the Perception models.
Requirements
Strong C++ programming skills
Experience with CPU optimization
Experience with system-level optimization
Experience with GPU optimization (CUDA)
Experience with compiler technology
Responsibilities
Implement highly efficient sensor data processing algorithms.
Optimize existing CPU code.
Write CUDA kernels to speed up specific operations.
Optimize end-to-end system latency.
Collaborate with ML practitioners to understand their input-processing needs.
Education and Experience
B.Sc in Computer Science, Mathematics or a related field
5+ years of industry experience
M.Sc or PhD in Computer Science, Mathematics or a related field (preferred)