Waymo is looking to improve its autonomous driving technology by developing and implementing complex computations on domain-specific hardware architectures. Interns in this role will need to understand and manipulate multi-dimensional tensors and map operations onto unfamiliar hardware.
Requirements
- GPU, Pallas or other accelerator programming
- Computer architecture fundamentals
- Professional competency in C++ or Python
- Experience with compilers for CPU or GPU or ML accelerators
- Experience with NVIDIA CUTLASS, cuTile, Pallas, or Triton-like domain-specific languages
- Experience with MLIR, JAX, ONNX, StableHLO or similar compiler intermediate representations
Responsibilities
- Understand difficult-to-visualize manipulations and computations on 4- and 5-dimensional tensors, devise ways to express them cleanly, and implement them on domain-specific architectures
- Dive deep into the capabilities of unfamiliar hardware in order to map operations onto it
- Quickly create minimally featured software prototypes with light automated testing to explore a design space
Other
- Currently pursuing a PhD in a related field, or a Master's with significant industry experience
- This will be a hybrid onsite internship position.
- We will accept resumes on a rolling basis until the role is filled.
- To be considered for multiple roles, you will need to apply to each one individually. Please apply to the top 3 roles you are interested in.