Machine Learning Engineer - Training Performance

Wayve

Salary not specified

Sep 19, 2025

Sunnyvale, CA, US

Wayve is seeking skilled engineers to join its Training Tech team and optimize large-scale training jobs. The goals are to scale Wayve's models by the next order of magnitude and to make training jobs more efficient, so that Wayve can train larger models faster.

Requirements

Experience optimizing large-scale training jobs on GPU compute clusters.
Experience working in platform teams and collaborating with research teams.
Experience tracking benchmarked performance over time and reporting it in an open and accessible way.
Ability to write high-quality, well-structured, and tested Python code.
Solid experience with concurrent, parallel, and distributed computing.
Experience using NVIDIA Nsight Systems.
Experience implementing GPU kernels.

Responsibilities

Profile training jobs to identify their bottlenecks, e.g. using NVIDIA Nsight Systems.
Design and implement efficiency improvements to maximise MFU (Model FLOPs Utilization), e.g. tensor parallelism, model compilation, mixed precision.
Design and implement observability tools, e.g. to track MFU.
Collaborate closely with Research teams to integrate training efficiency improvements and create a culture of performance optimization.
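
For readers unfamiliar with the MFU metric named above, the following is a minimal sketch of how it is commonly estimated for dense transformer training. It is not taken from the posting and does not describe Wayve's tooling. It assumes the widely used approximation of about 6 FLOPs per parameter per token (forward plus backward pass, ignoring attention FLOPs). The function name and all numbers in the usage example are hypothetical.

```python
def model_flops_utilization(
    n_params: float,
    tokens_per_second: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """Estimate MFU as achieved model FLOPs/s divided by peak hardware FLOPs/s.

    Assumption: a dense transformer costs roughly 6 * n_params FLOPs per
    training token (forward + backward), with attention FLOPs ignored.
    """
    if n_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("n_gpus and peak_flops_per_gpu must be positive")
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    peak_flops_per_second = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical example: a 1B-parameter model processing 200k tokens/s
# across 8 GPUs, each with an assumed peak of 312 TFLOP/s.
mfu = model_flops_utilization(
    n_params=1e9,
    tokens_per_second=200_000,
    n_gpus=8,
    peak_flops_per_gpu=312e12,
)
print(f"Estimated MFU: {mfu:.1%}")
```

An observability tool of the kind described above might log this value per training step, so that regressions in throughput show up as a drop in MFU over time.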

Other

BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience.
Full-time role based in our office in Sunnyvale.
Hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home.
We operate core working hours so you can determine the schedule that works best for you and your team.