Machine Learning Engineer - Training Performance

Wayve

Salary not specified
Sep 19, 2025
Sunnyvale, CA, US
Wayve is seeking skilled engineers to join its Training Tech team and optimize large-scale training jobs. The goal is to scale Wayve's models by the next order of magnitude and to make training jobs more efficient, so that Wayve can train larger models faster.

Requirements

  • Experience optimizing large-scale training jobs on GPU compute clusters.
  • Experience working in platform teams and collaborating with research teams.
  • Experience reporting and tracking benchmarked performance over time in an open and accessible way.
  • Ability to write high-quality, well-structured and tested Python code.
  • Solid experience working with concurrent, parallel and distributed computing.
  • Experience using NVIDIA Nsight Systems.
  • Experience implementing GPU kernels.

Responsibilities

  • Profile training jobs to identify their bottlenecks, e.g. using NVIDIA Nsight Systems.
  • Design and implement efficiency improvements to maximise MFU (Model FLOPs Utilization), e.g. tensor parallelism, model compilation, mixed precision.
  • Design and implement observability tools, e.g. to track MFU.
  • Collaborate closely with Research teams to integrate training efficiency improvements and create a culture of performance optimization.
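
For readers unfamiliar with the MFU metric mentioned above, the following is a minimal illustrative sketch of how it is commonly estimated. It is not Wayve's tooling. It assumes a dense transformer-style model, for which training cost is often approximated as 6 FLOPs per parameter per token (forward plus backward pass, ignoring attention terms). The function name, its parameters, and the example numbers are all hypothetical.

```python
def model_flops_utilization(
    n_params: float,
    tokens_per_step: float,
    step_time_s: float,
    n_gpus: int,
    peak_flops_per_gpu: float,
) -> float:
    """Estimate MFU as achieved training FLOP/s divided by peak hardware FLOP/s.

    Assumption: training FLOPs per step ~= 6 * n_params * tokens_per_step,
    a common approximation for dense transformers. Other architectures
    need their own FLOP count.
    """
    if step_time_s <= 0 or n_gpus <= 0 or peak_flops_per_gpu <= 0:
        raise ValueError("step_time_s, n_gpus and peak_flops_per_gpu must be positive")

    flops_per_step = 6.0 * n_params * tokens_per_step  # approximate model FLOPs
    achieved_flops_per_s = flops_per_step / step_time_s
    peak_flops_per_s = n_gpus * peak_flops_per_gpu     # theoretical cluster peak
    return achieved_flops_per_s / peak_flops_per_s


if __name__ == "__main__":
    # Hypothetical numbers: a 1B-parameter model, 1M tokens per step,
    # 2 s per step, 64 GPUs, and an assumed peak of 312 TFLOP/s per GPU.
    mfu = model_flops_utilization(1e9, 1e6, 2.0, 64, 312e12)
    print(f"MFU: {mfu:.1%}")
```

Logging this value per step alongside throughput is one simple way an observability tool could track MFU over time; the peak FLOP/s figure should come from the vendor datasheet for the precision actually used.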

Other

  • BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience.
  • Full-time role based in our office in Sunnyvale.
  • Hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home.
  • We operate core working hours, so you can determine the schedule that works best for you and your team.