ML Runtime Optimization Engineer

$159,053 - $199,295

Oct 28, 2025

Mountain View, CA, US

Applied Intuition is looking to optimize ML model performance on production-grade embedded runtime environments.

3+ years of experience with ML accelerators, GPU, CPU, SoC architecture and micro-architecture
Strong software development skills with a focus on embedded programming
Experience profiling and optimizing model performance on embedded compute platforms
Experience working with deep learning frameworks (e.g., PyTorch, JAX, ONNX)
M.Sc. or PhD in an ML-related area
Experience building an ML optimization framework from scratch
Experience deploying ML solutions to embedded chips for real-time robotics applications

Drive ML performance optimization on multiple technologies for on-road and off-road ADAS / AD stacks targeting deployment on a variety of embedded compute platforms
Develop compute usage strategies to optimize efficiency and latency of model inference for compute boards selected by our customers
Work on model pruning and quantization, and support deployment on memory-constrained platforms
Collaborate closely with ML engineers and software developers on technical efforts to find and optimize efficient model architecture solutions
Set up methodologies to profile the model performance on target embedded compute platforms and identify performance bottlenecks as part of stack integration

Bachelor's degree in Electrical Engineering, Computer Science, Mathematics, Physics, or a related field
In-office work 5 days a week
401k retirement benefits with employer match, learning and wellness stipends, and paid time off