Machine Learning Engineer Intern, ML Runtime & Optimization (Spring 2026, Master/PhD)

Pony.ai

Salary not specified

Dec 18, 2025

Fremont, CA, US

Pony.ai is seeking to advance the training and inference of AI models in autonomous driving systems to achieve state-of-the-art performance and efficiency.

Requirements

Strong programming skills in C/C++ or Python.
Solid understanding of the CPU or GPU execution model (e.g. threads, registers, cache, memory) and of cost and performance trade-offs.
Experience in benchmarking, profiling and validating performance.
Experience with parallel programming: CUDA, ROCm, Triton, CUTLASS, etc.
Experience in computer vision, image processing, machine learning and deep learning.
Experience in model optimization techniques such as quantization, pruning, etc.
Experience in optimizing the utilization of compute resources, identifying and resolving compute and data flow bottlenecks.

Responsibilities

Perform in-depth analysis and optimization of model training and deployment to achieve state-of-the-art performance and efficiency in autonomous driving.
Work across the entire AI framework/compiler stack (e.g. Torch, CUDA and TensorRT), supporting model development and prototyping key deep learning algorithms.
Analyze the tradeoffs between performance, cost and energy for autonomous driving.
Collaborate closely with diverse groups at Pony.ai to influence the next-generation compute platform HW and SW design.
Research the latest model architectures, programming models and hardware.

Other

Currently pursuing a Master's or PhD degree in a related discipline.
Strong communication skills and ability to work cross-functionally between software and hardware teams.
This position is fully onsite in Fremont for at least 3 months.