Job Board

Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1000+ jobs and find postings that match your resume

AI Engineer, ML Inference Optimization, Autonomy & Robotics

Tesla

$124,000 - $420,000
Nov 7, 2025
Palo Alto, CA, United States of America

Tesla’s AI team is pushing the frontier of real-world machine learning, building models that reason, predict, and act with human-level physical intelligence. To support this work, the company needs to design and optimize models that run efficiently across Tesla’s diverse compute stack.

Requirements

  • Proven experience in scaling and optimizing inference for large ML models, particularly transformers or similar architectures
  • Familiarity with quantization-aware training, model compression, and distillation for edge and real-time inference
  • Proficiency with Python and modern C++ (C++14/17/20) and deep learning frameworks such as PyTorch, TensorFlow, or JAX
  • Strong understanding of computer systems and architecture, with experience deploying ML models on GPUs, TPUs, or NPUs
  • Hands-on expertise with CUDA programming, low-level performance profiling, and compiler-level optimization (TensorRT, TVM, XLA)
  • Experience collaborating with compiler/hardware engineers to bridge model and system-level optimization
  • Excellent problem-solving skills and the ability to debug and tune high-performance inference workloads

Responsibilities

  • Design, train, and deploy large neural networks that run efficiently on heterogeneous hardware (GPU, CPU, Tesla’s in-house AI ASIC)
  • Develop and integrate quantization, sparsity, pruning, and distillation techniques to improve inference performance
  • Design inference algorithms that improve performance with respect to quantization and latency
  • Profile and improve latency, throughput, and memory efficiency for large ML models across edge and cloud environments
  • Collaborate with compiler and hardware engineers to co-design architectures for efficient real-time inference
  • Design and implement custom GPU kernels (CUDA / OpenCL) to accelerate model operations and post-processing pipelines
  • Conduct systematic benchmarking, scaling, and validation of inference performance across Tesla platforms

Other

  • Bachelor's, Master's, or Ph.D. degree in Computer Science or related field
  • Travel may be required
  • Must be eligible to work in the United States
  • Excellent communication and collaboration skills
  • Ability to work in a fast-paced environment