Staff Software Engineer - ML Training and Inference Infrastructure

Rivian

$228,000 - $285,000
Sep 11, 2025
Palo Alto, CA, USA

Rivian is looking to establish a state-of-the-art ML infrastructure for training and inference of large autonomous driving models and to optimize their performance.

Requirements

  • Deep knowledge of PyTorch
  • Knowledge of a model training framework (e.g. PyTorch Lightning, Ray)
  • In-depth knowledge of transformer architecture and ways to accelerate the training and inference of transformer models
  • Experience performing large-scale distributed training of models
  • A track record of profiling models and doing detective work to improve model training and inference speed
  • Experience with CUDA or the Triton language for writing custom ops
  • Knowledge of NVIDIA TensorRT

Responsibilities

  • Optimize the performance of deep learning training workloads on NVIDIA GPU systems at large scale
  • Optimize the latency of model inference and model pre- and post-processing on onboard systems
  • Design, train, and deploy large deep learning models that can leverage the vast amount of labeled and unlabeled data

Other

  • PhD in CS/CE/EE, or equivalent industry experience
  • A track record of efficiently solving complex problems collaboratively on larger teams
  • Experience with edge computing systems