Senior Research Engineer - Inference ML

Cerebras Systems

Salary not specified
Dec 5, 2025
Sunnyvale, CA, US

Cerebras Systems aims to deliver industry-leading training and inference speeds for machine learning applications by adapting today's most advanced language and vision models to run efficiently on its flagship Cerebras architecture.

Requirements

  • Strong programming skills in Python and/or C++
  • Experience with Generative AI and Machine Learning systems
  • Proficiency with at least one major ML framework (PyTorch, Transformers, vLLM, or SGLang)
  • Deep understanding of transformer-based models in language and/or vision domains, with demonstrated experience implementing and optimizing them
  • Proven ability to implement custom layers, operators, and backpropagation logic
  • Strong foundation in performance optimization on specialized hardware (e.g., GPUs, TPUs, or HPC interconnects)
  • Experience with speculative decoding, neural network pruning and compression, sparse attention, quantization, sparsity, post-training techniques, and inference-focused evaluations

Responsibilities

  • Design, implement, and optimize state-of-the-art transformer architectures for NLP and computer vision on Cerebras hardware.
  • Research and prototype novel inference algorithms and model architectures that exploit the unique capabilities of Cerebras hardware, with emphasis on speculative decoding, pruning/compression, sparse attention, and sparsity.
  • Train models to convergence, perform hyperparameter sweeps, and analyze results to inform next steps.
  • Bring up new models on the Cerebras system, validate functional correctness, and troubleshoot any integration issues.
  • Profile and optimize model code using Cerebras tools to maximize throughput and minimize latency.
  • Develop diagnostic tooling or scripts to surface performance bottlenecks and guide optimization strategies for inference workloads.
  • Collaborate across teams, including software, hardware, and product, to drive projects from inception through delivery.

Other

  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Electrical Engineering, or a related technical field, and 7+ years of ML software development experience
  • 4+ years of experience testing, maintaining, or launching software products, including 2+ years of experience with software design and architecture
  • 3+ years of experience in software development focused on machine learning (e.g., deep learning, large language models, or computer vision)
  • Collaborative approach with humility, eagerness to help colleagues, and commitment to team success
  • Hybrid role in Toronto, ON, CA or Sunnyvale, CA, USA