Senior Research Engineer - Inference ML

Cerebras Systems

Salary not specified

Dec 5, 2025

Sunnyvale, CA, US

Cerebras Systems aims to deliver industry-leading training and inference speeds for machine learning applications by adapting today's most advanced language and vision models to run efficiently on its flagship Cerebras architecture.

Requirements

Strong programming skills in Python and/or C++
Experience with Generative AI and Machine Learning systems
Proficiency with at least one major ML framework (PyTorch, Transformers, vLLM, or SGLang)
Deep understanding of transformer-based models in language and/or vision domains, with demonstrated experience implementing and optimizing them
Proven ability to implement custom layers, operators, and backpropagation logic
Strong foundation in performance optimization on specialized hardware (e.g., GPUs, TPUs, or HPC interconnects)
Experience with speculative decoding, neural network pruning and compression, sparse attention, quantization, sparsity, post-training techniques, and inference-focused evaluations

Responsibilities

Design, implement, and optimize state-of-the-art transformer architectures for NLP and computer vision on Cerebras hardware.
Research and prototype novel inference algorithms and model architectures that exploit the unique capabilities of Cerebras hardware, with emphasis on speculative decoding, pruning/compression, sparse attention, and sparsity.
Train models to convergence, perform hyperparameter sweeps, and analyze results to inform next steps.
Bring up new models on the Cerebras system, validate functional correctness, and troubleshoot any integration issues.
Profile and optimize model code using Cerebras tools to maximize throughput and minimize latency.
Develop diagnostic tooling or scripts to surface performance bottlenecks and guide optimization strategies for inference workloads.
Collaborate across teams, including software, hardware, and product, to drive projects from inception through delivery.

Other

Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Electrical Engineering, or a related technical field AND 7+ years of ML software development experience
4+ years of experience testing, maintaining, or launching software products, including 2+ years of experience with software design and architecture
3+ years of experience in software development focused on machine learning (e.g., deep learning, large language models, or computer vision)
Collaborative approach with humility, eagerness to help colleagues, and commitment to team success
Hybrid role in Toronto, ON, CA or Sunnyvale, CA, USA