ML Integration & Operations Software Engineer

Cerebras Systems

Salary not specified
Sep 17, 2025
Sunnyvale, CA, US

Cerebras Systems builds the world's largest AI chip, aiming to provide industry-leading training and inference speeds and empower machine learning users to effortlessly run large-scale ML applications without the hassle of managing hundreds of GPUs or TPUs.

Requirements

  • 2+ years of experience in software integration, debugging, or quality engineering.
  • Strong programming and automation skills in Python, C++, Go, or similar languages.
  • Experience testing compute, machine learning, networking, or storage systems in large-scale environments.
  • Solid understanding of system architecture (compute, networking, storage) and ML workloads.
  • Proven ability to break down complex issues into root causes and scalable solutions across distributed or complex systems.
  • Ability to understand complex systems and design comprehensive, effective test plans.
  • Experience with ML workloads such as LLM or multimodal model training and inference.

Responsibilities

  • Debug and resolve complex integration issues across the Cerebras AI platform, spanning ML, compiler, runtime, and hardware layers.
  • Develop and deploy AI-enhanced debugging and validation tools to accelerate issue identification and resolution.
  • Automate test generation, data capture, and diagnostics using scripting and intelligent systems.
  • Create and execute robust validation plans for LLM and multimodal workloads in production-scale environments.
  • Identify edge cases, stress failure modes, and proactively improve system resilience.
  • Design and maintain CI/CD pipelines to ensure continuous integration, fast feedback, and early detection of regressions.
  • Contribute to continuous improvement by implementing quality metrics, automation pipelines, and actionable insights.

Other

  • This role follows a hybrid schedule, requiring in-office presence 3 days per week.
  • Strong collaboration and communication skills across cross-functional teams.
  • Experience collaborating with globally distributed teams across time zones.
  • People who are serious about software make their own hardware.
  • A simple, non-corporate work culture that respects individual beliefs.