Product Manager - AI Cluster Management Software

Cerebras Systems

Salary not specified
Aug 26, 2025
Sunnyvale, CA, US

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

The Cerebras AI cluster management software is a strategic product initiative intended to bring Cerebras's high-performance AI benefits to on-premises customers and sovereign/neo clouds. It is designed to simplify deployment and maintenance for platform operators, making it easier to manage complex AI infrastructure at scale.

Requirements

  • 5+ years of product management experience, preferably in infrastructure observability and security domains.
  • Expert knowledge of security tools such as IAM, IDP, SIEM, and key management systems.
  • Expert knowledge of observability solutions such as Prometheus, Grafana, log management systems, and observability management systems.
  • Familiarity with cluster orchestration tools and concepts (e.g., Kubernetes).
  • Strong ability to think at the API and platform layers, designing solutions for operator workflows.
  • Technical background (e.g., computer science, engineering) or the ability to engage deeply with engineering teams.
  • Experience in enterprise software, cloud infrastructure, or AI/ML platforms.

Responsibilities

  • Define and deliver a world-class cluster management experience with a focus on observability, management, monitoring, and security.
  • Collaborate with engineering to design reliable, scalable solutions and APIs tailored to cluster operator workflows.
  • Develop a deep understanding of cluster operator needs through user and market research.
  • Communicate product updates and roadmap progress clearly to internal and external stakeholders.

Other

  • Excellent communication and collaboration skills, with the ability to work effectively across diverse teams.
  • Proven ability to excel in a fast-paced, dynamic environment.
  • Understanding of security and authentication principles in software systems.
  • Familiarity with monitoring, telemetry, and fault tolerance in distributed systems.