Today’s data platforms are built on top of tools made for spreadsheet-like analytics, not the petabytes of multimodal data that power AI. As a result, teams waste months on brittle infrastructure instead of conducting research and building their core product. Eventual was founded to make querying any kind of data (images, video, audio, text) as intuitive as working with tables, and powerful enough to scale to production workloads.
Requirements
- 3+ years of experience working on complex infrastructure projects, ideally involving GPUs.
- Experience supporting ML/AI workloads.
- Experience optimising GPU utilisation and job placement with schedulers and scheduler extensions, such as kube-scheduler plugins, Slurm, and Volcano.
- Familiarity and experience with cloud technologies (e.g. AWS S3).
- Experience taking a product from zero to production.
Responsibilities
- Design and build highly reliable and resilient products and features.
- Work closely with cross-functional product and customer-facing teams to understand requirements and ship thoughtful solutions.
- Write high-quality, extensible, and maintainable code.
- Design and build scalable applications and components.
- Architect and operate Kubernetes clusters optimised for GPU workloads.
- Build multi-tenant and dedicated environments for GPU workloads.
Other
- Please note we're looking for individuals who are excited to be part of a tight-knit team working together 4 days a week in our office in San Francisco's Mission District.
- You will work with a team that values open communication and cross-functional collaboration.
- While we are an experienced team that can provide constant guidance and mentorship, we value engineers who can autonomously scope and solve difficult technical challenges.
- A startup growth mindset: responsibly taking on tech debt for velocity while building for future extensibility.
- Most importantly, we are looking for someone who works well in small, focused teams with fast iterations and lots of autonomy.