Lila Sciences is seeking a Machine Learning Operations Engineer to unify data management by building and maintaining high-performance data pipelines that support machine learning use cases.
Requirements
- Proficiency with Kubernetes, Docker, and cloud platforms (AWS preferred)
- Proficiency with CI/CD tools and frameworks (GitHub Actions preferred)
- Strong skills with scripting languages (e.g., Python, Bash), version control (Git), and Linux
- Proficiency with scalable data frameworks (Spark, Kafka, Flink)
- Proven expertise with infrastructure as code and cloud best practices
- Proficiency with monitoring and logging tools (e.g., Prometheus, Grafana)
- Experience managing on-premises Kubernetes environments (e.g., Rancher)
Responsibilities
- Design and implement high-performance data processing infrastructure for large language model training
- Collaborate with researchers to implement novel data processing pipelines
- Develop an easy-to-use, secure, and robust developer experience for researchers and engineers
- Contribute to MLOps best practices at Lila Sciences and write technical documentation for staff
Other
- 3+ years of experience in software engineering, with a focus on data engineering or DevOps
- Demonstrated experience deploying and maintaining machine learning models in production
- Proven experience working in cross-functional teams and the ability to communicate effectively about technical and operational challenges
- Inclusive mindset and openness to diversity of thought
- Passion for working in unstructured and creative environments
- Ability to work in Cambridge, MA (preferred) or remotely