Generate:Biomedicines uses Generative Biology to create breakthrough medicines, work that requires a strong cloud and DevOps foundation to support research at the intersection of machine learning and computational biology.
Requirements
- Foundations in Kubernetes and Linux operations: experience working with Kubernetes in production or production-adjacent environments, and familiarity with troubleshooting networking or performance issues, upgrades, or migrations.
- Exposure to cloud and networking concepts: hands-on experience with at least one major cloud provider, and an interest in learning more complex networking and hybrid connectivity patterns over time, including in GPU-focused environments and on AWS.
- An automation-oriented mindset: experience improving workflows, reducing manual effort, or introducing guardrails, along with curiosity about modern DevOps practices that help platform systems scale beyond individual contributors.
- Interest in observability and cost awareness: some experience with dashboards, alerting, tracing, or telemetry, and a willingness to learn how to use signals to improve system behavior, performance, and cost efficiency.
- Experience with containers and delivery practices: experience building or modifying container images and deployment workflows, and an interest in learning best practices around image hardening, reproducibility, and reliable delivery.
Responsibilities
- Build and evolve our Kubernetes and compute platform: operate and improve shared clusters and associated tooling that support internal services, CI runners, and compute-heavy research workflows.
- Drive automation-first DevOps: work with partner teams to reduce manual operations by improving deployment patterns, self-service capabilities, and operational guardrails, enabling teams to ship and run reliably with fewer one-off interventions.
- Improve observability and performance through practical signals: design and iterate on dashboards, alerting, and instrumentation practices, including performance tuning loops, that help teams understand workload behavior, detect issues early, and make informed tradeoffs around efficiency and cost.
- Strengthen infrastructure governance and delivery systems collaboratively: contribute to well-structured IaC workflows and change management practices, such as Terraform with review and apply processes, and help improve CI/CD reliability so infrastructure and application changes are safe, auditable, and timely.
- Be a trusted teammate on a small platform group: collaborate closely through pairing, reviews, documentation, and shared ownership, and help build durable operational readiness through runbooks, training, and clearly defined operational standards.
Other
- 5+ years of relevant engineering experience
- Collaborative approach and focus on delivering outcomes
- Ability to work onsite at our Somerville, MA location at least 2 days per week
- Commitment to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status
- Strong technical judgment and ability to assess complex systems, make thoughtful decisions, and partner with others to execute effectively