The Arc Institute is seeking to advance the state of the art in generative AI applied to biology, including frontier DNA foundation models (Evo and Evo 2), by developing ML models for biological data and applying them to computational biology applications in genome mining, molecular technology development, and the invention of new therapeutic approaches.
Requirements
- Hands-on experience training and evaluating machine learning models on large datasets
- High competency with Python, bash, and standard deep learning frameworks such as PyTorch.
- Experience with Linux, git/GitHub, Docker, and Jupyter/RMarkdown notebooks.
- Strong understanding of modern deep learning, computational biology, and genetics
- Familiarity with the lab's recently reported work in generative genomics and biological design: genomic language modeling (Evo, Evo 2), protein language modeling, and the discovery of new genome engineering tools
Responsibilities
- Train and evaluate state-of-the-art machine learning models for molecular biology and genomics
- Develop methods for applying biological language models to the study of both eukaryotic and prokaryotic genome function and gene discovery
- Apply recent advances in mechanistic interpretability, such as sparse autoencoders, to the study of genomic language models, including Evo 2
- Develop, test, and maintain modular open-source software to accelerate adoption and application of genomic language models.
- Contribute directly to ongoing discovery projects in the lab in the fields of generative genomics, genome engineering, and therapeutics
- Effectively communicate analysis results to both experimental and computational scientists
Other
- Ph.D. in Computer Science, Bioinformatics, Computational Biology, or Genetics/Genomics, with 0-5 years of post-degree experience in industry or academia.
- Appreciation for how choices in experimental design affect the data analysis process.
- Enjoys working collaboratively and cross-functionally with experimental scientists.
- Passionate about developing machine learning models with real-world applications and scientific impact
- Excited about collaborating with a multidisciplinary team of experimental biologists and machine learning researchers at Arc