This role advances the frontiers of biological foundation models to understand and treat complex human diseases, with a specific focus on developing cell biological models and DNA foundation models.
Requirements
- Deep understanding of ML principles, enabling you to design, modify, and critically evaluate model architectures, not just apply existing ones.
- Significant experience in training large deep learning models.
- Well-versed in machine learning frameworks such as PyTorch or JAX.
- Experience developing with distributed training tools such as FSDP, DeepSpeed, or Megatron-LM.
- Ability to communicate and collaborate successfully with biologists and software/infrastructure engineers.
Responsibilities
- Actively participate in the design, implementation, and refinement of state-of-the-art foundation models for understanding and designing complex biological systems, in collaboration with other ML researchers and scientists at Arc.
- Engineer large-scale distributed model pretraining and pipelines for efficient model inference.
- Enable robust systematic evaluation of trained models.
- Stay up to date with the latest advances in large-scale sequence modeling and alignment, and implement the most promising strategies so that the underlying models remain state-of-the-art.
- Work with experimental biologists to ensure that the developed models are grounded in biologically meaningful problems and evaluations.
- Publish findings through journal publications, white papers, and presentations (both internal to Arc and external).
- Foster internal and external collaborations centered on generative design of biological systems at Arc Institute.
Other
- You are excited about working closely with a multidisciplinary team of computational and experimental biologists at Arc to achieve breakthrough capabilities in biological prediction and design tasks.
- You are a strong communicator, capable of translating complex technical concepts to researchers outside of your domain.
- You are a continuous learner and are enthusiastic about developing and evaluating models that impact many biological disciplines.
- You have excellent written and verbal communication skills, with a strong track record of presentations and publications.
- You are motivated to work in a fast-paced, ambitious, multidisciplinary, and highly collaborative research environment.