FL117, a venture-backed stealth AI x bio startup, is seeking a highly skilled ML Scientist to lead the development of next-gen multi-modal generative models for biomolecular design, including their architectures and training pipelines, pushing the frontier of AI-powered drug discovery.
Requirements
- Experience training large-scale generative (transformer/diffusion) models on HPC or multi-GPU, multi-node distributed systems
- Proficiency in Python and relevant ML libraries (PyTorch or TensorFlow)
- Knowledge of biological data formats (FASTA, SDF, PDBx/mmCIF) and domain-specific ontologies/metadata standards
- Experience with models like ESM, DiffDock, ProteinMPNN, OpenFold, etc.
- Familiarity with ML software development and cloud services (Docker, AWS Batch, Step Functions, EKS, AWS ParallelCluster, etc.)
- Experience with libraries and tools such as RDKit, Rosetta, ProDy, Biopython, PyMOL, OpenMM, Dagster, and Prefect
Responsibilities
- Spearhead the design and development of cutting-edge deep learning models for generative biomolecular design
- Design novel representations and tokenization schemes for biomolecules to enable more efficient transfer learning
- Establish and own a robust research infrastructure
Qualifications
- MSc or PhD in Computer Science, AI/ML, Physics, Biophysics, or a related field
- 2+ years of internship/full-time industry or postdoctoral ML research experience
- 2+ years of experience developing production-grade machine learning solutions
- Proven ability to lead technical initiatives, communicate progress, and drive innovation in generative biomolecular design
- Contributions to open-source data projects in biology/chemistry. Please send us your GitHub profile, if applicable.
- Strong presence in the ML community through publications and conference proceedings. Please send a list of publications, if applicable.