Septerna is seeking an experienced Data Scientist to drive drug discovery by applying advanced data science and machine learning techniques to hit identification and molecular property prediction on GPCR datasets.
Requirements
- Strong understanding of the principles and application of machine learning in drug discovery.
- Proven experience in building and deploying ML models for tasks such as chemical property prediction or hit identification.
- Proficiency in handling and analyzing chemical data using relevant tools and libraries (e.g., RDKit).
- Experience with various molecular representations (SMILES, fingerprints, graphs).
- Strong programming skills in Python and experience with relevant data science and machine learning libraries (e.g., scikit-learn, TensorFlow, PyTorch).
- Familiarity with databases and experience working with large datasets.
- Experience with cloud computing platforms for ML model training and data storage.
Responsibilities
- Apply machine learning techniques (e.g., deep learning, graph neural networks, ensemble methods) to hit identification and molecular property prediction.
- Work extensively with chemical data, including various molecular representations (e.g., SMILES, molecular graphs, fingerprints).
- Use machine learning libraries, platforms, and infrastructure to build, train, and evaluate models.
- Develop and implement robust data pipelines for preparing and analyzing chemical and biological data.
- Evaluate the performance of models using appropriate metrics and iterate to improve their accuracy and applicability.
- Stay current with the latest advancements in machine learning for drug discovery and recommend their adoption where appropriate.
- Collaborate closely with chemists, biologists, and other data scientists to understand project goals and translate them into data science solutions.
- Communicate findings and insights clearly and effectively to both technical and non-technical audiences.
Other
- Ph.D. in a quantitative field (e.g., Data Science, Computer Science, Chemistry) with 5+ years of relevant industry experience.
- Excellent communication, presentation, and collaboration skills.
- Experience working with DNA-encoded library (DEL) data and analysis.