SandboxAQ is looking to advance the frontiers of drug and materials discovery by integrating physics-based simulations with cutting-edge AI.
Requirements
- Ph.D. in Computer Science, High-Performance Computing, or a related field
- 3–5 years of hands-on experience, preferably in the private sector, working on one or more of the following: probabilistic or causal modeling, large-scale graph algorithms, or graph neural networks
- Experience in processing and curating multimodal data, including large-scale omics, clinical datasets, and scientific literature
- Proficiency in running analyses and training machine learning or deep learning models in high-performance computing (HPC) environments, particularly those using GPUs
- Familiarity with advanced AI concepts, including: Generative AI (LLMs, Biological Foundation Models); Probabilistic Graphical Models (e.g., Bayesian Networks, Markov Networks, deep learning extensions); and causal inference (e.g., do-calculus, recent developments in causal discovery)
- Experience with cloud platforms such as Google Cloud Platform (GCP) or AWS for data storage and compute
- Working knowledge of graph databases and graph data structures
Responsibilities
- Develop robust, scalable software systems that enable large-scale causal reasoning
- Design and implement algorithms to advance understanding of causality in complex biological systems
- Apply advanced graph-based reasoning techniques—including Graph Neural Networks, Probabilistic Graphical Models, and LLMs—for querying and inference over large-scale causal biomedical knowledge graphs constructed from simulation, omics data, and literature
- Identify, ingest, and curate relevant data sources. Own data quality control, validation, and integration workflows
- Research and prototype novel bioinformatics and deep learning approaches to interpret human genetic variants, gene regulation mechanisms, gene expression dynamics, and disease pathways using diverse multimodal data
- Communicate complex ideas effectively across audiences, including internal collaborators, external stakeholders, and clients—tailoring technical depth as needed
- Contribute to the scientific community through patent filings, peer-reviewed publications, white papers, and conference presentations
Other
- Strong collaboration mindset, with the ability to identify problems and communicate technical concepts clearly to both technical and non-technical stakeholders
- Demonstrated ability to dive deep into technically complex problems and a track record of driving initiatives through to completion
- Willingness to travel up to 25% for conferences, customer engagements, team offsites, or internal meetings
- Basic understanding of molecular biology concepts, particularly the central dogma (DNA, RNA, protein), and related high-throughput technologies such as RNA-seq, epigenomics, single-cell and spatial omics
- Strong publication record in peer-reviewed venues (e.g., NeurIPS, ICML, ICLR, CVPR, ECCV, ICCV)