Glyphic Biotechnologies is seeking a Senior/Staff Data Scientist to help advance its cutting-edge single-molecule proteome sequencing platform, which has the potential to transform how we understand biology and develop new medicines.
Requirements
- PhD in Computer Science, Bioinformatics, Computational Biology, Biostatistics, or a related field, with 4+ (Senior) or 6+ (Staff) years of hands-on experience.
- Proven ability to model and interpret high-dimensional datasets with numerous interacting variables, uncovering statistically robust patterns and causal relationships.
- Competency in chemistry data science (e.g., interpreting LCMS data, utilizing deconvolution tools, understanding surface chemistry and molecule-target interactions).
- Competency in next-generation sequencing, including familiarity with multi-omics, error modeling, and basecalling.
- Expertise in Python and/or R for biostatistical analysis, including data wrangling, statistical modeling, and visualization of high-dimensional experimental results.
- Experience designing ML models for experimental data and deploying pipelines (Snakemake, Nextflow).
- Familiarity with ML frameworks (PyTorch, TensorFlow) and data science libraries (pandas, numpy, scipy).
Responsibilities
- Design and implement novel algorithms to analyze proteomics data that no one has ever seen before.
- Develop machine learning models that can extract meaningful insights from complex, noisy biological signals.
- Develop and optimize algorithms for analyzing high-dimensional chemistry and NGS data, including single-cell, spatial, and LCMS data outputs.
- Build models that reveal how parameters and molecular interfaces drive outcomes, including surface interactions and molecule-target binding.
- Design and execute biostatistical analyses using Python and/or R to uncover significant trends, model experimental outcomes, and inform data-driven decision-making.
- Apply machine learning to guide experiment design, identify key parameters, and optimize workflows for efficiency and reproducibility.
- Create ETL pipelines that clean, normalize, and integrate diverse datasets (sequencing reads, LCMS spectra, metadata) into analysis-ready formats.
Other
- Excellent interpersonal skills – capable of building strong relationships and communicating effectively with stakeholders at all levels.
- High emotional and analytical intelligence – able to navigate complex team dynamics, partnerships, and challenges with creativity and logic.
- Resourceful adaptability – operates with urgency, remains flexible in evolving environments, and thrives in ambiguity.
- Collaborative spirit – enjoys working across disciplines and explaining complex concepts to diverse audiences.