Altos Labs aims to restore cell health and resilience through cell rejuvenation in order to reverse disease, injury, and disability. The Data Scientist will contribute to this mission by exploring statistical models, characterizing data and model behavior, and generating insights into biological mechanisms and hypothesis validation in cell biology and genomics.
Requirements
- Working knowledge of cell biology and experience in large-scale data analysis and statistical modeling on datasets such as RNA-seq, ATAC-seq, protein networks, and pathways.
- Broad and deep expertise in statistical analysis, machine learning, data visualization, programming (Python, R, etc.), data cleaning, and data manipulation.
- Tools: Python, R, SQL, TensorFlow, Scikit-learn, Tableau, Power BI.
- Strong programming experience and comfort modifying an existing codebase. Experience with data cleaning and data manipulation in Python, R, or other related scientific languages.
- Strong and demonstrable experience working in an AWS compute environment is a major advantage.
- Experience integrating prior knowledge from public databases (e.g., KEGG) into -omics data analysis pipelines.
Responsibilities
- Generate insights and models from multi-omics datasets (using public and internal data) to understand patterns, trends and relationships within data to inform decision-making and solve problems.
- Design, develop, and program methods, processes, and systems to extract, consolidate, and analyze diverse, unstructured “big data” sources and generate actionable insights that inform hypotheses.
- Build databases and take responsibility for data curation, including experimental data management.
- Extract knowledge, insights, and predictions from data using Bayesian statistical methods, machine learning, and data visualization.
- Work with scientists to identify optimal ways to prepare, annotate, store and navigate their datasets, including data application design and improvement.
- Define and document best practices for capturing and entering experimental metadata, and educate scientists and collaborators about these standards.
- Build pipelines for quality control, processing and analysis of raw targeted and untargeted datasets.
Other
- PhD in an interdisciplinary quantitative science such as Biology, Chemistry, Computer Science, or Physics.
- 2+ years of relevant work experience in either an academic or industry setting.
- Proven track record of completed scientific projects as evidenced by publications and preprints.
- Ability to generate high-quality ideas and the self-drive to explore them.
- Willingness to work in a collaborative environment and share periodic updates across the company.