Valo Health is looking to accelerate the discovery and development of new medicines by leveraging real-world healthcare data and AI. The Staff Data Scientist will help answer research questions using large real-world healthcare databases to inform the identification of biological molecules for effective drug development.
Requirements
- Must have 3+ years of experience developing and executing robust analytical strategies, including cohort and case-control study design, using health care databases such as electronic health records, administrative claims databases, and/or patient registries.
- Experience leading epidemiologic projects end to end: translating research questions into observational study designs, contrasting the strengths and weaknesses of different study designs and statistical approaches, and generating patient-centric insights from statistical models.
- Extensive experience with causal approaches applied to observational studies, including propensity score methods, bias adjustment, and covariate selection and adjustment.
- Advanced knowledge of biostatistical approaches, including inferential and predictive modeling, and comfort implementing unsupervised machine learning algorithms in real-world health care databases.
- Must have experience conducting data manipulation and statistical analysis in Python and/or R programming languages.
- Experience developing and maintaining machine learning pipelines and translating machine learning output into meaningful insights for diverse audiences is a plus.
- Hands-on experience curating structured health data and working with health data from outside the U.S.
Responsibilities
- As a senior member of our cardiometabolic team, you will lead real-world data studies (e.g., using electronic medical records) end to end to generate causal evidence for projects in drug discovery and development.
- Translate research questions into observational study designs to generate patient-centric insights from statistical models.
- Curate clinical and non-clinical variables for machine learning models.
- Execute trajectory modeling techniques using real-world data.
- Translate machine learning results into patient profiles.
- Execute post-hoc longitudinal analyses among patient profiles of interest.
- Work with a diverse array of data spanning electronic medical records, sequencing, multi-omics data, and other data modalities using R and Python in cloud environments.
Other
- Be comfortable with scientific uncertainty and embrace curiosity and creative solutions.
- Use your technical knowledge and intuition to articulate and break down large problems into solvable pieces.
- Collaborate with drug discovery and clinical development teams to help ensure the relevance and impact of the insights generated by you and your teammates.
- Be a dynamic and active team member: champion and adopt shared coding standards, participate in code review, and provide regular updates on your work and input into the work of your colleagues.