Insmed is looking to expand what's possible for patients with serious diseases by leveraging AI and computational biology for drug development, genomics, and rare disease research.
Requirements
- Proficiency in Python or R (experience with pandas, NumPy, scikit-learn, or Bioconductor).
- Familiarity with machine learning libraries (e.g., PyTorch, TensorFlow, or XGBoost).
- Experience with bioinformatics data formats, tools, and databases such as FASTA/FASTQ, GSEA, Ensembl, UniProt, or GEO.
- Data visualization skills with matplotlib, seaborn, ggplot2, or Plotly.
- Knowledge of cloud computing environments (GCP, AWS, or Azure).
- Experience using SQL or graph databases (Neo4j, RDF).
- Familiarity with Git/GitHub workflows.
Responsibilities
- Collect, clean, and harmonize multi-omics, clinical, or experimental datasets for downstream analysis and model training.
- Perform statistical and computational analyses on biological data (e.g., gene expression, proteomics, pathway enrichment, or molecular interaction networks).
- Work with AI scientists to prepare biologically meaningful datasets for machine learning workflows (feature engineering, embeddings, annotations).
- Assist in building or evaluating predictive models for target discovery, biomarker identification, or patient stratification.
- Implement and document reproducible pipelines for biological data processing using Python or R.
- Translate biological insights into computational terms and interpret model outputs back into biological context.
- Use NLP tools or APIs to extract biological or clinical insights from publications and databases.
Other
- Summer Intern - AI Computational Biologist
- Master’s or PhD degree in Computational Biology, Bioinformatics, Data Science, Computer Science, Biostatistics, Systems Biology, or a related field.
- Coursework or experience in molecular biology, genomics, or AI.
- Exposure to foundation models (LLMs, protein language models).