Senior Data Engineer

BioSpace

$180,000 - $230,000
Sep 14, 2025
New York, NY, US

Formation Bio is looking to build the semantic layer that makes diverse data pillars interoperable, consistent, and actionable to accelerate drug development and clinical trials.

Requirements

  • Strong SQL and data modeling skills, with proven experience designing semantic or analytical layers.
  • Experience working with both structured data (e.g., relational tables, APIs) and unstructured data (e.g., documents, free text, biomedical literature, healthcare notes).
  • Familiarity with healthcare/life sciences ontologies (SNOMED CT, ICD, RxNorm, LOINC, HL7 FHIR, OMOP, Mondo) and/or financial/commercial taxonomies.
  • Hands-on experience with Snowflake, dbt, Dagster, and modern data stacks.
  • Experience with unstructured data workflows (NLP, embeddings, semantic search, knowledge graphs).
  • Practical use of metadata management and data catalog platforms.
  • Hands-on experience structuring dbt projects with testing, quality checks, and reusable design patterns.

Responsibilities

  • Build and maintain SQL/dbt models that unify datasets across healthcare, commercial/pharma, biomedical, and finance domains, leveraging ontologies (e.g., SNOMED CT, ICD, RxNorm, HL7 FHIR, OMOP).
  • Design models that handle not only structured datasets but also unstructured data sources (e.g., documents, free text, biomedical literature), preparing them for AI-driven applications.
  • Own and evolve the semantic layer that transforms raw data into consistent, reusable models powering analytics and advanced AI.
  • Contribute to pipelines that bring in data from APIs, partner feeds, flat files, and unstructured text, ensuring inputs are reliable, well-documented, and metadata-rich.
  • Apply FAIR principles (Findable, Accessible, Interoperable, Reusable) so that data is traceable, interoperable, and reusable across structured and unstructured domains.
  • Partner with commercial, scientific, finance, and healthcare stakeholders to align semantic models with real-world use cases.
  • Document data standards and reusable modeling patterns to empower downstream teams and reduce cognitive load.

Other

  • 5+ years of experience as a Data Engineer or Analytics Engineer, or in a similar role, in healthcare, pharma, biotech, finance, or another highly regulated industry.
  • Deep expertise in at least one data domain (e.g., healthcare/EHR/claims, commercial/pharma, biomedical/scientific, or finance), with a track record of translating complex, domain-specific datasets into consistent and usable models.
  • Exposure to additional domains beyond your core area of expertise, and the ability to learn and adapt to new datasets quickly.
  • Understanding of regulatory and compliance considerations in healthcare, pharma, or finance.
  • Please only apply if you reside in these locations or are willing to relocate.