Axle is seeking a Data Engineer to support biomedical science and data analysis projects at the National Cancer Institute (NCI) by building and optimizing data pipelines within the Palantir Foundry environment.
Requirements
- Strong proficiency in Python and SQL for data manipulation and scripting
- Hands-on experience building ETL processes or data pipelines to handle large datasets
- Familiarity with big data processing (e.g., Spark/PySpark) for scalable data transformations
- Knowledge of software engineering and data engineering best practices: version control (Git), code review, testing, and documentation
- Hands-on experience with Palantir Foundry is a strong plus
- Familiarity with Foundry components such as Ontology, Code Workbooks, Functions, Foundry Pipelines (Pipeline Builder), Foundry Dashboarding, or Object Builder
Responsibilities
- Design, build, and maintain data pipelines in Palantir Foundry to ingest, transform, and integrate diverse biomedical data sources
- Develop transformations and workflows using Foundry’s tools to prepare high-quality data for researchers
- Define and manage the Foundry Ontology and object models to represent biomedical entities and relationships
- Implement data validation checks and follow best practices for data governance
- Create or support interactive Foundry dashboarding solutions for researchers to visualize and explore data
- Integrate Foundry data with external applications or public-facing web pages using Foundry’s APIs or tools
Qualifications
- Bachelor’s degree in Computer Science, Data Science, Bioinformatics, or a related field (or equivalent practical experience)
- Proven experience as a data engineer or in a similar data-intensive role, preferably supporting analytics or research teams
- Excellent problem-solving skills and the ability to communicate effectively with both technical and non-technical stakeholders
- Comfortable working in an interdisciplinary environment with biomedical researchers, and capable of translating domain needs into technical solutions
- Interest in biomedical science and healthcare data
- Ability to quickly learn domain-specific concepts and handle sensitive research data in compliance with regulatory or privacy requirements