Data Engineer, Knowledge Graphs

Mithrl

Salary not specified

Dec 30, 2025

San Francisco, CA, US

Mithrl is building the world's first commercially available AI Co-Scientist, a discovery engine that transforms messy biological data into insights in minutes. The Data Engineer, Knowledge Graphs role is crucial for building the infrastructure that powers Mithrl's biological knowledge layer, bridging biological knowledge ingestion with high-performance engineering systems.

Requirements

Strong experience as a data engineer or backend engineer working with data-intensive systems
Experience building ETL or ELT pipelines for large structured or semi-structured datasets
Strong understanding of database design, schema modeling, and data architecture
Experience with graph data models or willingness to learn graph storage concepts
Proficiency in Python or similar languages for data engineering
Experience designing and maintaining APIs for data access
Understanding of versioning, provenance, validation, and reproducibility in data systems

Responsibilities

Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
Design, implement, and evolve schemas and storage models for graph-structured biological data
Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
Partner closely with Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
Build data models that support multi-tenant access, versioning, and reproducibility across releases
Implement scalable storage and indexing strategies for high-volume graph data
Maintain data quality, validate data integrity, and build monitoring around ingestion and usage

Other

Strong communication skills and ability to work closely with scientific and engineering teams
Experience with cloud infrastructure and modern data stack tools
Experience with graph databases or graph query languages
Experience with biological or chemical data sources
Familiarity with ontologies, controlled vocabularies, and metadata standards