Elicit is an AI research assistant that uses language models to help professional researchers and high-stakes decision makers break down hard questions, gather evidence from scientific and academic sources, and reason through uncertainty. Elicit aims to radically increase the amount of good reasoning in the world by building a scalable ML system based on human-understandable task decompositions.
Requirements
- Strong proficiency in Python (5+ years experience)
- Experience architecting and optimizing large data pipelines, ideally with Spark (see the PySpark sketch after this list)
- Strong SQL skills, including understanding of aggregation functions, window functions, UDFs, self-joins, partitioning, and clustering approaches
- Experience with columnar data storage formats like Parquet
- Experience with distributed computing frameworks beyond Spark (e.g., Dask, Ray)
- Hands-on experience with industry-standard tools like Airflow, dbt, or Hadoop
- Hands-on experience with standard paradigms like data lake, data warehouse, or lakehouse
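To give a flavor of how several of these fit together, here is a minimal PySpark sketch: reading a Parquet corpus and ranking rows per group with a window function. This is an illustration, not our actual codebase; the path and column names (papers.parquet, journal, published_at) are hypothetical.

```python
# Minimal sketch: read a Parquet corpus and apply a window function in PySpark.
# The path and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("papers-example").getOrCreate()

# Parquet is a columnar format, so reads can prune to the columns they need.
papers = spark.read.parquet("s3://example-bucket/papers.parquet")

# Rank papers within each journal by recency (partitioning + window function),
# then keep the five most recent per journal.
w = Window.partitionBy("journal").orderBy(F.desc("published_at"))
latest = (
    papers
    .withColumn("rank", F.row_number().over(w))
    .filter(F.col("rank") <= 5)
)

latest.show()
```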
Responsibilities
- Build a complete corpus of academic papers and clinical trials, available as soon as they're published, combining different data sources and ingestion methods (a minimal ingestion sketch follows this list).
- Figure out how best to ingest massive amounts of heterogeneous data so that LLMs can use it.
- Integrate with our customers' custom data providers so that they can create task-specific workflows over them.
- Architect and implement robust, scalable solutions to handle our growing data needs while maintaining high performance and data quality.
- Build and optimize our academic research paper pipeline
- Expand the datasets Elicit works over
- Prepare and manage data for our ML systems
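As a sketch of what one slice of this pipeline work could look like, here is a minimal Airflow DAG (assuming Airflow 2.4+) that fetches newly published papers and loads them into a corpus store on a daily schedule. The task functions and DAG id are hypothetical placeholders, not a description of our actual stack.

```python
# Minimal sketch of a daily paper-ingestion DAG (assumes Airflow 2.4+).
# fetch_new_papers and load_into_corpus are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def fetch_new_papers(**context):
    # Hypothetical: pull newly published papers/trials from an upstream source.
    ...

def load_into_corpus(**context):
    # Hypothetical: normalize records and write them into the corpus store.
    ...

with DAG(
    dag_id="ingest_new_papers",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    fetch = PythonOperator(task_id="fetch_new_papers", python_callable=fetch_new_papers)
    load = PythonOperator(task_id="load_into_corpus", python_callable=load_into_corpus)
    fetch >> load  # fetch runs first, then the load step
```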
Other
- 5+ years of experience as a data engineer: owning make-or-break decisions about how to ingest, manage, and use data
- You have created and owned a data platform at a rapidly growing startup: gathering needs from colleagues, planning an architecture, deploying the infrastructure, and implementing the tooling
- Strong opinions, weakly held, about approaches to data quality management
- Creative and user-centric problem-solving
- You should be excited to play a key role in shipping new features to users—not just building out a data platform!
- Spend about 1 week out of every 6 working in person with teammates
- Flexible work environment: work from our office in Oakland or remotely with time zone overlap (between GMT and GMT-8), as long as you can travel for in-person retreats and coworking events