Backflip is looking for a Senior Data Engineer to design, build, and maintain the systems that power our data workflows, with a focus on making 3D CAD data and other structured and unstructured data sources usable in ML research and production environments.
Requirements
- Strong experience building and maintaining batch ETL pipelines
- Proficiency with Python and modern data tooling (Airflow, dbt, Spark, or equivalent)
- Familiarity with Docker-based workflows and distributed cloud infrastructure (AWS, Modal, or similar)
- Experience designing and optimizing data storage and query systems at scale
- Exposure to ML workflows and preparing data for AI/ML systems
- Experience working with 3D or CAD data formats
Responsibilities
- Design, build, and manage large-scale data pipelines and surrounding infrastructure
- Partner with ML researchers to deliver high-quality data for training and evaluation
- Process and interpret 3D CAD data for downstream ML and engineering applications
- Build systems that scale from millions to hundreds of millions of data points
- Develop tools for structured logging, monitoring, and querying large datasets
- Collaborate with DevOps to ensure robust, secure, and automated data workflows
Other
- 5+ years of professional experience in data engineering or related fields
- Strong problem-solving skills with a track record of building robust, scalable systems
- Ability to collaborate effectively across ML research, engineering, and cross-functional teams
- Hybrid role based in our San Francisco headquarters
- 3 days a week in our San Francisco office and 2 days remote, with the option to come in more often