Agility Robotics is looking for a Senior Data Engineer to help shape the foundation of our data-driven operations by building and maintaining the data infrastructure that powers fleet operations, hardware reliability, business analytics, and machine learning.
Requirements
- 5+ years of experience as a Data Engineer or similar role building and maintaining production data pipelines.
- Strong proficiency in Apache Spark or equivalent distributed data processing frameworks.
- Experience with Airflow, Dagster, Prefect, or other data orchestration systems.
- Proficiency with data formats such as Avro and Parquet, and with structured/numeric datasets.
- Solid understanding of data modeling, schema evolution, and data quality best practices.
- Good intuition for how to model datasets logically and partition them physically for optimal query performance, both for analytical query engines and for playback or root-cause-analysis tools (e.g., ReRun, Foxglove).
- Strong programming skills in Python, Java and/or Scala.
- Experience with AWS data stack (S3, Glue, Athena, EMR, etc.) or similar cloud infrastructure.
- Experience working with vision data pipelines (e.g., images, video, depth) and building derived datasets from them.
- Experience with robotics vision data (RGB, depth, point clouds, or perception outputs) and how to process, store, and query them efficiently.
- Familiarity with C++ and willingness to contribute to lightweight logging or data serialization libraries.
- Exposure to large-scale robotics data, including high-frequency and high-fidelity sensor, telemetry and vision streams.
- Experience with data catalog systems and metadata management.
- Familiarity with data versioning or immutable dataset design (e.g., Apache Iceberg, Delta Lake).
Responsibilities
- Collaborate with robot software and hardware teams to define, collect, and curate data needed for analytics and debugging.
- Develop and maintain ETL pipelines that transform raw robot logs and telemetry into structured datasets using Spark, Airflow (or equivalent orchestration tools), and AWS data services.
- Contribute to on-robot data production workflows to ensure high-fidelity, well-structured data capture.
- Design derived datasets and transformations across Avro, Parquet, and other sensor data formats to power fleet operations, reliability analysis, and business metrics.
- Implement data quality checks, schema evolution, and metadata management practices using our internal Data Registry and cataloging systems.
- Work closely with the ingestion and storage services that move robot data into the cloud (S3-based data lake).
- Collaborate with internal consumers — reliability, analytics, and ML teams — to design efficient data models for their workflows.
Other
- Comfort working cross-functionally with software, hardware, and analytics teams in a fast-paced environment.
- This is a fully remote role with the option to work hybrid if within a commutable distance of our Salem, Pittsburgh, or Bay Area offices.
- Agility Robotics is committed to a work environment in which all individuals are treated with respect and dignity, and prohibits discrimination and harassment of any kind.
- Agility Robotics does not accept unsolicited referrals from third-party recruiting agencies.