Path Robotics is building the future of embodied intelligence with AI-driven systems to close the skilled labor gap and transform industries. The company needs to architect and build a data platform that enables its machine learning and robotics teams to move faster, experiment more effectively, and deploy AI solutions into production.
Requirements
- 5+ years of experience in data engineering, with a strong track record of building scalable data systems and infrastructure from the ground up.
- Previous experience supporting AI/ML teams with data platforms in fast-paced or startup environments.
- Deep proficiency in SQL, Python, and distributed data systems.
- Hands-on experience with tools like AWS, Snowflake, Dagster, dbt, or an equivalent modern data stack.
- Experience with streaming technologies such as Kafka, Flink, or similar platforms for real-time data processing.
- Experience with data quality, observability, and production monitoring of data systems.
- Bonus: experience working with sensor data, robotics logs, time series, or other high-volume real-world datasets.
Responsibilities
- Architect and build our end-to-end data platform, including ingestion, transformation, orchestration, quality monitoring, and data access layers.
- Design and develop optimized database architecture and data schemas with strategic partitioning for large-scale multi-modal data.
- Implement scalable ETL/ELT pipelines that serve experimentation, training, validation, and real-time deployment needs across robotics and AI systems.
- Collaborate with ML and robotics engineers to define and deliver the data infrastructure required for cutting-edge model development and real-world deployment.
- Establish data governance, reproducibility, and lineage standards to ensure data integrity and traceability at scale.
- Develop data contracts and API specifications to ensure reliable data exchange between robotics systems, ML models, and downstream applications.
- Write production-grade, maintainable, and well-documented Python and SQL code for critical data workflows.
Other
- Lead initiatives that break down data silos and enable seamless collaboration between business domains.
- Help shape our MLOps, observability, and CI/CD practices for continuous learning systems in physical environments.
- Comfortable operating in ambiguity and taking ownership. You’re excited to build and own the foundation of a data-first company.
- Passionate about building clean, reliable data pipelines that power real-time ML and robotics workflows.