Nuro is looking to advance its autonomous driving systems by creating a scalable and reliable data infrastructure to produce training and evaluation data.
Requirements
- Strong proficiency in Python or similar languages
- Experience working with large-scale data and building scalable & reliable systems/data pipelines
- Strong proficiency in C++ or other high-performance low-level languages
- Knowledge of GCP, GCS, BigQuery, or PostgreSQL
- Knowledge of data engineering, including its tooling and best practices
- Knowledge of batch and streaming data processing, warehousing, and analytics solutions
- Experience working with large-scale distributed data systems
Responsibilities
- Design and develop unified, introspectable, large-scale batch and streaming data pipelines
- Create and implement a storage system capable of accommodating both the large volume and diverse range of evaluation and performance metrics
- Construct intuitive dashboards and reports to present evaluation results
- Develop and maintain continuous testing and monitoring systems to guarantee the integrity and resilience of our data and associated data pipelines
- Develop data mining tools with applied ML techniques to support data discovery needs
- Develop data annotation tools to support first-party and third-party labeling workforces
- Scale data annotation labels with applied state-of-the-art ML techniques
Other
- BS, MSc, or PhD degree, plus 4 years of relevant work experience
- A bachelor's degree in Computer Science, Electrical Engineering, or a closely related field
- Experience setting product and technical vision, timelines, and prioritization for a team or project
- Ability and willingness to deep dive into implementation, driving technical standards and best practices across the broader software organization
- Mentoring and supporting junior engineers