May Mobility is seeking talented data scientists and machine learning engineers to develop automated methods for tagging data collected by its autonomous vehicles. These tags will let the company generate valuable insights from its data and make that data easily searchable for triaging issues, creating test sets, and building datasets for autonomy improvements, ultimately informing development and business decisions.
Requirements
- Expert proficiency in designing and implementing deep learning architectures for combining visual data with sequential or time-series data streams for offline analysis.
- Strong understanding of data labeling best practices, label consistency, and performance metrics specifically relevant to large-scale auto-tagging accuracy and dataset curation.
- Expertise in machine learning, with hands-on experience in the design, training, and evaluation of a wide range of algorithms.
- Awareness of the latest advancements in the field, with the ability to translate innovative concepts into practical solutions for May.
- Excellent problem-solving skills with a meticulous approach to model architecture and optimization.
- 5+ years of hands-on experience as a Data Scientist or ML Engineer with a strong focus on algorithm design and machine learning.
- Expert-level programming skills in Python with extensive use of modern deep learning frameworks like TensorFlow or PyTorch.
Responsibilities
- Design, implement, and deploy state-of-the-art machine learning models that analyze multimodal data to generate searchable metadata and facilitate downstream engineering workflows such as quick issue triaging.
- Research and implement novel techniques for sequential feature extraction, weak supervision, and self-supervised learning to efficiently handle long-tail events and continuously improve label quality.
- Curate high-quality datasets by applying advanced data mining techniques to ensure model robustness, performance, and coverage.
- Establish and maintain frameworks for model validation and performance monitoring to drive continuous improvement.
Other
- B.S., M.S., or Ph.D. degree in Engineering, Data Science, Computer Science, Math, or a related quantitative field.
- Experience working with multimodal data, such as visual data (images/video) and structured perception and behavior outputs (e.g., agent tracks, vehicle state estimation, motion planner outputs).
- Demonstrated experience in building and deploying production-level machine learning systems from conception to delivery.
- Expertise in PySpark/Apache Spark for handling large-scale data processing.
- Background in robotics or autonomous systems.