Sesame aims to build lifelike computers that can see, hear, and collaborate naturally with humans, with a focus on making voice companions part of daily life through vision-based interaction on resource-constrained devices.
Requirements
- Proven experience developing and deploying computer vision ML models on resource-constrained or embedded devices.
- Proficiency in C/C++ and Python, with strong experience in deep learning frameworks such as PyTorch, TensorFlow, or JAX.
- Hands-on experience with end-to-end ML workflows, from data capture to on-device deployment.
- Strong grasp of signal processing, vision-based feature extraction, and/or time-series analysis.
- Experience with wearables, cameras, IMUs, or tactile/force sensors.
- Familiarity with synthetic data generation and augmentation techniques.
- Track record of optimizing algorithms for power, latency, and memory footprint.
Responsibilities
- Design, train, and deploy algorithms for computer vision on low-power embedded hardware.
- Adapt and compress larger ML models to fit power, memory, and latency constraints of real-time wearable systems.
- Own the full ML development cycle: system design, data collection & curation, synthetic data generation, model training & evaluation, and on-device optimization.
- Collaborate closely with electrical, mechanical, firmware, and product teams to co-design algorithms and hardware in tandem.
- Identify promising approaches from the literature to bet on, and develop new ones where necessary to achieve our unique goals.
Other
- 5 years of experience in Software Engineering, ML Research, or related fields.
- Experience working with a high degree of autonomy in ambiguous environments.
- Excellent communication skills and the ability to work collaboratively across disciplines.
- Experience in a startup or fast-moving product environment.
- Experience deploying models in products.