At Serve Robotics, we are reimagining how things move in cities by developing a personable sidewalk robot that can take deliveries off congested streets, make deliveries available to more people, and benefit local businesses.
Requirements
- Strong experience with Reinforcement Learning (PPO, SAC, A3C, DQN, multi-agent RL, or equivalents).
- Hands-on experience with distributed training frameworks (Ray RLlib, Accelerate, PyTorch Distributed, Kubernetes, or similar).
- Proficiency in Python and C++ for performance-critical simulation or graphics pipelines.
- Experience building or modifying simulation environments (Isaac Sim, Unity, Unreal, CARLA, Gazebo, MuJoCo, or custom engines).
- Experience with procedural generation (noise functions, rule-based systems, agent scripts, behavior trees).
- Experience with GPU compute, containers, and cloud infrastructure.
- Background in generative AI (diffusion, LLMs) for scenario synthesis or environment creation.
Responsibilities
- Develop RL algorithms for terrain intelligence and social navigation behaviors.
- Design, build, and optimize large-scale RL training pipelines (distributed compute, GPU clusters, containerized workflows).
- Implement curriculum learning, domain randomization, and multi-agent RL strategies.
- Optimize RL model performance, sample efficiency, and stability across thousands to millions of simulation steps.
- Build automated tools for experiment orchestration, rollout collection, and metrics visualization.
- Develop procedural generation pipelines for synthetic environments, agents, and dynamic behaviors.
- Build tools to generate long-tail scenarios, suddenly appearing objects, traffic behaviors, rare events, and environmental variations.
Other
- Master’s degree in Robotics, AI, Computer Science, Mathematics, or a related field.
- 7+ years of professional experience shipping transformer-based AI models that handle complex navigation or manipulation tasks in real-world AV or robotics solutions at scale.
- 3+ years of technical leadership or architecture experience.
- Collaborate with autonomy, ML, and safety teams to map real-world failures into repeatable synthetic simulation cases.
- Document tools, frameworks, and workflows for internal users.