A Series B-backed multimodal AI lab is looking to bridge the gap between human and machine communication by developing a real-time conversational video interface.
Requirements
- Expert in Python with extensive experience in asynchronous frameworks, multiprocessing, and low-level system concepts
- Proven track record of shipping polished, reliable software in ambiguous, fast-paced environments where the state-of-the-art evolves rapidly
- Strong communicator who can simplify complex technical concepts
- Experience with LLM frameworks or WebRTC video streaming
- Proven track record of optimizing system performance
Responsibilities
- Own the delivery of core features including voice localization, sentence endpointing, and naturalness optimization for real-time video
- Partner with research teams to integrate sophisticated multimodal models into a reliable, high-uptime production codebase
- Optimize system performance by centralizing inter-process communication and shaving latency off utterance turn-taking for smoother conversations
- Lead the technical development of a real-time conversational video interface
- Collaborate with research teams to integrate state-of-the-art simulations into production-ready software for global enterprise applications
- Optimize multimodal models for low latency, multilingual support, and natural interaction
- Build AI humans that see, hear, and respond in real time
Other
- Ability to work in a high-impact, fast-paced environment
- Ability to shape architecture and ship features that reach millions of users across multiple languages
- Degree requirements not specified