Cartesia is building the next generation of AI that can continuously process and reason over large streams of audio, video, and text, even on-device. The Post-Training team develops methods and systems to make multimodal models adaptive, aligned, and grounded in human intent, pushing post-training and alignment beyond what model scale alone can deliver.
Requirements
- Deep knowledge of preference optimization and alignment methods, including RLHF and related approaches
- Experience designing evaluations and metrics for generative or multimodal models
- Strong engineering and debugging skills, with experience building or scaling complex ML systems
- Ability to trace and diagnose complex model behaviors across the training and evaluation pipeline
- Experience with multimodal model training (e.g., text, audio, or vision-language models)
- Contributions to alignment research or open-source projects related to model evaluation or fine-tuning
- Background in designing or implementing human-in-the-loop evaluation systems
Responsibilities
- Own research initiatives to improve the alignment and capabilities of multimodal models
- Develop new post-training methods and evaluation frameworks to measure model improvement
- Implement, debug, and scale experimental systems to ensure reliability and reproducibility across training runs
- Translate research findings into production-ready systems that enhance model reasoning, consistency, and human alignment
- Design new techniques for preference optimization, model evaluation, and feedback-driven learning
- Explore how feedback signals can guide models to reason more effectively across modalities
- Build the infrastructure to measure and improve these behaviors at scale
Other
- Partner closely with research, product, and platform teams to define best practices for creating specialized models
- We’re an in-person team based out of San Francisco.
- We ship fast.
- We have a high bar, and we don’t sacrifice quality or design along the way.
- We support each other.