Cartesia aims to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. The current limitation is that even the best models cannot continuously process and reason over a year-long stream of audio, video, and text, especially on-device. This role focuses on tackling challenging problems in audio perception to advance this mission.
Requirements
- Deep expertise in ASR, audio understanding, language modeling, or generative modeling more broadly.
- Experience with large-scale training, GPU/TPU acceleration, and model optimization.
Responsibilities
- Architect and develop novel, large-scale models for complex audio understanding tasks, including multi-speaker ASR, diarization, and non-speech audio classification, and deploy them to production at scale.
- Pioneer research in areas like self-supervised learning for audio, few-shot learning, and robust audio-visual perception.
- Set new standards for how we evaluate and benchmark our audio understanding systems.
- Build large-scale pre-training and fine-tuning datasets for audio understanding.
Other
- Strong applied mindset: able to balance scientific novelty with product impact.
- We’re an in-person team based in San Francisco.
- Relocation and immigration support are available.