Cartesia is looking to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text—1B text tokens, 10B audio tokens and 1T video tokens—let alone do this on-device. Cartesia is pioneering the model architectures that will make this possible.
Requirements
- Comfortable navigating complex machine learning codebases.
- Deep machine learning background, including a strong grasp of fundamentals in sequence modeling, generative models and common model architecture families (RNNs, CNNs, Transformers).
- Experienced model trainer, ideally having previously written and pretrained large-scale models.
- Proficient in Python and PyTorch (or a similar framework) and tensor programming more broadly.
- Familiarity with efficiency tradeoffs in designing model architectures for accelerators such as GPUs.
- Prior research experience in advancing state space models or implementing them in practice.
- Experience in optimizing model inference with CUDA, Triton or other frameworks.
Responsibilities
- Implement new model backbones, architectures, and training algorithms.
- Rapidly run and iterate on experiments and ablations.
- Build training infrastructure that scales to massive multimodal datasets.
- Stay up-to-date on new research ideas.
Other
- You'll work in collaboration with a variety of machine learning, data, and systems engineering stakeholders.
- This role is aimed at candidates pursuing advanced degrees in machine learning (MS/PhD). Regardless of background, consider applying if you have strongly relevant experience.
- We don't offer part-time or remote internships.
- We’re an in-person team based out of San Francisco.
- Relocation assistance is available.