AssemblyAI is looking for a Senior Research Engineer to optimize and scale its distributed training, data processing, and inference systems for Speech AI models, ensuring customers stay at the cutting edge of AI technology.
Requirements
- Expert-level proficiency with JAX and its ecosystem (Flax, Optax, XLA compilation pipeline).
- Strong experience optimizing inference systems for production, ideally with LLMs or speech models.
- Hands-on experience with TPU programming and optimization; GPU/CUDA expertise is also valuable.
- Familiarity with modern inference optimization techniques: continuous batching, KV-cache management, sharding strategies, quantization.
- Domain knowledge in Speech-to-Text (ASR architectures, audio processing, streaming inference) is a plus.
- Strong Python skills; C++ or Rust experience for kernel-level work is a plus.
- Deep understanding of distributed training at scale and ML infrastructure best practices.
Responsibilities
- Maintain and evolve our JAX training framework, ensuring scalability and efficiency for large-scale distributed training runs.
- Optimize production JAX inference systems for speech-to-text models using advanced techniques like continuous batching, model sharding, paged attention, and quantization.
- Refactor and modernize model architectures and infrastructure, translating research prototypes into production-ready systems.
- Investigate and resolve performance bottlenecks across the stack—from low-level kernels (XLA, Pallas) to high-level system design.
- Design and deploy scalable, distributed workloads optimized for TPU and GPU architectures.
- Bridge Research and Engineering teams, ensuring seamless knowledge transfer and alignment on technical priorities.
Other
- This is a cross-functional role requiring strong technical rigor, attention to detail, and intellectual curiosity.
- Excellent communication skills and a collaborative mindset: you can clearly explain complex tradeoffs and prioritize high-impact work.
- Passionate about refactoring and improving existing systems: you thrive on making code faster, cleaner, and more maintainable.
- Remote role open to candidates across Europe.
- You are ambitious and curious, and you lead with integrity.