Deepgram is looking to solve the challenges of current sequence modeling paradigms, which cannot deliver voice AI capable of universal human interaction due to the fundamental data problems posed by audio.
Requirements
3+ years of experience in applied deep learning research, with a solid understanding of the applications and implications of different neural network types, architectures, and loss mechanisms
Proven experience working with large language models (LLMs), including data curation, distributed large-scale training, transformer architecture optimization, and reinforcement learning
Strong experience coding in Python and working with PyTorch
Experience with various transformer architectures (auto-regressive, sequence-to-sequence, etc.)
Experience with distributed computing and large-scale data processing
Prior experience conducting experimental programs and using the results to optimize models
Responsibilities
Brainstorming and collaborating with other members of the Research Staff to define new LLM research initiatives
Broadly surveying the literature and evaluating, classifying, and distilling current methods
Designing and carrying out experimental programs for LLMs
Driving transformer (LLM) training jobs successfully on distributed compute infrastructure and deploying new models into production
Documenting and presenting results and complex technical concepts clearly for a target audience
Staying up to date with the latest advances in deep learning and LLMs, with a particular eye towards their implications and applications within our products
Other
Strong communication skills and the ability to translate complex concepts clearly
Highly analytical and eager to delve into detailed analyses when necessary
Passionate about AI and excited about working on state-of-the-art LLM research
Interested in producing and applying new science to help us develop and deploy large language models