LILT is looking for a Senior Research Engineer to design, develop, and productionize cutting-edge OCR and ASR systems leveraging open-source models to solve complex real-world challenges, including poor-quality audio and noisy, uncontrolled recording environments.
Requirements
- Strong proficiency in Python and experience with popular OSS machine learning frameworks (e.g., PyTorch, TensorFlow).
- Hands-on experience with ASR/OCR open-source toolkits (e.g., Kaldi, Vosk, Whisper, Tesseract).
- Deep understanding of speech signal processing and noise-robust ASR techniques.
- Familiarity with speaker diarization, source separation, and audio preprocessing methods.
- Experience deploying production-grade ML systems at scale.
- Strong problem-solving skills and ability to work with ambiguous, noisy datasets.
- Contributions to OSS speech/OCR projects.
Responsibilities
- Design and implement production-grade prototypes for OCR and ASR systems based on open-source models.
- Build ASR systems resilient to severe audio challenges such as low/high volume, distortions, overlapping speech, unknown speaker counts, and off-axis microphone placement.
- Fine-tune and post-train ASR and OCR models for domain-specific accuracy improvements.
- Develop robust OSS-based solutions for accurately identifying speakers in multi-speaker environments.
- Create systems capable of detecting both speech and environmental noises (e.g., rattling keys, doors closing) within recordings.
- Implement pipelines to isolate and export individual speaker audio into separate files.
- Engineer accurate segmentation of speech into logical and temporal units for downstream processing.
Other
- Authorization to work in the US and/or Germany is a precondition of employment.
- This position can be based in our Berlin, Germany office, and you will be expected to work from the office in a hybrid capacity.
- Additional locations include the Washington, D.C. metropolitan area, where you will start fully remote and then transition to hybrid once offices open there.