NVIDIA's Retriever team is developing the next generation of retrieval pipelines for RAG, focusing on ingesting modalities beyond text, and building the framework upon which production RAG systems are based.
Requirements
- Hands-on experience developing computer vision models and pipelines, with preference for document-focused tasks such as layout analysis, table or figure detection, and OCR.
- An understanding of the state of the art in retrieval research, with a focus on multimodal content retrieval.
- 10+ years of experience developing multimodal systems across a range of models and platforms.
- Knowledge of best practices in batching, streaming, and scaling of ingestion pipelines to support real-world applications.
- Excellent Python programming skills and a strong understanding of the Python deep learning ecosystem (PyTorch, TensorFlow, MXNet, etc.).
Responsibilities
- Working with our team of researchers to develop efficient and performant models and pipelines that extract text content from images, video, audio, and other modalities.
- Building vision pipelines for document ingestion, including page layout analysis, object detection, and OCR.
- Exploring and crafting datasets, metrics, experiments, and validation scripts to develop standard methodologies for research.
- Helping ML engineers scale pipelines to production capability through the development of NVIDIA Inference Microservices (NIMs) and blueprints that demonstrate how to deploy NIMs in a pipeline effectively.
- Writing papers, blog posts, documentation, and training materials that help customers understand and take advantage of our research.
- Keeping up to date with the latest developments in retrieval across academia and industry.
Other
- Candidates with a Master's, Ph.D., or equivalent experience in retrieval or multimodal research are preferred, along with a track record of publication in leading conferences such as CVPR, ICCV, ECCV, and KDD.
- Competitive results in computer vision competitions on Kaggle or similar platforms are a plus.
- Information retrieval experience is a big plus.
- An ability to share and communicate your ideas clearly through blog posts, papers, kernels, GitHub, etc.
- Strong communication and interpersonal skills are essential, as well as the capability to collaborate within a dynamic, distributed team.
- A history of mentoring junior engineers and interns is a plus.