Senior Applied Research Scientist, Multimodal Retrieval

NVIDIA

$224,000 - $356,500
Aug 29, 2025
Santa Clara, CA, US

NVIDIA's Retriever team is developing the next generation of retrieval pipelines for RAG, focusing on ingesting modalities beyond text, and building the framework upon which production RAG systems are based.

Requirements

  • Hands-on experience developing computer vision models and pipelines, with preference for document-focused tasks such as layout analysis, table or figure detection, and OCR.
  • An understanding of the state of the art in retrieval research, with a focus on multimodal content retrieval.
  • 10+ years of experience developing multimodal systems across a range of models and platforms.
  • Knowledge of best practices in batching, streaming, and scaling of ingestion pipelines to support real-world applications.
  • Excellent Python programming skills and a strong understanding of the Python deep learning ecosystem (PyTorch, TensorFlow, MXNet, etc.).

Responsibilities

  • Working with our team of researchers to develop efficient and performant models and pipelines that extract text content from images, video, audio and other modalities.
  • Building vision pipelines for document ingestion, including page layout analysis, object detection, and OCR.
  • Exploring and crafting datasets, metrics, experiments, and validation scripts to develop standard methodologies for research.
  • Helping ML Engineers scale pipelines to production by developing NVIDIA Inference Microservices (NIMs) and blueprints that demonstrate how to deploy NIMs effectively in a pipeline.
  • Writing papers, blog posts, documentation, and training materials that help customers understand and take advantage of our research.
  • Keeping up to date with the latest developments in retrieval across academia and industry.

Other

  • Candidates with a Master's, Ph.D., or equivalent experience in retrieval or multimodal research are preferred, along with a track record of publication in leading conferences such as CVPR, ICCV, ECCV, and KDD.
  • Competitive results in computer vision competitions on Kaggle or similar platforms are a plus.
  • Information retrieval experience is a big plus.
  • An ability to share and communicate your ideas clearly through blog posts, papers, kernels, GitHub, etc.
  • Strong communication and interpersonal skills are essential, as well as the capability to collaborate within a dynamic, distributed team.
  • A history of mentoring junior engineers and interns is a plus.