Senior Applied Scientist, NLP/GenAI

Thomson Reuters

$126,000 - $269,600
Oct 24, 2025
Frisco, TX, United States of America

Thomson Reuters is looking to solve complex document understanding tasks in the legal domain to power its AI platform, enhance products like Westlaw and CoCounsel, and improve how legal professionals research, analyze, and reason over legal documents.

Requirements

  • Deep understanding of document understanding fundamentals: document layout analysis, semantic chunking approaches beyond fixed-size or paragraph-based methods, document classification handling hierarchical taxonomies, imbalanced multi-label classification, and adapting to domain-specific schemas
  • Expertise in knowledge extraction and knowledge graph construction: entity recognition and linking, relation extraction, citation parsing, and building graph representations from unstructured text
  • Expertise in LLM-based information extraction, few-shot and multi-task learning, post-training and knowledge distillation
  • Solid understanding of synthetic data generation techniques for NLP, including query-answer generation with verification and scalable data augmentation for training specialized models
  • Solid understanding of efficiency optimization including knowledge distillation, model compression, and designing SLM-based solutions that balance performance with computational constraints
  • Solid understanding of DL/ML approaches used for NLP tasks
  • Experience designing annotation workflows, creating high-quality labeled datasets with clear guidelines, and developing evaluation frameworks for document understanding tasks

Responsibilities

  • Design, build, test, and deploy end-to-end AI solutions for complex document understanding tasks in the legal domain.
  • Develop advanced models for semantic chunking of lengthy, non-uniformly structured legal documents with adjustable granularity levels for different use cases.
  • Build document enrichment systems that classify documents according to legal and customer-defined taxonomies and extract rich metadata.
  • Create LLM-based knowledge graph construction pipelines that extract and link heterogeneous legal knowledge including citations, entities, and legal concepts across diverse legal content.
  • Develop scalable synthetic data generation systems to support model training, simulate complex legal research queries, and generate hallucination-free answers.
  • Work in collaboration with engineering to ensure well-managed software delivery and reliability at scale.
  • Develop comprehensive data and evaluation strategies for both component-level and end-to-end quality, leveraging expert human annotation and synthetic data generation.

Other

  • PhD in Computer Science, AI, NLP, or a related field, or a Master's with equivalent research/industry experience
  • 5+ years of hands-on experience building and deploying document understanding systems, information extraction pipelines, or knowledge graph construction using deep learning, LLMs and NLP methods
  • Proven ability to translate complex document understanding problems into innovative AI applications that balance accuracy and efficiency
  • Professional experience scaling yourself and leading through others in an applied research setting
  • Publications at relevant venues such as ACL, EMNLP, ICLR, NeurIPS, SIGIR, KDD