Research Scientist/Engineer, Honesty

Anthropic

$280,000 - $425,000
Aug 13, 2025
San Francisco, CA, US

Developing techniques to minimize hallucinations and enhance truthfulness in language models.

Requirements

  • Strong programming skills in Python
  • Industry experience with language model finetuning and classifier training
  • Proficiency in experimental design and statistical analysis for measuring improvements in calibration and accuracy
  • Experience in data science or the creation and curation of datasets for finetuning LLMs
  • Understanding of various metrics of uncertainty, calibration, and truthfulness in model outputs
  • Published work on hallucination prevention, factual grounding, or knowledge integration in language models
  • Experience with retrieval-augmented generation (RAG) or similar fact-grounding techniques

Responsibilities

  • Design and implement novel data curation pipelines to identify, verify, and filter training data for accuracy given the model’s knowledge
  • Develop specialized classifiers to detect potential hallucinations or miscalibrated claims made by the model
  • Create and maintain comprehensive honesty benchmarks and evaluation frameworks
  • Implement search and retrieval-augmented generation (RAG) systems to ground model outputs in verified information
  • Design and deploy human feedback collection specifically for identifying and correcting miscalibrated responses
  • Design and implement prompting pipelines to generate data that improves model accuracy and honesty
  • Develop and test novel RL environments that reward truthful outputs and penalize fabricated claims

Other

  • MS/PhD in Computer Science, ML, or related field
  • At least a Bachelor's degree in a related field or equivalent experience
  • Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time
  • Visa sponsorship: We do sponsor visas
  • Communication skills