Job Board

Get Jobs Tailored to Your Resume

Filtr uses AI to scan 1,000+ jobs and find postings that closely match your resume


Research Engineer / Scientist, Alignment Science

Anthropic

$280,000 - $690,000
Nov 5, 2025
Remote, US

Anthropic works to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.

Requirements

  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Experience authoring research papers in machine learning, NLP, or AI safety
  • Experience with LLMs
  • Experience with reinforcement learning
  • Experience with Kubernetes clusters and complex shared codebases

Responsibilities

  • Testing the robustness of safety techniques by training language models to subvert them
  • Running multi-agent reinforcement learning experiments to test techniques like AI Debate
  • Building tooling to efficiently evaluate the effectiveness of novel LLM-generated jailbreaks
  • Writing scripts and prompts to efficiently produce evaluation questions to test models’ reasoning abilities in safety-relevant contexts
  • Contributing ideas, figures, and writing to research papers, blog posts, and talks
  • Running experiments that feed into key AI safety efforts at Anthropic

Other

  • Bachelor's degree in a related field or equivalent experience
  • Ability to be based in the Bay Area (or travel 25% to the Bay Area)
  • Ability to pick up slack and contribute to collaborative projects
  • Care about the impacts of AI
  • Strong communication skills