Anthropic aims to create reliable, interpretable, and steerable AI systems that are safe and beneficial for users and society.
Requirements
- Significant software, ML, or research engineering experience
- Experience contributing to empirical AI research projects
- Familiarity with technical AI safety research
- Experience authoring research papers in machine learning, NLP, or AI safety
- Experience with LLMs
- Experience with reinforcement learning
- Experience with Kubernetes clusters and complex shared codebases
Responsibilities
- Testing the robustness of safety techniques by training language models to subvert them
- Running multi-agent reinforcement learning experiments to test techniques like AI Debate
- Building tooling to efficiently evaluate the effectiveness of novel LLM-generated jailbreaks
- Writing scripts and prompts to efficiently produce evaluation questions to test models’ reasoning abilities in safety-relevant contexts
- Contributing ideas, figures, and writing to research papers, blog posts, and talks
- Running experiments that feed into key AI safety efforts at Anthropic
Other
- Bachelor's degree in a related field or equivalent experience
- Ability to be based in the Bay Area (or to travel to the Bay Area roughly 25% of the time)
- Ability to pick up slack and contribute to collaborative projects
- Care about the impacts of AI
- Strong communication skills