Anthropic is looking to advance the frontier of safe tool use in its AI model, Claude. This involves addressing challenges such as prompt injection robustness, data exfiltration through tool misuse, adversarial attacks in multi-turn conversations, and the safety of autonomous agents operating with numerous tools over long horizons. The goal is to scale AI responsibly and make it more reliable, interpretable, and steerable.
Requirements
- Experience with tool use/agentic safety, trust & safety, or security
- Experience with reinforcement learning techniques and environments
- Experience with language model training, fine-tuning, or evaluation
- Experience building AI agents or autonomous systems
- Published influential work in relevant ML areas, especially around LLM safety & alignment
- Deep expertise in a specialized area (e.g., RL, security, or mathematical foundations), even if still developing breadth in adjacent areas
- Experience shipping features or working closely with product teams
Responsibilities
- Design and implement novel, scalable reinforcement learning methodologies that push the state of the art in tool-use safety
- Define and pursue research agendas that push the boundaries of what's possible
- Build rigorous, realistic evaluations that capture the complexity of real-world tool use safety challenges
- Ship research advances that directly impact and protect millions of users
- Collaborate with other safety research teams (e.g., Safeguards, Alignment Science), capabilities research, and product teams to drive fundamental breakthroughs in safety, and work with those teams to ship them into production
- Design, implement, and debug code across our research and production ML stacks
- Contribute to our collaborative research culture through pair programming, technical discussions, and team problem-solving
Other
- Are passionate about our safety mission
- Are driven by real-world impact and excited to see research ship in production
- Have strong machine learning research or applied-research experience, or a strong quantitative background in a field such as physics, mathematics, or quantitative finance
- Write clean, reliable code and have solid software engineering skills
- Communicate complex ideas clearly to diverse audiences
- Are hungry to learn and grow, regardless of years of experience
- Are enthusiastic about pair programming and collaborative research
- Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time.
- We do sponsor visas!