Software Engineer, Safeguards

Anthropic

$300,000 - $405,000

Oct 4, 2025

San Francisco, CA, US

Anthropic is looking to build safety and oversight mechanisms for its AI systems to prevent misuse and ensure user well-being.

Requirements

Proficiency in SQL
Proficiency in Python
Proficiency in data analysis tools
Experience building trust and safety mechanisms for AI/ML systems
Experience with machine learning frameworks like scikit-learn, TensorFlow, or PyTorch
Experience building machine learning models

Responsibilities

Develop monitoring systems to detect unwanted behaviors from API partners and potentially take automated enforcement actions
Build abuse detection mechanisms and infrastructure
Surface abuse patterns to research teams to harden models at the training stage
Build robust and reliable multi-layered defenses for real-time improvement of safety mechanisms that work at scale
Analyze user reports of inappropriate content or accounts

Other

Bachelor's degree in Computer Science, Software Engineering or comparable experience
3-10+ years of experience in a software engineering position
Strong communication skills and ability to explain complex technical concepts to non-technical stakeholders
At least 25% of time spent in one of Anthropic's offices
Visa sponsorship available