Job Board

Machine Learning Engineer, Safeguards

Anthropic

$340,000 - $425,000
Aug 12, 2025
San Francisco, CA, US

The company is looking to build safety and oversight mechanisms for its AI systems to detect harmful behaviors and ensure user well-being.

Requirements

  • Proficiency in SQL, Python, and data analysis/data mining tools
  • Proficiency in building trust and safety AI/ML systems, such as behavioral classifiers or anomaly detection
  • Experience with machine learning frameworks such as Scikit-Learn, TensorFlow, or PyTorch
  • Experience with high-performance, large-scale ML systems
  • Experience with language modeling with transformers
  • Experience with reinforcement learning
  • Experience with large-scale ETL

Responsibilities

  • Build machine learning models to detect unwanted or anomalous behaviors from users and API partners, and integrate them into our production system
  • Improve our automated detection and enforcement systems as needed
  • Analyze user reports of inappropriate accounts and build machine learning models to detect similar instances proactively
  • Surface abuse patterns to our research teams to harden models at the training stage

Other

  • At least a Bachelor's degree in a related field or equivalent experience
  • Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time
  • Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate
  • Strong communication skills and ability to explain complex technical concepts to non-technical stakeholders
  • Care about the societal impacts and long-term implications of your work