ML Infrastructure Engineering Manager, Safeguards

Anthropic

$340,000 - $425,000

Sep 12, 2025

San Francisco, CA, US

Anthropic needs an ML Infrastructure Engineering Manager to lead a team that builds and scales the systems powering its AI safety and trust mechanisms, so that its AI models operate safely and reliably at scale.

Requirements

4+ years of management experience leading technical teams focused on ML infrastructure, platform engineering, or distributed systems
8+ years of hands-on experience building production ML infrastructure, ideally in safety-critical domains like fraud detection, content moderation, or risk assessment
Deep technical knowledge of ML serving platforms, feature stores, data pipelines, and distributed systems architecture
Knowledge of modern ML frameworks, cloud platforms, and container orchestration in production environments
Experience implementing automated testing, deployment, and monitoring systems for ML models in production
Experience managing teams working on real-time, high-throughput systems with strict latency and reliability requirements
Experience with compliance and security requirements for safety-critical applications

Responsibilities

Set team vision and roadmap for ML infrastructure that powers Anthropic's safety and trust systems, ensuring scalability, reliability, and performance at production scale
Lead a team of ML infrastructure and software engineers to build robust platforms supporting real-time safety evaluations, feature stores, model serving, and data pipelines
Partner with Safeguards, Security, Research, and Product teams to identify infrastructure requirements and translate complex safety research into scalable production systems
Drive technical strategy for ML infrastructure architecture, making key decisions about technology choices, system design, and platform evolution
Maintain deep technical expertise in ML infrastructure, distributed systems, and safety-critical applications to provide technical leadership and guidance
Collaborate across teams to ensure infrastructure supports rapid experimentation while maintaining production reliability and safety standards
Champion engineering best practices including automated testing, deployment pipelines, monitoring, and incident response for safety-critical systems

Other

Demonstrated ability to lead and manage high-performing technical teams through periods of rapid growth and scaling challenges
Excellent communication skills, including the ability to translate complex technical concepts for audiences ranging from individual contributors to executive leadership
Strong project management skills, with the ability to balance multiple priorities and coordinate across cross-functional teams
Experience managing teams that bridge research and production, with a track record of productionizing experimental systems
Demonstrated passion for the responsible development and deployment of AI systems
We require at least a Bachelor's degree in a related field or equivalent experience.
Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.