Anthropic is looking to develop reliable, interpretable, and steerable AI systems by enhancing its production models' capabilities, alignment, and safety through sophisticated post-training processes.
Requirements
- Strong software engineering skills with experience building complex ML systems
- Experience with large-scale distributed systems and high-performance computing
- Experience with training, fine-tuning, or evaluating large language models
- Proficiency in Python, deep learning frameworks, and distributed computing
- Experience with LLMs is a significant plus
Responsibilities
- Implement and optimize post-training techniques at scale on frontier models
- Design, build, and run robust, efficient pipelines for model fine-tuning and evaluation
- Develop tools to measure and improve model performance across various dimensions
- Collaborate with research teams to translate emerging techniques into production-ready implementations
- Debug complex issues in training pipelines and model behavior
- Help establish best practices for reliable, reproducible model post-training
Other
- At least a Bachelor's degree in a related field or equivalent experience
- Location-based hybrid policy: currently, we expect all staff to be in one of our offices at least 25% of the time
- Visa sponsorship: we do sponsor visas, but we aren't able to successfully sponsor visas for every role and every candidate
- Strong communication skills
- Ability to navigate ambiguity and make progress in fast-moving research environments
- Keen interest in AI safety and responsible deployment