Mercor is collaborating with a leading AI lab to engage generalist experts for a short-term failure-mode evaluation project. The work supports safer and more reliable AI deployment.
Requirements
- Strong analytical and problem-solving skills
- Ability to follow structured evaluation protocols with high attention to detail
- Experience in technical analysis, research, or related domains preferred
Responsibilities
- Conduct systematic evaluations of AI model outputs across varied scenarios
- Identify, categorize, and document potential failure modes and edge cases
- Provide clear, structured feedback to inform model safety improvements
Other
- Independent contractor status
- Excellent written communication and clarity in documenting findings
- Remote and asynchronous; contractors set their own hours
- Estimated commitment: 10–20 hours/week
- Competitive hourly rate, adjusted for geography