Scale is looking to advance the science of evaluating and characterizing large language models (LLMs) by tackling hard problems in scalable oversight and advanced AI capabilities.
Requirements
Track record of impactful research in machine learning, especially in generative AI, evaluation, or oversight.
Publications at major ML/AI conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR) and/or journals.
Responsibilities
Lead a team of research scientists and engineers on foundational work in evaluation and oversight.
Drive research initiatives on frameworks and benchmarks for frontier AI models, spanning reasoning, coding, multi-modal, and agentic behaviors.
Design and advance scalable oversight methods, leveraging model-assisted evaluation, rubric-guided judgments, and recursive oversight.
Collaborate with leading research labs across industry and academia.
Publish research at top-tier venues and contribute to open-source benchmarking initiatives.
Remain deeply engaged with the research community, both tracking emerging trends and setting them.
Develop AI-assisted evaluation pipelines in which models help critique, grade, and explain outputs (e.g., RLAIF, model-judging-model); a brief illustrative sketch follows this list.
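For concreteness, the sketch below shows what one minimal rubric-guided, model-judging-model evaluation step might look like. It is purely illustrative of the research area, not Scale's actual pipeline: the complete callable is a hypothetical stand-in for any chat-completion client, and the rubric wording, score scale, and JSON output format are all assumptions.

    # Illustrative sketch of a rubric-guided model-judging-model step.
    # `complete` is a hypothetical stand-in for any chat-completion client
    # (str -> str); rubric and parsing details are assumptions, not Scale's stack.
    import json
    import re

    RUBRIC = """Score the answer from 1 (poor) to 5 (excellent) on:
    - Correctness: is the answer factually and logically right?
    - Completeness: does it address every part of the question?
    Respond as JSON: {"correctness": <1-5>, "completeness": <1-5>, "rationale": "<one sentence>"}"""

    def judge(question: str, answer: str, complete) -> dict:
        """Ask a judge model to grade `answer` against the rubric."""
        prompt = (
            f"{RUBRIC}\n\n"
            f"Question:\n{question}\n\n"
            f"Candidate answer:\n{answer}\n"
        )
        raw = complete(prompt)
        # Extract the first JSON object; judge models often wrap it in extra text.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            raise ValueError(f"Judge returned no parseable score: {raw!r}")
        return json.loads(match.group(0))

    def mean_score(grades: list[dict]) -> float:
        """Aggregate per-item judge grades into a single benchmark number."""
        per_item = [(g["correctness"] + g["completeness"]) / 2 for g in grades]
        return sum(per_item) / len(per_item)

In practice such model judgments are typically calibrated against human ratings before being trusted at scale; the mean_score helper above is only the simplest possible aggregation.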
Other
Significant experience leading ML research in academia or industry.
Strong written and verbal communication skills for cross-functional collaboration.
Experience building and mentoring teams of research scientists and engineers.
The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training.
Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval.