Tech Lead Manager, Machine Learning Research Scientist - LLM Evals

Scale AI

$260,000 - $350,000

Sep 5, 2025

San Francisco, CA, US • Seattle, WA, US • New York, NY, US

Scale is looking to advance the evaluation and benchmarking of large language models (LLMs) by developing industry-leading LLM evals and setting new standards for model performance assessment. The Tech Lead Manager will lead a team focused on developing and implementing novel evaluation methodologies, metrics, and benchmarks to assess LLM capabilities and limitations, driving the next generation of AI capabilities in partnership with top foundational model labs.

Requirements

5+ years of hands-on experience with large language models, NLP, and Transformer modeling, in both research and engineering settings.
Published machine learning research at major conferences (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, etc.) and/or in journals.

Responsibilities

Conduct research on the effectiveness and limitations of existing LLM evaluation techniques.
Design and develop novel evaluation benchmarks for large language models, covering areas such as instruction following, factuality, robustness, and fairness.
Implement scalable and reproducible evaluation pipelines using modern ML frameworks.
Stay up to date on the team's ongoing research, help work through technical challenges, and take part in design decisions.
Lead a team of highly effective research scientists and research engineers on LLM evals.
Publish research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives.
Collaborate with internal teams and external partners to refine metrics and create standardized evaluation protocols.

Other

Communicate, collaborate, and build relationships with clients and peer teams to facilitate cross-functional projects.
Experience supporting and leading a team of research scientists and research engineers.
Excellent written and verbal communication skills.
Previous experience in a customer-facing role.