Scale is looking to advance the evaluation and benchmarking of large language models (LLMs) by developing industry-leading LLM evals and setting new standards for model performance assessment. The Tech Lead Manager will lead a team that develops and implements novel evaluation methodologies, metrics, and benchmarks to assess LLM capabilities and limitations. The team works in partnership with top foundation model labs to drive the next generation of AI capabilities.