Advisory Engineer, AI Model Evaluation

Lenovo

$180,000 - $240,000

Aug 27, 2025

San Jose, CA, US

Lenovo's Advanced AI Technology Center (AAITC) is seeking an AI Model Evaluation Engineer to assess the performance, robustness, and safety of large language models (LLMs), large vision models (LVMs), and large multimodal models (LMMs). The role contributes to cutting-edge research and development in generative AI and to deploying these models in innovative products.

Requirements

Strong programming skills in Python and experience with deep learning frameworks like PyTorch.
Deep understanding of machine learning evaluation principles, including various metrics (e.g., BLEU, ROUGE, perplexity, F1-score) and methodologies.
Proven ability to design and conduct rigorous experiments, analyze data, and draw meaningful conclusions.
Familiarity with large language models, transformer architectures, and related concepts.
Experience with data processing tools and techniques (e.g., Pandas, NumPy).
Experience with automated model evaluation frameworks and tools.
Experience with techniques for detecting and mitigating bias in AI models.

Responsibilities

Design, implement, and assess comprehensive evaluation pipelines for large generative AI models, encompassing various metrics and methodologies.
Evaluate the performance of publicly available models, and discuss their relative advantages and disadvantages.
Establish and maintain benchmarks for evaluating model performance across a range of tasks and datasets.
Conduct thorough error analysis to identify patterns in model failures and provide actionable insights for improvement.
Design and implement methods to detect and mitigate biases in model outputs, ensuring fairness and equitable performance.
Develop and execute robustness tests to assess model resilience against adversarial inputs, noise, and variations in real-world data.
Assess model safety, including identifying and mitigating harmful or inappropriate outputs.

Other

Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field.
10+ years of development experience.
Excellent communication, collaboration, and problem-solving skills.
Experience with safety and alignment evaluation methodologies.
Experience with A/B testing and online evaluation techniques.