Advisory Engineer - AI Model Evaluation

Lenovo

$180,000 - $240,000

Oct 1, 2025

San Jose, CA, USA

The company is looking to assess the performance, robustness, and safety of large generative AI models (LLMs, LVMs, LMMs) in order to push the boundaries of AI and deploy these models into innovative products, realizing its Hybrid AI vision.

Requirements

Strong programming skills in Python and experience with deep learning frameworks like PyTorch.
Deep understanding of machine learning evaluation principles, including various metrics (e.g., BLEU, ROUGE, perplexity, F1-score) and methodologies.
Proven ability to design and conduct rigorous experiments, analyze data, and draw meaningful conclusions.
Familiarity with large language models, transformer architectures, and related concepts.
Experience with data processing tools and techniques (e.g., Pandas, NumPy).
Experience working with Linux systems and/or HPC cluster job scheduling (e.g., Slurm, PBS).
Experience with automated model evaluation frameworks and tools.

Responsibilities

Design, implement, and evaluate comprehensive evaluation pipelines for large generative AI models, encompassing various metrics and methodologies.
Evaluate the performance of publicly available models, and discuss their relative advantages and disadvantages.
Establish and maintain benchmarks for evaluating model performance across a range of tasks and datasets.
Conduct thorough error analysis to identify patterns in model failures and provide actionable insights for improvement.
Design and implement methods to detect and mitigate biases in model outputs, ensuring fairness and equitable performance.
Develop and execute robustness tests to assess model resilience against adversarial inputs, noise, and variations in real-world data.
Assess model safety, including identifying and mitigating harmful or inappropriate outputs.

Other

Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field.
10+ years of development experience.
Ph.D. in Computer Science, Machine Learning, or a related field.
Excellent communication, collaboration, and problem-solving skills.
Experience with safety and alignment evaluation methodologies.