Lenovo's Advanced AI Technology Center (AAITC) is seeking to define the next era of computing powered by AI. The Sr. AI Model Evaluation Engineer will play a critical role in assessing the performance, robustness, and safety of large language models (LLMs), large vision models (LVMs), and large multimodal models (LMMs), contributing to cutting-edge research and development in generative AI and helping deploy these models in innovative products.
Requirements
- Strong programming skills in Python and experience with deep learning frameworks like PyTorch.
- Deep understanding of machine learning evaluation principles, including various metrics (e.g., BLEU, ROUGE, perplexity, F1-score) and methodologies.
- Proven ability to design and conduct rigorous experiments, analyze data, and draw meaningful conclusions.
- Familiarity with large language models, transformer architectures, and related concepts.
- Experience with data processing tools and techniques (e.g., Pandas, NumPy).
- Experience working with Linux systems and/or HPC cluster job scheduling (e.g., Slurm, PBS).
- Experience with automated model evaluation frameworks and tools.
Responsibilities
- Design, implement, and validate comprehensive evaluation pipelines for large generative AI models, encompassing various metrics and methodologies.
- Evaluate the performance of publicly available models, and discuss their relative advantages and disadvantages.
- Establish and maintain benchmarks for evaluating model performance across a range of tasks and datasets.
- Conduct thorough error analysis to identify patterns in model failures and provide actionable insights for improvement.
- Design and implement methods to detect and mitigate biases in model outputs, ensuring fairness and equitable performance.
- Develop and execute robustness tests to assess model resilience against adversarial inputs, noise, and variations in real-world data.
- Assess model safety, including identifying and mitigating harmful or inappropriate outputs.
Other
- Bachelor's or Master's degree in Computer Science, Machine Learning, or a related field.
- 12+ years of development experience.
- Excellent communication, collaboration, and problem-solving skills.
- Experience with techniques for detecting and mitigating bias in AI models.
- Experience with safety and alignment evaluation methodologies.