AI Evaluation Engineer

Arizona State University

$95,000 - $105,000

Sep 10, 2025

Scottsdale, AZ, USA

EdPlus at ASU is seeking an AI Evaluation Engineer to evaluate and optimize AI models, particularly large language models (LLMs), to advance online higher education and increase student success.

Requirements

At least 4 years of hands-on software development and/or AI solutioning experience.
At least 4 years of hands-on data modeling and predictive analytics experience.
Experience developing AI/ML solutions on platforms like AWS, GCP, Azure, OpenAI, Databricks, Snowflake, etc.
Demonstrated understanding of software development lifecycle and standard methodologies.
Experience with data manipulation and analysis using tools like Python and SQL.
Expertise with data cleaning and processing, model training, and evaluation techniques.
Demonstrated ability to preprocess and curate text data, including cleaning, tokenization, and data augmentation, to prepare it for training and model evaluation.

Responsibilities

Gather and preprocess structured and unstructured datasets, ensuring data quality and suitability for AI model evaluation.
Utilize AI-driven tools to evaluate model outputs for factual accuracy, relevance, and completeness, especially for large language models (LLMs).
Assess model performance using standard metrics (e.g., accuracy, precision, recall, F1 score) and advanced evaluation techniques.
Conduct comparative analysis of multiple LLMs, algorithms, and prompts to select the best-performing model based on specific KPIs and business goals.
Detect and mitigate any biases in model predictions, ensuring fairness and reducing the risk of harmful outputs.
Develop strategies to identify and eliminate hallucinations and other unintended behaviors in the model.
Develop and implement continuous monitoring systems to track model performance in real time, detecting anomalies, degradation, or model drift.

Other

Bachelor's degree and three (3) years of experience appropriate to the area of assignment/field; or any equivalent combination of experience and/or training from which comparable knowledge, skills, and abilities have been achieved.
Ability to communicate with cross-functional teams about various AI topics.
Demonstrated ability to communicate thoughtfully, apply problem-solving skills, and build positive working relationships with cross-functional teams.
Must be able to reliably commute to Scottsdale, Arizona, three days a week.
Applicant must be eligible to work in the United States.