EdPlus at ASU is seeking an AI Evaluation Engineer to evaluate and optimize AI models, particularly large language models (LLMs), to advance online higher education and increase student success.
Requirements
- At least 4 years of hands-on software development and/or AI solutioning experience.
- At least 4 years of hands-on data modeling and predictive analytics experience.
- Experience developing AI/ML solutions on platforms such as AWS, GCP, Azure, OpenAI, Databricks, and Snowflake.
- Demonstrated understanding of the software development lifecycle and standard methodologies.
- Experience with data manipulation and analysis using tools like Python and SQL.
- Expertise with data cleaning and processing, model training, and evaluation techniques.
- Demonstrated ability to preprocess and curate text data, including cleaning, tokenization, and data augmentation, to prepare it for training and model evaluation.
Responsibilities
- Gather and preprocess structured and unstructured datasets, ensuring data quality and suitability for AI model evaluation.
- Utilize AI-driven tools to evaluate model outputs for factual accuracy, relevance, and completeness, especially for large language models (LLMs).
- Assess model performance using standard metrics (e.g., accuracy, precision, recall, F1 score) and advanced evaluation techniques.
- Conduct comparative analysis of multiple LLMs, algorithms, and prompts to select the best-performing model based on specific KPIs and business goals.
- Detect and mitigate any biases in model predictions, ensuring fairness and reducing the risk of harmful outputs.
- Develop strategies to identify and eliminate hallucinations and other unintended behaviors in the model.
- Develop and implement continuous monitoring systems to track model performance in real time, detecting anomalies, degradation, or model drift.
Other
- Bachelor's degree and three (3) years of experience appropriate to the area of assignment/field; OR any equivalent combination of experience and/or training from which comparable knowledge, skills, and abilities have been achieved.
- Ability to communicate with cross-functional teams about various AI topics.
- Demonstrated ability to communicate thoughtfully, apply problem-solving skills, and build positive working relationships with cross-functional teams.
- Must be able to reliably commute to Scottsdale, Arizona, three days a week.
- Applicant must be eligible to work in the United States.