Advancing exploratory data analysis (EDA) capabilities in large language models.
Requirements
- Proficiency with Python and core data science libraries (pandas, numpy, scikit-learn).
- Strong foundation in statistical modeling, data analysis, and machine learning principles.
- Ability to identify meaningful patterns, trends, and associations within complex datasets.
Responsibilities
- Review AI-generated exploratory data analysis outputs for accuracy and quality.
- Apply statistical methods and modeling techniques to validate insights across structured and unstructured datasets.
- Design prompts and detailed evaluation rubrics for fine-grained performance assessment.
- Deliver gold-standard examples, including clear visualizations, explanatory text, and Python notebooks.
- Translate data-driven reasoning and decision-making into clear, gradable criteria for AI training.
Other
- Bachelor’s, Master’s, or PhD in Data Science, Computer Science, Statistics, Mathematics, or a related field, OR 2+ years of industry experience in a data science role.
- Excellent analytical writing skills with the ability to communicate insights clearly and concisely.
- Complete a short technical interview (25 minutes) that includes project-based conceptual questions and a Google Colab data analysis exercise.