Design, prototype, and evaluate AI agents that automate data science tasks.
Requirements
- Strong research background in one or more of the following: AI agents and autonomous systems, data science automation, benchmarking and evaluation methodology, LLMs.
- Proficiency in Python and familiarity with open-source data science tools and libraries.
- Knowledge of LLM-based agent frameworks (e.g., LangChain, AutoGen) is a plus.
- Experience working with structured datasets and data-centric AI workflows is a plus.
Responsibilities
- Design, prototype, and evaluate AI agents that automate data science tasks (e.g., data wrangling, visualization, modeling).
- Develop benchmark datasets, tasks, and metrics for evaluating AI agents in realistic data science scenarios.
- Conduct empirical studies and ablation experiments to validate design choices and performance.
- Contribute to building reproducible pipelines and documentation for agent evaluation.
- Publish findings internally or externally (e.g., open-source repositories and benchmarks, academic conferences).
Other
- Currently pursuing a PhD in Computer Science, Data Science, Artificial Intelligence, Machine Learning, or a related field.
- Excellent analytical, problem-solving, and communication skills.
- Publication record at top AI/ML conferences.