The Zillow AI Applied Science team is developing next-generation evaluation methodologies for generative AI, computer vision, and agentic systems. In this role, you will help create robust, scalable metrics and frameworks to assess the quality, consistency, and performance of generative models across multiple modalities.
Requirements
- Evaluation methodologies for AI/ML systems
- Computer vision metrics and 3D consistency assessment
- Generative model evaluation (text, image, video, 3D)
- Multi-modal assessment and automated feedback systems
- Knowledge of data privacy methods (e.g., differential privacy, federated learning, secure ML) and their application
- Single-agent and multi-agent system evaluation
- Familiarity with modern deep learning frameworks (e.g., PyTorch, Hugging Face Transformers)
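As a lightweight illustration of the evaluation-methodology skills listed above, here is a minimal sketch of Cohen's kappa, a standard chance-corrected agreement statistic often used to validate an automated judge or metric against human labels. The function name and example labels are illustrative, not part of this posting.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters label identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labeled independently
    # according to their own marginal label distribution.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Example: compare an automated judge's labels against human labels.
human = ["good", "bad", "good", "good", "bad", "good"]
judge = ["good", "bad", "good", "bad", "bad", "good"]
print(round(cohens_kappa(human, judge), 3))  # → 0.667
```

A kappa near 1 indicates the automated judge tracks human judgment well; a value near 0 means agreement is no better than chance, a common first check before trusting a metric at scale.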
Responsibilities
- Develop innovative assessment methodologies for emerging AI capabilities, focusing on consistency and quality across complex multi-modal outputs
- Design evaluation systems that learn and adapt from feedback, automatically discovering new evaluation criteria and improving assessment quality over time
- Design frameworks that incorporate domain-specific implementations of differential privacy to protect sensitive user information while maintaining utility for model training and assessment
- Develop scalable methodologies for assessing agentic systems, ensuring compliance with fair housing standards and promoting ethical, responsible AI deployment
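To make the differential-privacy responsibility above concrete, here is a minimal sketch of the Laplace mechanism applied to a mean, with illustrative bounds, epsilon, and data (all assumptions, not Zillow specifics); production work would use a vetted DP library rather than hand-rolled noise.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_mean(values, lower, upper, epsilon, rng=random):
    """Epsilon-DP estimate of the mean of values clamped to [lower, upper]."""
    n = len(values)
    clamped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clamped) / n
    # Sensitivity of the mean of n values bounded in [lower, upper]:
    # changing one record moves the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon, rng)

# Illustrative usage with hypothetical home prices.
rng = random.Random(0)
prices = [410_000, 525_000, 389_000, 610_000, 470_000]
print(dp_mean(prices, 300_000, 700_000, epsilon=1.0, rng=rng))
```

Clamping bounds the sensitivity, which in turn calibrates the noise: smaller epsilon means stronger privacy and noisier estimates, the utility-privacy trade-off mentioned in the bullet above.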
Other
- Currently enrolled as a PhD student in computer science, machine learning, computer vision, or a related field
- Strong research mindset, with motivation to publish
- Interest in applying AI to complex, multi-stakeholder domains
- A record of publication in conferences, workshops, or journals is a plus
- Remote position