At Canva, our mission is to empower the world to design. To ensure our generative AI models are truly helpful, we are seeking a talented Research Engineer to build our next-generation evaluation system by leveraging automatic evaluations.
Requirements
- You have a strong understanding of generative AI models (e.g., Diffusion Models, GANs, Transformers) and their architectures, with practical experience that informs robust evaluation strategies
- You’ve successfully managed or optimized large-scale distributed model training across hundreds of GPUs
- You have a solid understanding of machine learning, have worked with PyTorch, and know how to optimize PyTorch code for speed
- You have disciplined coding practices and are experienced with code reviews and pull requests
- You have experience working in cloud environments, ideally AWS
- You are familiar with evaluation libraries and frameworks
- You have experience building or working with agentic AI systems or multi-agent coordination
Responsibilities
- Design, build, and optimize the infrastructure for an "MLLM-as-a-Judge" evaluation system for scalable, automated feedback.
- Implement and experiment with inference-time alignment techniques (Prompt Engineering, RAG, ICL) to directly improve model output quality.
- Establish and manage a comprehensive benchmarking process to compare various foundation models on design-centric tasks.
- Analyze evaluation data to identify model failure modes and provide actionable recommendations to the research team.
- Collaborate with research scientists and ML engineers to integrate the agentic judge system into the model development lifecycle.
- Translate the latest research in LLM evaluation and agentic AI into practical, production-ready engineering solutions.
- Engineer autonomous AI agents that use Multimodal Large Language Models (MLLMs) to evaluate the quality, relevance, and human alignment of generated designs.
Other
- This high-impact role focuses on building the practical systems that make cutting-edge research effective. It provides a rapid feedback loop that guides the future of design generation at Canva, ultimately empowering millions of users to create.
- You excel at creating data-driven evaluation methodologies, turning user analytics into clear, actionable insights.
- You have a background or interest in human-computer interaction or design principles.