Google DeepMind is looking to develop and deliver evaluations and analysis of its groundbreaking models in established and emerging policy areas, ensuring that this work follows responsibility and safety best practices and advances its mission.
Requirements
- Strong analytical and statistical skills, with experience in metric design and development.
- Strong command of Python and SQL.
- Ability to work with both quantitative and qualitative data, understanding the strengths and weaknesses of each in specific contexts.
- Familiarity with AI evaluations and broader experimentation principles.
- Experience working with sensitive data, access control, and procedures for data worker wellbeing.
- Experience working in safety or security contexts (for example, content safety or cybersecurity).
- Experience with safety evaluations and mitigations of advanced AI systems.
Responsibilities
- Developing new metrics and analytics approaches in key risk areas, drawing on both quantitative and qualitative data.
- Assessing the quality and coverage of evaluation datasets and methods.
- Influencing the design and development of future evaluations, and leading efforts to define novel testing and experimentation approaches.
- Converting high-level problems into detailed analytics plans, implementing those plans, and influencing others to support them as necessary.
- Working with multidisciplinary specialists to measure and improve the quality of evaluation outputs.
- Contributing to and running evaluations and reporting pipelines.
- Providing an expert perspective on data usage, narrative, and interpretation in diverse projects and contexts.
Other
- Ability to present analysis and findings to both technical and non-technical teams, including senior stakeholders.
- A track record of transparency, with a demonstrated ability to identify limitations in datasets and analyses and communicate these effectively.
- Demonstrated ability to work within and lead cross-functional teams, fostering collaboration and influencing outcomes.
- Ability to thrive in a fast-paced environment with a willingness to pivot to support emerging needs.
- Note that this role involves working with sensitive content or situations and may involve exposure to graphic, controversial, or upsetting topics or content.