Google's Machine Learning (ML) Systems and Cloud AI (MSCA) organization needs to improve the quality of data used for building ML models, especially Large Language Models (LLMs), to enhance model performance and drive the future of hyperscale computing.
Requirements
- Coding (e.g., Python, R, SQL)
- Querying databases
- Statistical analysis
- Experience with modern machine learning techniques
- Sequence/transformer modeling experience in NLP/speech domains
Responsibilities
- Work with data sets and solve difficult, non-routine analysis problems, applying advanced problem-solving methods as needed. Conduct analysis that includes data gathering and requirements specification, processing, cleaning and curation, analysis, visualization, ongoing deliverables, and presentations.
- Present analysis to relevant stakeholders and organization executives to share insights, influence product direction, and answer difficult questions about data quality and its impact on model performance.
- Define data quality for data in various shapes and forms.
- Research and develop analysis and optimization methods to improve the quality of Google's Machine Learning (ML) portfolio and applications, including Large Language Model (LLM) and training data planning.
Other
- Master's degree in Statistics, Data Science, Mathematics, Physics, Economics, Operations Research, Engineering, or a related quantitative field.
- 8 years of experience using analytics to solve product or business problems, coding (e.g., Python, R, SQL), querying databases or statistical analysis, or 6 years of experience with a PhD degree.
- 10 years of experience using analytics to solve product or business problems, coding (e.g., Python, R, SQL), querying databases or statistical analysis, or 8 years of experience with a PhD degree.
- 3 years of experience as a people manager within a technical leadership role.
- Interact cross-functionally with a wide variety of product and model teams.