Stanford Data Science is seeking a Research Data Scientist to support a new initiative, Marlowe, a GPU-centric high-performance computing instrument designed to enable large-scale, data-intensive research. The role will leverage expertise in computational research to develop and optimize workflows and applications that unlock Marlowe’s capabilities, enabling sophisticated machine learning applications, including large language models, and addressing complex research challenges across various disciplines.
Requirements
- Deep understanding of computational and data science, machine learning, and the scientific process.
- Ability to leverage high-performance GPU computing to efficiently process and analyze large datasets.
- Comfortable running and troubleshooting jobs in a batch-scheduled environment.
- Considerable experience with Linux.
Responsibilities
- Collaborate with Principal Investigators (PIs) and research groups to architect and optimize GPU-accelerated pipelines.
- Develop innovative computational methodologies.
- Design advanced data movement strategies to minimize memory bottlenecks between CPU and GPU, including real-time data streaming methods for scientific applications.
- Partner with research teams to design novel algorithms and develop high-quality, reusable software to accelerate complex research projects.
- Assist PIs in applying for supercomputing resources at national centers once projects are scaled and workloads are appropriate.
- Install, configure, and maintain software stacks for core research functions.
- Integrate open science principles into research workflows, including software for data and computational provenance.
Other
- Three-year fixed-term appointment.
- Remote work will be considered for this position.
- The Research Data Scientist may be asked to attend certain in-person work events during the year, regardless of remote status.
- Experience supervising technical staff including training, mentoring and coaching.
- Experience developing and writing grant proposals.