Xaira Therapeutics needs a Head of Data Engineering to build and lead a team responsible for the design, integration, and governance of its scientific data ecosystem to accelerate AI-driven drug discovery.
Requirements
- Experience developing and implementing robust computational pipelines to enable the generation of biological insights.
- Experience with large-scale data engineering in the life sciences space.
- Extensive experience in data integration, pipeline engineering, and managing complex scientific data ecosystems, including integrating and managing laboratory data using Benchling or similar systems.
- Experience with data lake/data warehouse and metadata tools.
- Experience using GitHub and Docker (or equivalents) for reproducible software development and deployment.
- Understanding of data governance, reproducibility, and provenance practices.
- Familiarity with AWS and Terraform for cloud infrastructure management.
Responsibilities
- Oversee the design, development, and management of scalable scientific data infrastructure spanning the lab, cloud-based data platforms, and analytical applications.
- Develop data strategy and lead integration efforts with laboratory information management systems (e.g., Benchling), ensuring efficient data capture and automated data accessioning.
- Develop and lead efforts to convert internally and externally generated data into data repositories for training and validation of AI models.
- Implement and maintain data governance standards, ensuring quality, interoperability, provenance, and accessibility.
- Support AI, bioinformatics, and computational biology teams by building robust engineering platforms, including those deployed via Nextflow and other workflow systems.
- Establish best practices for continuous integration and continuous delivery (CI/CD) to ensure reproducibility and consistency.
- Collaborate closely with stakeholders to translate complex data into clear, actionable insights, including the creation and management of analytical dashboards and interactive applications.
Other
- Mentor and lead a high-performing data engineering team, fostering collaboration, technical excellence, and continuous improvement.
- Serve as a liaison between technical teams, stakeholders, and leadership, leading cross-functional collaborations.
- Promote and embed best practices in data engineering and governance across the organization, bringing industrial development and engineering practices to biological software development.
- Track record of building and leading bioinformatics engineering, data science, data engineering, or scientific computing teams in a biotech or pharmaceutical setting.
- Demonstrated ability to work collaboratively in a multidisciplinary setting.
- Strong oral and written communication skills.
- PhD in bioinformatics, computational biology, computer science, or a related field with 10+ years of relevant experience, or MS with 12+ years of experience.
- Knowledge of regulatory compliance and handling of sensitive data.