Recursion, a clinical-stage TechBio company, is seeking a Staff Engineer to build robust, scalable, and compliant data pipelines and systems to accelerate drug discovery and development in late-stage discovery and early clinical phases.
Requirements
- 7+ years of experience in data engineering, with a strong focus on building and managing large-scale data pipelines.
- Demonstrated experience with Google Cloud Storage (GCS), distributed data processing, and machine learning workflows.
- Strong programming skills in Python; experience with R is a plus.
- In-depth understanding of data warehousing, data lakes, and modern data architecture patterns.
- Proven experience designing and implementing systems compliant with GxP regulations (e.g., Good Clinical, Laboratory, and Manufacturing Practices), 21 CFR Part 11, and other relevant regulatory frameworks.
- Experience with workflow orchestration tools (e.g., Prefect, Camunda, Airflow).
- Familiarity with cloud platforms, particularly Google Cloud Platform (GCP), and with Kubernetes.
Responsibilities
- Architect and Implement Scalable Data Pipelines: Design, build, and maintain high-performance, fault-tolerant data pipelines for processing large-scale scientific and clinical datasets.
- Drive Data Engineering Best Practices: Establish and champion best practices for data modeling, ETL/ELT processes, data quality, and data governance across late-stage discovery and clinical operations.
- Enhance Workflow Orchestration: Leverage and extend our existing orchestration platforms (e.g., Camunda, Prefect) to integrate data pipelines seamlessly into end-to-end scientific and clinical workflows, ensuring automation and reliability.
- Ensure GxP/GDPR/HIPAA Compliance: Design and implement data systems and processes that adhere to GxP, GDPR, HIPAA, 21 CFR Part 11, and other relevant industry standards, ensuring data integrity, auditability, and validation readiness.
- Collaborate with Cross-Functional Teams: Partner closely with data scientists, clinical operations specialists, and other engineering teams to understand data needs, translate requirements into technical solutions, and ensure successful data delivery.
- Optimize Data Performance and Cost: Continuously monitor, analyze, and optimize the performance and cost-efficiency of data processing and storage solutions.
- Provide Technical Leadership and Mentorship: Act as a technical leader within the team, providing guidance, mentorship, and code reviews to junior and mid-level engineers, fostering a culture of excellence and continuous learning.
Other
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related technical field, or equivalent practical experience.
- Excellent problem-solving, analytical, and debugging skills.
- Strong communication, collaboration, and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders.
- Ability to work effectively in a fast-paced, dynamic environment and manage multiple priorities.
- Willingness to spend 50% of working time in the office.