Accelerating the pace of bringing new, effective medicines to patients by using artificial intelligence (AI) and human data to develop better drugs faster, and by transforming existing infrastructure into a highly scalable data platform
Requirements
- 4+ years of experience in data engineering with hands-on pipeline development
- Strong proficiency in Python and SQL
- Experience with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code tools
- Proven experience with distributed data processing frameworks (Spark, Kafka, Flink)
- Solid understanding of database technologies (PostgreSQL, MongoDB, Redis, data warehouses, graph databases)
- Experience with containerization (Docker, Kubernetes) and orchestration tools
- Previous experience building and scaling SaaS applications
Responsibilities
- Design and build scalable data infrastructure to support a multi-tenant biotech data and analysis platform
- Architect data pipelines that handle diverse scientific data sources and formats
- Develop APIs and integration points for data ingestion and export
- Implement robust data processing workflows using modern frameworks (Apache Spark, Kafka, Airflow, etc.)
- Lead technical migration from single-tenant internal tool to multi-tenant SaaS platform
- Create self-service capabilities including user dashboards, data exploration tools, and automated reporting
- Set up cloud infrastructure (AWS/GCP/Azure) with focus on scalability and cost optimization
Other
- Ability to thrive amid uncertainty and frequently changing priorities
- Deep alignment with our values
- A passion for making an impact on patients
- Ability to work alongside Verge's platform and computational biology teams