The Chan Zuckerberg Initiative is working to solve some of society’s toughest challenges by applying AI/ML technology and data to accelerate biomedical research and improve education.
Requirements
- Proficiency in managing large-scale data operations, including designing scalable pipelines (streaming and batch), working with varied data types, and optimizing flexible storage solutions using tools like Argo Workflows, Spark, Delta Lake, and Apache Iceberg.
- Experience with data governance, metadata, and data lineage tooling like OpenLineage or Marquez.
- Deep experience building CI/CD pipelines for data infrastructure, and with associated observability and monitoring tooling such as Prometheus, Grafana, OpenTelemetry, or Honeycomb.
- Experience addressing end-to-end data needs for complex data and delivering that data in a form ready for model training, working directly with AI Researchers and AI Engineers as part of AI model training project teams.
- Extensive experience with scaling containerized applications on Kubernetes or Mesos, with a focus on secure custom containers, replicability, and portability.
- Strong experience with AWS, GCP, or Azure.
- Familiarity with Infrastructure as Code (e.g., Terraform, Ansible) and monitoring tools (Datadog, Prometheus).
Responsibilities
- Develop and maintain the tooling and infrastructure that drives the entire data lifecycle at CZI, from ingestion and processing to secure storage and access.
- Partner with researchers and engineers across various domains, including genetics, imaging, and scientific literature.
- Design and implement flexible, scalable, and performant systems to address stakeholders’ needs, leveraging technologies like Argo Workflows and Spark for mass-scale job processing and orchestration.
Other
- BS, MS, or PhD in Computer Science or a related technical discipline, or equivalent experience.
- 7+ years of hands-on coding experience in scripting languages (Python, PHP, or Ruby) and systems languages (Rust, C++, C, Go, Java, or Scala).
- Proven ability to work with diverse, cross-functional stakeholders and teams to navigate complex technical challenges, adapt to evolving requirements, and drive impactful solutions.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving to the Bay Area.