You'll tackle critical challenges in medical imaging and diagnostics by building technology that directly impacts patient outcomes.
Requirements
- 3+ years building ML infrastructure, data pipelines, or ML systems in production
- Strong Python skills and expertise in PyTorch or JAX
- Hands-on experience with data pipeline technologies (e.g., Spark, Airflow, BigQuery, Snowflake, Databricks, Chalk) and schema design
- Experience with distributed systems, cloud infrastructure (AWS/GCP), and containerization (Docker/Kubernetes)
- Track record of building scalable data systems and shipping production ML infrastructure
- Experience with medical imaging formats (e.g., DICOM; a minimal loading sketch follows this list) and healthcare data standards
- Background in distributed training frameworks (PyTorch Lightning, DeepSpeed, Accelerate)
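To give candidates a concrete feel for the domain, here is a minimal sketch of loading a DICOM slice into a PyTorch tensor. It assumes pydicom and PyTorch are available; the rescale handling is illustrative rather than a complete preprocessing pipeline.

```python
# Minimal sketch: read a DICOM file and convert it to a float tensor.
# Assumes pydicom + torch; real pipelines also handle windowing, spacing, etc.
import pydicom
import torch

def load_dicom_as_tensor(path: str) -> torch.Tensor:
    ds = pydicom.dcmread(path)                 # parse the DICOM dataset
    pixels = ds.pixel_array.astype("float32")  # decode raw pixel data
    # Apply the modality rescale when present (maps CT to Hounsfield units).
    slope = float(getattr(ds, "RescaleSlope", 1.0))
    intercept = float(getattr(ds, "RescaleIntercept", 0.0))
    return torch.from_numpy(pixels * slope + intercept)
```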
Responsibilities
- Build and optimize distributed ML infrastructure for training foundation models on large-scale medical imaging datasets (a training skeleton follows this list).
- Design and implement robust data pipelines to collect, process, and store large-scale multimodal medical imaging data from both production traffic and offline sources.
- Build centralized data storage solutions with standardized formats (e.g., protobufs) that enable efficient retrieval and training across the organization.
- Create model inference pipelines and evaluation frameworks that work seamlessly across research experimentation and production deployment.
- Collaborate with researchers to rapidly prototype new ideas and translate them into production-ready code.
- Own end-to-end delivery of ML systems from experimentation through deployment and monitoring.
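As a flavor of the distributed-training work described above, here is a hedged skeleton of a single-node PyTorch DDP loop. Model, dataset, loss, and hyperparameters are placeholders, and production training here also leans on frameworks like PyTorch Lightning or DeepSpeed.

```python
# Sketch of a DDP training loop; launch with `torchrun --nproc_per_node=N`.
# All names (model, dataset, loss) are illustrative placeholders.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def train(model, dataset, epochs: int = 1):
    dist.init_process_group("nccl")            # one process per GPU
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)
    model = DDP(model.cuda(), device_ids=[local_rank])
    sampler = DistributedSampler(dataset)      # shards data across ranks
    loader = DataLoader(dataset, batch_size=8, sampler=sampler, num_workers=4)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    criterion = torch.nn.CrossEntropyLoss()
    for epoch in range(epochs):
        sampler.set_epoch(epoch)               # reshuffle shards each epoch
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images.cuda()), labels.cuda())
            loss.backward()                    # DDP all-reduces gradients here
            optimizer.step()
    dist.destroy_process_group()
```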
Other
- Ability to move quickly and handle competing priorities
- Familiarity with MLOps practices and model deployment pipelines
- Experience with privacy-preserving data systems and HIPAA compliance (see the de-identification sketch after this list)
- Contributions to open-source ML or data infrastructure projects
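As one concrete example of the privacy-preserving work above, here is a hedged sketch of tag-level DICOM de-identification with pydicom. The tag list is a small illustrative subset; actual HIPAA compliance follows a vetted de-identification profile (e.g., DICOM PS3.15), not this snippet.

```python
# Sketch of blanking direct identifiers in a DICOM file.
# PHI_TAGS is an illustrative subset, not a compliant profile.
import pydicom

PHI_TAGS = ["PatientName", "PatientBirthDate", "PatientID", "PatientAddress"]

def deidentify(path: str, out_path: str) -> None:
    ds = pydicom.dcmread(path)
    for tag in PHI_TAGS:
        if tag in ds:
            ds.data_element(tag).value = ""    # blank out direct identifiers
    ds.remove_private_tags()                   # vendor tags often carry PHI
    ds.save_as(out_path)
```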