Heartflow is hiring a Software Engineer to support ML algorithm development through data systems and MLOps, with the aim of improving the diagnosis and management of coronary artery disease with its AI-driven cardiac test.
Requirements
- 2+ years of professional software engineering experience, including hands-on work with cloud-based distributed data systems.
- Strong foundation in software engineering principles and practices, with proficiency in Python and SQL.
- Deep understanding of modern cloud-based distributed data architectures for structured and unstructured data.
- Experience with distributed computing frameworks (e.g., Ray, Spark, Dask) and supporting infrastructure (e.g., Hadoop, Docker, Kubernetes).
- Competency with at least one cloud provider (e.g., AWS, GCP, Azure).
- Experience with infrastructure as code (e.g., CDK, Terraform).
- Experience constructing and maintaining data products for technical stakeholders.
Responsibilities
- Develop robust ETL (Extract, Transform, Load) processes to integrate data from diverse sources into our data ecosystem.
- Design and maintain scalable data pipelines that provide our teams with high-quality, training-ready datasets.
- Implement and manage tools to track and document data lineage, from source to consumption.
- Empower consumers of data products through detailed documentation.
- Develop and maintain a large-scale distributed computing platform for ML algorithm training and evaluation.
- Develop and maintain a standardized approach to ML algorithm experiment tracking using tools like MLflow.
- Work cross-functionally with Researchers and Engineers to understand their needs for ML algorithm training and production monitoring.
Other
- The ideal candidate is passionate about using first-principles thinking to navigate challenging problems in data systems and ML infrastructure.
- Excellent communication and interpersonal skills, with the ability to communicate with both technical and non-technical audiences.
- Experience with image-based data and algorithms (e.g., convolutional neural networks, image processing techniques).
- Experience with orchestration frameworks (e.g., Dagster, AWS Step Functions, Temporal.io).
- Prior experience working in healthcare or another highly regulated environment.