Abridge is looking to enhance the scalability, efficiency, and performance of its AI-driven solutions by optimizing the core infrastructure that powers its machine learning models.
Requirements
- Strong experience in building and deploying machine learning models in production environments.
- Deep understanding of container orchestration and distributed systems architecture
- Expertise in Kubernetes administration, including custom resource definitions, operators, and cluster management
- Experience developing APIs and managing distributed systems for both batch and real-time workloads
- Expertise with model serving frameworks such as NVIDIA Triton Inference Server, vLLM, and TensorRT-LLM.
- Expertise with ML toolchains such as PyTorch and TensorFlow, as well as distributed training and inference libraries.
- Familiarity with GPU cluster management and CUDA optimization
Responsibilities
- Design, deploy, and maintain scalable Kubernetes clusters for AI model inference and training
- Develop, optimize, and maintain ML model serving and training infrastructure, ensuring high performance and low latency.
- Collaborate with ML and product teams to scale backend infrastructure for AI-driven products, focusing on model deployment, throughput optimization, and compute efficiency.
- Optimize compute-heavy workflows and enhance GPU utilization for ML workloads.
- Build a robust model API orchestration system
- Collaborate with leadership to define and implement strategies for scaling infrastructure as the company grows, ensuring long-term efficiency and performance.
Other
- Excellent communication skills, with the ability to interface between research and product engineering
- Extreme ownership—every employee has the ability to (and is expected to) make an impact on our customers and our business.
- Ability to work alongside a team of curious, high-achieving people in a supportive environment where success is shared, growth is constant, and feedback fuels progress.
- Empathy, always prioritizing the needs of clinicians and patients.