Senior Machine Learning Engineer, Infrastructure

Rad AI

Salary not specified
Sep 11, 2025
Remote, US

Rad AI is looking to hire a Senior Machine Learning Engineer to build and maintain the infrastructure that supports their AI research and products, specifically focusing on accelerating language model R&D and serving these models to radiologists to improve clinical outcomes.

Requirements

  • 5+ years of industry experience in ML Engineering in cloud-native environments
  • In-depth knowledge of Python and JavaScript/TypeScript (preferred), or other modern languages in the ML domain
  • Strong experience with infrastructure and DevOps tools such as Kubernetes, Docker, and Ansible
  • Experience in distributed systems, storage systems, and databases
  • Strong knowledge of cloud computing platforms such as AWS (preferred), GCP, or Azure
  • Experience with infrastructure-as-code tools such as Terraform (preferred), Pulumi, or CloudFormation
  • Experience with monitoring, tracing, and logging tools such as CloudWatch, New Relic, or Grafana

Responsibilities

  • Design, implement, and maintain the infrastructure that supports our machine learning applications, services, and workflows
  • Build, maintain, and improve our ML platform that supports continuous integration, continuous delivery, and continuous training for our machine learning models
  • Develop full-stack, cloud-native services and serverless architectures to build scalable and resilient systems
  • Plan, design, and develop components in the data pipeline to enable various machine learning models in production
  • Write code that meets our internal standards for security, style, maintainability, and best practices for a high-scale, HIPAA-compliant web environment
  • Design, deploy, and maintain the full ML platform stack including monitoring and observability, data analytics, backend integration with customer-facing products, and the full model R&D lifecycle
  • Work with Product Management, Research, and Engineering to iterate on new features and address inefficiencies across our AI/ML infrastructure

Other

  • Excellent communication skills, with a strong sense of ownership and a systematic approach to problem-solving
  • Proven ability to manage and lead active incidents, address what caused them, and establish systems to avoid them in the future via blameless postmortems
  • Experience working at a fast-growing startup
  • Experience in a HIPAA-compliant environment
  • Location Flexibility (Remote-first company!)