Senior ML Ops Engineer - VLLM Inference

Red Hat

$133,650 - $220,680
Sep 24, 2025
Boston, MA, USA • Kilkenny, Éire / Ireland • Dublin, Éire / Ireland

Red Hat is looking to accelerate AI for the enterprise and bring operational simplicity to GenAI deployments by scaling SOTA deep learning products and software, and building and releasing Red Hat AI Inference runtimes.

Requirements

  • 2+ years of experience in MLOps, DevOps, automation, and modern software deployment practices
  • Experience evaluating LLMs for accuracy (e.g., HellaSwag, MMLU, Chatbot Arena, TruthfulQA) and for performance on accelerators
  • Strong proficiency in Python and pytest (required)
  • Strong experience with Git, GitHub Actions (including self-hosted runners), Terraform, Jenkins, Ansible, and common automation and monitoring technologies
  • Extensive experience administering Kubernetes/OpenShift
  • Experience with cloud computing on at least one of the following platforms: AWS, GCP, Azure, or IBM Cloud
  • Solid programming skills, especially in Python

Responsibilities

  • Collaborate with research and product development teams to scale machine learning products for internal and external applications
  • Create and manage model training and deployment pipelines
  • Actively contribute to managing and releasing upstream and midstream product builds
  • Test to ensure correctness, responsiveness, and efficiency
  • Troubleshoot, debug and upgrade Dev & Test pipelines
  • Identify and deploy cybersecurity measures by continuously performing vulnerability assessments and risk management
  • Keep abreast of the latest technologies and standards in the field

Other

  • Collaborate with a cross-functional team on market requirements and best practices
  • Familiarity with Agile development methodology
  • Solid troubleshooting skills
  • Ability to interact comfortably with the other members of a large, geographically dispersed team
  • Experience maintaining an infrastructure and ensuring stability