Principal Machine Learning Engineer, Distributed vLLM Inference

Red Hat

$189,600 - $312,730

Oct 5, 2025

Boston, MA, US

Red Hat is building scalable, Kubernetes-native inference systems for enterprise AI by developing and maintaining distributed vLLM infrastructure and bringing operational simplicity to GenAI deployments.

Requirements

Strong proficiency in Python and at least one systems programming language (Go, Rust, or C++), with Go highly preferred.
Experience with cloud-native Kubernetes service mesh technologies/stacks such as Istio, Cilium, Envoy (WASM filters), and CNI.
A solid understanding of Layer 7 networking, HTTP/2, gRPC, and the fundamentals of API gateways and reverse proxies.
Working knowledge of high-performance networking protocols and technologies including UCX, RoCE, InfiniBand, and RDMA is a plus.
Experience with the Kubernetes ecosystem, including core concepts, custom APIs, operators, and the Gateway API Inference Extension for GenAI workloads.
Experience with GPU performance benchmarking and profiling tools like NVIDIA Nsight or distributed tracing libraries/techniques like OpenTelemetry.
A Ph.D. in an ML-related domain is a significant advantage.

Responsibilities

Develop and maintain distributed inference infrastructure leveraging Kubernetes APIs, operators, and the Gateway API Inference Extension for scalable LLM deployments.
Create system components in Go and/or Rust to integrate with the vLLM project and manage distributed inference workloads.
Design and implement KV cache-aware routing and scoring algorithms to optimize memory utilization and request distribution in large-scale inference deployments.
Enhance the resource utilization, fault tolerance, and stability of the inference stack.
Contribute to the design, development, and testing of various inference optimization algorithms.
Actively participate in technical design discussions and propose innovative solutions to complex challenges.
Provide timely and constructive code reviews.

Other

A Bachelor's or Master's degree in computer science, computer engineering, or a related field.
Excellent communication skills, capable of interacting effectively with both technical and non-technical team members.
Mentor and guide fellow engineers, fostering a culture of continuous learning and innovation.
Comprehensive medical, dental, and vision coverage.
Paid time off and holidays.