Principal Software Engineer, AI Model Serving

Red Hat

$148,540 - $245,050
Aug 28, 2025
Raleigh, NC, US

Red Hat OpenShift AI (RHOAI) is looking for a Principal Software Engineer with Kubernetes and MLOps (machine learning operations) experience to join its rapidly growing engineering team. The team is building a platform, partner ecosystem, and community through which enterprise customers can use AI to solve problems and accelerate business success.

Requirements

  • Proven expertise with Kubernetes API development and testing (CRs, Operators, Controllers), including reconciliation logic.
  • Strong background in model serving (e.g., KServe, vLLM) and distributed inference strategies for LLMs (tensor, pipeline, and data parallelism).
  • Deep understanding of GPU optimization, autoscaling (KEDA/Knative), and low-latency networking (e.g., NVLink, P2P GPU).
  • Experience architecting resilient, secure, and observable systems for model serving, including metrics and tracing.
  • Advanced skills in Go and Python; ability to design APIs for high-performance inference and streaming.
  • Excellent system troubleshooting skills in cloud environments and the ability to innovate in fast-paced environments.
  • Existing contributions to one or more MLOps open source projects such as Kubeflow, KServe, Ray Serve, or vLLM are a huge plus.

Responsibilities

  • Lead the team strategy and implementation for Kubernetes-native components in Model Serving, including Custom Resources, Controllers, and Operators
  • Be an influencer and leader in MLOps-related open source communities to help build an active MLOps open source ecosystem for Open Data Hub and OpenShift AI
  • Architect and design new features for open source MLOps communities such as Kubeflow and KServe
  • Provide technical vision and leadership on critical and high-impact projects
  • Ensure non-functional requirements including security, resiliency, and maintainability are met
  • Write unit and integration tests and work with quality engineers to ensure product quality
  • Use CI/CD best practices to deliver and productize solutions in RHOAI

Other

  • Mentor, influence, and coach a team of distributed engineers
  • Contribute to a culture of continuous improvement by sharing recommendations and technical knowledge with team members
  • Collaborate with product management, other engineering teams, and cross-functional teams to analyze and clarify business requirements
  • Communicate effectively to stakeholders and team members to ensure proper visibility of development efforts
  • Give thoughtful and prompt code reviews