Red Hat aims to bring the power of open-source LLMs and vLLM to every enterprise, accelerating enterprise AI adoption and bringing operational simplicity to GenAI deployments.
Requirements
- Extensive experience writing high-performance modern C++ code.
- Strong experience with hardware acceleration libraries and backends such as CUDA, Metal, Vulkan, or SYCL.
- Strong fundamentals in machine learning and deep learning, with a deep understanding of transformer architectures and LLM inference.
- Experience with performance profiling, benchmarking, and optimization techniques.
- Proficient in Python.
- Prior experience contributing to a major open-source project.
Responsibilities
- Design and implement new features and optimizations for the llama.cpp core, including model architecture support, quantization techniques, and inference algorithms.
- Optimize the codebase for various hardware backends, including CPU instruction sets, Apple Silicon (Metal), and other GPU technologies (CUDA, Vulkan, SYCL).
- Conduct performance analysis and benchmarking to identify bottlenecks and propose solutions for improving latency and throughput.
- Contribute to the design and evolution of core project components, such as the GGUF file format and the GGML tensor library.
- Collaborate with the open-source community by reviewing pull requests, participating in technical discussions on GitHub, and providing guidance on best practices.
Other
- Bachelor's, Master's, or Ph.D. degree in Computer Science or a related field
- Comprehensive medical, dental, and vision coverage
- Flexible Spending Account - healthcare and dependent care
- 401(k) retirement plan with employer match
- Paid time off and holidays