The company is looking to take its frontier AI models from the lab into production-ready services by building high-performance inference infrastructure.
Requirements
- Strong experience in distributed systems and low-latency ML serving
- Skilled with performance optimization tools and techniques, with experience delivering significant performance gains
- Hands-on with vLLM, SGLang, or equivalent frameworks
- Familiarity with GPU optimization, CUDA, and model parallelism
Responsibilities
- Architect and optimize high-performance inference infrastructure for large foundation models
- Benchmark and improve latency, throughput, and agent responsiveness
- Work with researchers to deploy new model architectures and multi-step agent behaviors
- Implement caching, batching, and request prioritization to handle high request volumes
- Build monitoring and observability into inference pipelines
Other
- Full-time, onsite role in Menlo Park
- Startup hours apply
- Comfort working in a high-velocity, often ambiguous startup environment