Together AI is seeking an engineer to optimize and enhance the performance of its AI inference systems.
Requirements
- Proficiency with Python and PyTorch.
- Demonstrated experience building high-performance libraries and tooling.
- Excellent understanding of low-level operating-system concepts, including multi-threading, memory management, networking, storage, performance, and scale.
- Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum.
- Knowledge of AI inference techniques such as speculative decoding.
- Knowledge of CUDA/Triton programming.
Responsibilities
- Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale.
- Develop and optimize runtime inference services for large-scale AI applications.
- Implement robust and fault-tolerant systems for data ingestion and processing.
- Create services, tools, and developer documentation to support the inference engine.
- Conduct design and code reviews to ensure high standards of quality.
Other
- 3+ years of experience writing high-performance, well-tested, production-quality code.
- Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world.
- The US base salary range for this full-time position is $160,000 - $230,000, plus equity and benefits.