Sciforium is building a proprietary, high-efficiency serving platform to efficiently serve next-generation multimodal AI models and real-time applications. In this role, you will architect the platform and lead its development, bringing a multimodal, highly efficient foundation model to market.
Requirements
- 5+ years of experience designing and building scalable, reliable backend systems or distributed infrastructure.
- Strong understanding of LLM inference mechanics (prefill vs. decode, batching, KV cache).
- Experience with Kubernetes/Ray and containerization.
- Strong proficiency in C++ and Python.
- Strong debugging, profiling, and performance optimization skills at the system level.
- Ability to collaborate closely with ML researchers and translate model or runtime requirements into production-grade systems.
- Proficiency in CUDA or ROCm and experience with GPU profiling tools.
Responsibilities
- Lead the technical direction of the model serving platform, owning architecture decisions and guiding engineering execution.
- Build core serving components including execution runtimes, batching, scheduling, and distributed inference systems.
- Develop high-performance C++ and CUDA/HIP modules, including custom GPU kernels and memory-optimized runtimes.
- Collaborate with ML researchers to productionize new multimodal models and ensure low-latency, scalable inference.
- Build Python APIs and services that expose model capabilities to downstream applications.
- Mentor and support other engineers through code reviews, design discussions, and hands-on technical guidance.
- Drive performance profiling, benchmarking, and observability across the inference stack.
Other
- Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.
- Effective communication skills and the ability to lead technical discussions, mentor engineers, and drive engineering quality.
- Comfortable working from the office and contributing to a fast-moving, high-ownership team culture.
- Experience at an AI/ML startup, research lab, or Big Tech infrastructure/ML team.
- Competitive salary and equity.