


Staff Technical Lead for Inference & ML Performance

Fal

Salary not specified
Oct 29, 2025
Remote, US

fal is looking for a Staff Technical Lead for Inference & ML Performance to guide a team that builds and optimizes state-of-the-art inference systems for generative-media infrastructure. The goal is to push the boundaries of model inference performance and deliver seamless creative experiences at unprecedented scale.

Requirements

  • Are deeply experienced in ML performance optimization. You've optimized inference for large-scale generative models in production environments.
  • Understand the full ML performance stack. From PyTorch, TensorRT, TransformerEngine, and Triton to CUTLASS kernels, you’ve navigated and optimized them all.
  • Know inference inside out. Expert-level familiarity with advanced inference techniques: quantization, kernel authoring, compilation, model parallelism (tensor parallel, context/sequence parallel, expert parallel), distributed serving, and profiling.
  • Have experience building inference engines specifically for diffusion and generative media models.
  • Have a track record of industry-leading performance improvements (papers, open-source contributions, benchmarks).

Responsibilities

  • Set technical direction. Guide your team (kernels, applied performance, ML compilers, distributed inference) to build high-performance inference solutions.
  • Hands-on IC leadership. Personally contribute to critical inference performance enhancements and optimizations.
  • Collaborate closely with research & applied ML teams. Influence model inference strategies and deployment techniques.
  • Drive advanced performance optimizations. Implement model parallelism, kernel optimization, and compiler strategies.
  • Mentor and scale your team. Coach and expand your team of performance-focused engineers.

Other

  • Lead from the front. You're a respected IC who enjoys getting hands-on with the toughest problems, demonstrating excellence to inspire your team.
  • Thrive in cross-functional collaboration. Comfortable interfacing closely with applied ML teams, researchers, and stakeholders.
  • Leadership experience. You have scaled technical teams before.