Member of Technical Staff, ML Systems

Netpreme

Salary not specified
Dec 23, 2025
Santa Clara, CA, US • Cambridge, MA, US

Unlock greater AI capability while dramatically improving efficiency at the infrastructure layer of LLM inference systems.

Requirements

  • Prior experience contributing to core LLM inference infrastructure (e.g. vLLM, SGLang, TensorRT).
  • Prior experience in accelerator programming (e.g. CUDA, JAX/Pallas, ROCm).
  • Advanced computer architecture and performance engineering skills are a big plus.

Responsibilities

  • Prototype and optimize emerging ML inference systems.
  • Develop novel memory models for expandable vRAM.
  • Write efficient GPU kernels for data movement.
  • Perform design-space exploration, implementation, and benchmarking of inference engines, both in simulations and on real hardware.

Other

  • This role is part engineering, part research.
  • This role will be performed on-site from one of our offices in Santa Clara, CA or Boston, MA.
  • Relocation assistance and visa sponsorship.
  • A collaborative, continuous-learning work environment with smart, dedicated colleagues engaged in developing the next generation of architecture for high-performance computing.
  • We value thoughtful disagreement, fast learning, and intellectual fearlessness.