Morph is building the fastest LLM code-editing engine on the market, with a focus on speculative decoding and high-throughput AI workflows across a range of applications.
Requirements
Strong, hands-on experience with an ML framework like PyTorch, TensorFlow, or JAX in projects or at work
Know your way around real infra: Docker, Kubernetes, Linux, observability
Prior experience with low-level inference optimizations (e.g., kernels)
Responsibilities
Build fast, reliable systems to apply LLM-generated edits.
Work across low-latency inference, containerized deployment, and CI/CD tooling.
Work with kernels and bleeding-edge inference research.
Turn the latest research into production-quality systems.