Pinterest is seeking a Sr. Engineering Manager to lead the team that builds the serving and deployment infrastructure for all ML models at Pinterest, aiming to ensure ML systems are efficient, healthy, and fast for modelers to iterate upon.
Requirements
- Experience building large-scale distributed serving systems
- Experience with ML inference technologies for online serving at web scale
- Experience developing engineering platforms with a deep understanding of customer needs
- Ultra-high-performance C++ model inference engine for production recommendations and content ranking systems.
- TorchScript + CUDA Graph models on GPU inference, serving 500M+ inferences/second.
- Production GenAI & LLM model inference stack for emerging use cases.
- Model routing, deployment, monitoring.
Responsibilities
- Lead the team to deliver continual improvements in advanced model architectures, cost-efficient resource utilization, and ML developer productivity.
- Set technical direction for the team based on company and org priorities.
- Build the serving and deployment infrastructure for all ML models at Pinterest.
- Develop and maintain an ultra-high-performance C++ model inference engine for production recommendations and content ranking systems.
- Develop and maintain a production GenAI & LLM model inference stack for emerging use cases.
- Develop and maintain model routing, deployment, and monitoring systems.
- Develop and maintain feature fetching, caching, and logging systems.
Other
- Experience managing platform engineering teams with many cross-organizational customers
- Coach and develop talent on the team.
- This role requires in-person collaboration in the office 1-2 times per quarter and can therefore be based anywhere in the country.
- This position is not eligible for relocation assistance.
- US-based applicants only