NVIDIA is seeking a Senior Engineering Manager to lead the NVIDIA Inference Microservices (NIM) Factory team. The goal is to build and scale a world-class engineering organization that delivers reliable, performant, and secure AI services at massive scale, delighting customers with day-0 model launches and enterprise-grade software.
Requirements
- Strong foundation in cloud‑native engineering (containers, Kubernetes, microservices) and modern SDLC practices (CI/CD, testing, observability).
- Proficiency with cloud languages such as Python; ability to read code, guide designs, and drive high‑quality engineering outcomes.
- Led teams that built and operated large‑scale LLM inference or model‑serving platforms (Triton, TensorRT‑LLM, vLLM) in production.
- Experience architecting next-generation container build systems or CI/CD platforms at scale.
- Contributions to open‑source ecosystems, technical publications, or talks in containers, Kubernetes, GPU, or inference communities.
Responsibilities
- Lead the NIM Factory engineering team (containers, orchestration, workflow, observability, platform APIs); attract, hire, onboard, and grow top talent.
- Define vision, strategy, and roadmap for how we build, ship, and operate NIM from day‑0 launch through enterprise‑grade hardening (security, reliability, performance, compliance).
- Own end‑to‑end delivery of cross‑functional programs; align stakeholders and manage dependencies.
- Drive predictable delivery across multiple programs; manage priorities, resourcing, schedules, and dependencies.
- Establish engineering excellence: code health and reviews, documentation, CI/CD, testing.
- Collaborate with research and platform teams on inference architecture and scalable deployment patterns.
Other
- 10+ years overall building and delivering production software systems, including 5+ years leading engineering teams as a manager; experience leading multiple teams or managing managers is a plus.
- Proven track record driving complex, cross‑functional programs from inception to successful production launch and scale.
- Demonstrated ability to hire, coach, and develop senior engineers/tech leads; build inclusive teams and a culture of ownership and excellence.
- Excellent communication and stakeholder management; ability to influence across product, research, security, and operations.
- A degree in Computer Science, Computer Engineering, or a related field (BS or MS) or equivalent experience.