NVIDIA is seeking to advance the AI revolution by developing intelligent assistants and information retrieval systems, leveraging its NVIDIA NIM and NeMo Retriever technologies to provide high-performance, GPU-accelerated AI solutions for complex problems in Generative AI, LLM, MLLM, and RAG spaces.
Requirements
- Python programming expertise with Deep Learning (DL) frameworks such as PyTorch.
- Experience delivering software in a cloud context and familiarity with the patterns and processes of managing cloud infrastructure.
- Knowledge of MLOps technologies such as containers, Docker Compose, Kubernetes, Helm, data center deployments, etc.
- Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM.
- In-depth, hands-on understanding of NLP, LLM, MLLM, Generative AI, and RAG workflows.
Responsibilities
- Develop and maintain NIMs that containerize optimized models and expose them through OpenAPI-standard interfaces, using Python or an equivalent performant language.
- Work closely with partner teams to understand requirements, build and evaluate POCs, and develop roadmaps for production-level tools.
- Enable development of integrated systems (AI Blueprints) that provide a unified, turnkey experience.
- Help build and maintain our Continuous Delivery pipeline with the goal of moving changes to production faster and more safely while upholding key operational standards.
- Provide peer reviews to other specialists, including feedback on performance, scalability, and correctness.
Other
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field (or equivalent experience).
- 8+ years of demonstrated experience in a similar or related role.
- Self-starter with a passion for growth, enthusiasm for continuous learning, and a habit of sharing findings across the team.
- Highly motivated and curious about new technologies.