NVIDIA is looking for an engineering expert to help productize and optimize the latest Vision Language Models (VLMs) and their pipelines to democratize their use and unlock innovative applications.
Requirements
- Expertise in AI computer vision (VLMs, Vision Transformers, Diffusion models). Proven track record of using the associated software ecosystem (PyTorch, HuggingFace, vLLM) to develop and release production-grade software.
- Excellent software engineering fundamentals (source control, CI/CD, testing/validation, packaging, containerization, release).
- Proficiency with Python, C++, and CUDA (kernel optimization).
- Experience developing cloud applications (REST APIs, gRPC).
- Expertise in classical, non-ML computer vision.
- Strong fundamentals in system-level performance: multi-threaded, multi-process, and distributed software development.
- Grounding in mathematical fundamentals such as linear algebra, numerical methods, statistics, and exploratory data analysis.
Responsibilities
- Develop, profile and optimize inference pipelines for VLMs and other AI CV models: improve throughput and latency, data loading, pre- and post-processing.
- Improve the efficiency of the VLMs themselves through CUDA kernel optimization.
- Upstream improvements to SDKs and libraries across NVIDIA and beyond to deliver accelerated computer vision at scale.
- Promote high-performance AI computer vision across NVIDIA teams and functions (Engineering, Product Management, Marketing, and more).
Other
- Master of Science in Computer Science or Electrical Engineering, or equivalent experience.
- 8 years of practical experience or equivalent.
- Excellent written, visual, and verbal communication to present performance challenges, tradeoffs, and architectural alternatives.
- Curiosity and drive to learn new technologies and partner across teams and functions.
- History of creativity and innovation around performance in multiple problem domains.