NVIDIA is looking for an expert in system-level software optimization to push its computer vision applications to near speed-of-light performance and address the challenges of delivering performance at scale.
Requirements
- Proficiency with Python, CUDA and C++.
- Strong fundamentals in multi-threaded, multi-process, and distributed software development.
- Expertise in defining and driving performance metrics through profiling and benchmarking.
- Experience developing performance-critical data center and cloud applications (REST APIs, gRPC).
- Expertise in classical, non-ML computer vision.
- Expertise in ML computer vision (VLMs, Vision Transformers, diffusion models) and its software ecosystem: PyTorch, HuggingFace, vLLM.
- Grounding in mathematical fundamentals such as linear algebra, numerical methods, statistics, and exploratory data analysis.
Responsibilities
- Develop, profile and optimize data-center and edge computer vision workloads for efficiency, latency, and throughput (Python).
- Implement and improve computer vision and image processing algorithms using CUDA.
- Upstream performance improvements to SDKs and libraries across NVIDIA to deliver accelerated computer vision at scale.
- Influence software architecture, validation strategy and technical roadmaps to ensure outstanding performance.
- Promote high-performance computer vision across NVIDIA teams and functions (Engineering, Product Management, Marketing, and more).
Other
- Excellent software engineering fundamentals (source control, CI/CD, testing/validation, packaging, containerization, release).
- Proven track record developing, testing and releasing production-grade, complex software.
- Excellent written, visual, and verbal communication to present performance challenges, tradeoffs, and architectural alternatives.
- Curiosity and drive to learn new technologies and partner across teams and functions.