The company is looking to optimize software performance for XR technologies and launch commercial-level products, and needs an expert to analyze and improve performance, power, and SoC utilization.
Requirements
- Deep understanding of CPU, GPU, DSP, and deep learning accelerator (NSP/NPU) architectures, as well as system programming and optimization of multimedia/CV/ML algorithms on hardware acceleration cores using C/C++ and Python
- Strong knowledge of computer architecture, memory subsystems, parallel computing, and compilers
- Good understanding of ML compute and chip microarchitecture
- Familiarity and hands-on experience with system analysis tools for memory analysis, performance analysis, and hardware resource management, applied to optimization and stability
- High proficiency in architecture analysis and performance modeling, ranging from simple analytical models to complex cycle-accurate performance models and correlation, especially for GPU and NPU
- Understanding of deep learning algorithms and experience with ML tuning and refinement using ML libraries and frameworks such as PyTorch, TensorFlow, ONNX, and Caffe
Responsibilities
- Analyze XR workloads to identify bottlenecks in both hardware and software components
- Propose new HW/SW co-optimization methodologies to optimize performance and power of XR devices
- Suggest innovative co-optimization ideas to software engineers and hardware vendors
- Assist hardware vendors in developing optimal kernels and compilers to process critical XR applications
- Analyze performance estimates for different hardware configurations and kernels
- Build simulations and performance models for efficient decision making
Other
- Bachelor's, Master's, or PhD in Computer Science/Engineering with specialization in Computer Architecture, Compilers, Parallel Computing, or an equivalent combination of education, training, and experience
- 8+ years of experience in software/hardware co-design and optimization, with knowledge of SoC hardware
- Strong teamwork and communication skills, passion, productivity, and self-learning ability
- Proven ability to work in a dynamic, multitasking environment
- Disclosure of Trade Secrets: Samsung has a strict policy on trade secrets
- Essential Job Functions: This position will be performed in an office setting