Software Development Engineer – GPU Kernel Development

AMD

$143,280 - $214,920
Jun 10, 2025
Austin, TX, US

AMD is seeking an engineer to develop and optimize deep learning frameworks for AMD GPUs, improving GPU kernels, deep learning models, and training/inference performance across multi-GPU and multi-node systems.

Requirements

  • GPU Kernel Development & Optimization: Experienced in designing and optimizing GPU kernels for deep learning on AMD GPUs using HIP, CUDA, and assembly (ASM).
  • Deep Learning Integration: Experienced in integrating optimized GPU performance into machine learning frameworks (e.g., TensorFlow, PyTorch) to accelerate model training and inference.
  • Software Engineering: Skilled in Python and C++, with experience in debugging, performance tuning, and test design to ensure high-quality, maintainable software solutions.
  • High-Performance Computing: Solid experience running large-scale workloads on heterogeneous compute clusters, optimizing for efficiency and scalability.
  • Compiler Optimization: Foundational understanding of compiler theory and tools like LLVM and ROCm for kernel and system performance optimization.
  • Knowledge of AMD architectures (GCN, RDNA) and low-level programming to maximize performance for AI operations.
  • Experience with tools like Composable Kernel (CK), CUTLASS, and Triton for multi-GPU and multi-platform performance.

Responsibilities

  • Optimize Deep Learning Frameworks: Enhance and optimize frameworks like TensorFlow and PyTorch for AMD GPUs in open-source repositories.
  • Develop GPU Kernels: Create and optimize GPU kernels to maximize performance for specific AI operations.
  • Develop & Optimize Models: Design and optimize deep learning models specifically for AMD GPU performance.
  • Collaborate with GPU Library Teams: Work closely with internal teams to analyze and improve training and inference performance on AMD GPUs.
  • Collaborate with Open-Source Maintainers: Engage with framework maintainers to ensure code changes are aligned with requirements and integrated upstream.
  • Work in Distributed Computing Environments: Optimize deep learning performance on both scale-up (multi-GPU) and scale-out (multi-node) systems.
  • Utilize Cutting-Edge Compiler Tech: Leverage advanced compiler technologies to improve deep learning performance.

Other

  • Bachelor’s and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
  • Strong problem-solving skills, a proactive approach, and a keen understanding of software engineering best practices are essential.