AMD is tackling one of the most exciting challenges in the industry: training and running AI that makes AI itself run more efficiently on GPUs, on the fly — a capability that could dramatically alter the trajectory of AI progress.
Requirements
- Expert-level, high-performance C++ software engineering and low-level GPU programming, with a robust understanding of Large Language Models (LLMs) and AI systems.
- Ability to bridge kernel engineering with AI post-training (RL) experience.
- Demonstrated ability to design complex, scalable systems in modern C++.
- A solid grasp of GPU architectures (HIP/CUDA), memory hierarchies, and kernel optimization techniques to maximize hardware performance.
- Significant hands-on experience with large-scale C++/HIP/CUDA projects, such as contributions to the ROCm ecosystem (e.g., rocBLAS, hipDNN, Composable Kernel, AITemplate), CUDA libraries (e.g., cuBLAS, cuDNN, CUTLASS, Thrust, CUB, NCCL), or the C++/HIP/CUDA core of ML frameworks such as PyTorch, TensorFlow, or JAX.
- Deep understanding of LLMs, including transformer architectures, attention mechanisms, and the full model lifecycle, with hands-on experience in advanced model alignment and post-training techniques such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (e.g., RLHF, GRPO).
- Familiarity with cutting-edge trends such as Mixture-of-Experts (MoE) architectures, inference optimizations (e.g., quantization, speculative decoding), and modern application patterns such as agentic AI systems (e.g., AlphaEvolve for code/kernel generation).
Responsibilities
- Architect and Drive the AI Software Stack: You will establish best practices and optimize performance from the lowest-level GPU kernels to large-scale distributed systems, shaping the foundational software for AMD hardware.
- Accelerate ROCm with AI: By leveraging cutting-edge Large Language Models (LLMs) and agent-based technologies, you will accelerate the development and performance enhancement of the AMD ROCm ecosystem, ensuring it remains at the forefront of AI innovation.
- Accelerate Foundation Models: Your work will directly speed up cutting-edge workloads such as foundation models (LLMs) and autonomous AI agents, ensuring AMD is the platform of choice for the most demanding workloads.
- Innovate Across Hardware and Software: You will contribute to the entire co-design lifecycle, from influencing future GPU architectures to developing groundbreaking software for new accelerators and collaborating with the broader AI community.
- Mentor and Communicate: As a senior engineer, you will mentor others and communicate your ideas effectively to shape the future of AI at AMD.
Other
- A deep passion for software engineering and strong technical ownership, communication, and problem-solving skills, with a track record of delivering complex technical solutions, seeing hard problems through to resolution, and influencing technical direction across teams.
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent.
- Master's degree preferred; PhD is a plus.
- Relevant publications in AI/ML, GPU computing, or system optimization are highly valued.