At AMD, our mission is to accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems, by building great products and solving the world's most important challenges.
Requirements
- Strong experience with complex compute systems used in AI and HPC deployments, including backend network designs for RDMA clusters
- Experience validating complex AI infrastructure (GPUs, networking, RoCEv2, UEC) and running benchmarks such as IB perftest, RCCL, and NCCL tests
- Experience running training of LLMs, MoE models, image-generation models, and recommendation models with frameworks such as PyTorch, TensorFlow, Megatron-LM, and JAX
- Experience running inference workloads in AI clusters with inference frameworks such as vLLM and SGLang
- Experience with distributed systems and schedulers such as Kubernetes and Slurm
- Ability to write high-quality automation frameworks and scripts using Python or Golang
- Experience with performance profiling of CPUs and GPUs, and debugging complex compute, network, and storage problems
Responsibilities
- Work with AMD’s architecture specialists to validate AI solutions for distributed training and inference workloads with AMD's ROCm software
- Build cluster-scale automation for distributed training and inference workloads
- Publish reference designs and benchmark numbers for AI workloads
- Apply a data-driven approach to target optimization efforts
- Design and develop new groundbreaking AMD technologies
- Participate in new ASIC and hardware bring-ups
- Develop technical relationships with peers and partners
Other
- Bachelor’s or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
- Effective communication and problem-solving skills
- Leadership skills to drive sophisticated issues to resolution
- Ability to collaborate effectively with different teams across AMD