AI/ML Validation Engineer

AMD

Salary not specified
Oct 25, 2025
San Jose, CA, United States of America

AMD aims to accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems, by building great products and solving the world's most important challenges.

Requirements

  • Strong experience with complex compute systems used in AI and HPC deployments, and with backend network designs in RDMA clusters
  • Experience validating complex AI infrastructure (GPUs, networking, RoCEv2, UEC) and running benchmark tests such as IBPerf, RCCL, and NCCL
  • Experience running training for LLMs, MoE models, image generation models, and recommendation models with frameworks such as PyTorch, TensorFlow, Megatron-LM, and JAX
  • Experience running inference workloads in AI clusters with inference frameworks such as vLLM and SGLang
  • Experience with distributed systems and schedulers such as Kubernetes and Slurm
  • Ability to write high-quality automation frameworks and scripts using Python or Golang
  • Experience with performance profiling of CPUs and GPUs, and with debugging complex compute, network, and storage problems

Responsibilities

  • Work with AMD’s architecture specialists to validate AI solutions for distributed training and inference workloads with AMD's ROCm software
  • Build cluster-scale automation for distributed training and inference workloads
  • Publish reference designs and benchmark numbers for AI workloads
  • Apply a data-minded approach to target optimization efforts
  • Design and develop new groundbreaking AMD technologies
  • Participate in new ASIC and hardware bring-ups
  • Develop technical relationships with peers and partners

Other

  • Bachelor’s or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent
  • Effective communication and problem-solving skills
  • Leadership skills to drive sophisticated issues to resolution
  • Ability to communicate effectively and collaborate well with different teams across AMD