Sr. Staff Software Development - RCCL, GPU communication libraries

AMD

Salary not specified
Sep 26, 2025
Austin, TX, US

Develop multi-node GPU communication libraries that enable high-performance computing and machine learning workloads at exascale for AMD.

Requirements

  • Strong background developing applications and libraries in C, C++, and Python
  • Experience working with RoCE (RDMA over Converged Ethernet), Libfabric, and InfiniBand
  • Experience working with the Linux kernel, device drivers, and network drivers
  • Experience designing and building GPU Networks for Large Scale Clusters
  • Experience with collective communication libraries (MPI, RCCL, SHMEM) and with optimizing collective communication to scale across distributed systems
  • In-depth knowledge of best practices in software development, including testing, profiling, debugging, documentation, version control, issue tracking, and planning
  • GPU software development using HIP, CUDA, or OpenCL

Responsibilities

  • Support AMD’s RCCL, an open-source, GPU-accelerated collective communication middleware, and related technologies
  • Design, implement, and test networking features for multi-GPU and multi-node communication libraries.
  • Benchmark, profile and optimize code to maximize throughput on single-GPU, multi-GPU and clustered systems
  • Deliver high-quality code and documentation following best practices for open source software development
  • Work with key technical experts across AMD and with our partners and customers to improve ROCm applications, libraries, and tools
  • Deploy the libraries on large clusters and debug complex system-level issues that can span different layers of the software stack: GPU kernel drivers, NIC drivers, etc.

Other

  • Accustomed to working in a dynamic, geographically distributed agile team, where partnership and collaboration are paramount.
  • Possess excellent written and verbal communication skills, strong attention to detail, and the ability to express your work in a clear, cohesive fashion.
  • Results-oriented and accustomed to tight deadlines and changing priorities.
  • Constantly thinking of ways to improve performance of software and hardware.
  • B.Sc. or B.Eng. degree in Computer Science, Software Engineering, Electrical Engineering, or equivalent