The Oracle Cloud Infrastructure (OCI) Cluster Networking team is building an ultra-high-performance network to support AI/ML/HPC workloads and is designing systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance.
Requirements
- Strong knowledge of, and practical experience with, NCCL
- Experience with collective communication libraries such as NCCL, RCCL, and MPI, and with GPU frameworks such as CUDA and ROCm (a minimal NCCL sketch follows this list)
- Experience with ML training frameworks such as PyTorch and TensorFlow
- Proficient in at least two of C/C++, Python, Java, Scala, and Go
- Proficient with data structures, algorithms, and operating systems
- Experience with RDMA programming, including but not limited to GPUDirect RDMA
- Experience with distributed workload managers such as Slurm or Kubernetes (K8s)
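The sketch below is one illustration of the collective-communication work named above: a single-process, multi-GPU all-reduce using the public NCCL API (ncclCommInitAll, ncclAllReduce). The device count, buffer size, and error-handling macros are illustrative assumptions, not part of this posting.

```c
#include <stdio.h>
#include <cuda_runtime.h>
#include <nccl.h>

#define NDEV  4             /* assumption: 4 visible GPUs on one host */
#define COUNT (1 << 20)     /* 1M floats per device */

#define CHECK_CUDA(cmd) do { cudaError_t e = (cmd); if (e != cudaSuccess) { \
  fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)
#define CHECK_NCCL(cmd) do { ncclResult_t r = (cmd); if (r != ncclSuccess) { \
  fprintf(stderr, "NCCL error: %s\n", ncclGetErrorString(r)); return 1; } } while (0)

int main(void) {
  int devs[NDEV];
  ncclComm_t comms[NDEV];
  float *sendbuff[NDEV], *recvbuff[NDEV];
  cudaStream_t streams[NDEV];

  /* Allocate a send/recv buffer and a stream on each device. */
  for (int i = 0; i < NDEV; ++i) {
    devs[i] = i;
    CHECK_CUDA(cudaSetDevice(i));
    CHECK_CUDA(cudaMalloc((void **)&sendbuff[i], COUNT * sizeof(float)));
    CHECK_CUDA(cudaMalloc((void **)&recvbuff[i], COUNT * sizeof(float)));
    CHECK_CUDA(cudaMemset(sendbuff[i], 0, COUNT * sizeof(float)));
    CHECK_CUDA(cudaStreamCreate(&streams[i]));
  }

  /* One communicator per device, all joined into a single clique. */
  CHECK_NCCL(ncclCommInitAll(comms, NDEV, devs));

  /* Launch the sum all-reduce on every device inside one group call. */
  CHECK_NCCL(ncclGroupStart());
  for (int i = 0; i < NDEV; ++i)
    CHECK_NCCL(ncclAllReduce(sendbuff[i], recvbuff[i], COUNT, ncclFloat, ncclSum,
                             comms[i], streams[i]));
  CHECK_NCCL(ncclGroupEnd());

  /* Wait for completion, then release resources. */
  for (int i = 0; i < NDEV; ++i) {
    CHECK_CUDA(cudaSetDevice(i));
    CHECK_CUDA(cudaStreamSynchronize(streams[i]));
    CHECK_CUDA(cudaFree(sendbuff[i]));
    CHECK_CUDA(cudaFree(recvbuff[i]));
    CHECK_CUDA(cudaStreamDestroy(streams[i]));
  }
  for (int i = 0; i < NDEV; ++i)
    CHECK_NCCL(ncclCommDestroy(comms[i]));

  printf("all-reduce complete on %d devices\n", NDEV);
  return 0;
}
```

A multi-node variant of the same pattern would use ncclGetUniqueId and ncclCommInitRank, with the ID distributed over MPI or a launcher such as Slurm.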
Responsibilities
- Design systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance
- Work across the stack to develop and tune the software and hardware used for distributed workloads, with libraries such as NCCL on high-speed networks
- Apply collective communication libraries to tune system performance at unprecedented scale (a minimal benchmarking sketch follows this list)
- Collaborate with other engineers to design and implement ultra-high-performance networks
- Use collective communication libraries such as NCCL, RCCL, and MPI, and GPU frameworks such as CUDA and ROCm
- Work with ML training frameworks such as PyTorch and TensorFlow
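As one assumed (not prescribed by this posting) illustration of performance tuning at the collective-communication layer, the sketch below times repeated MPI_Allreduce calls and reports a rough per-rank algorithm bandwidth; the message size and iteration count are arbitrary choices.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  const size_t count = 1 << 22;   /* 4M floats (16 MiB) per rank */
  float *sendbuf = (float *)malloc(count * sizeof(float));
  float *recvbuf = (float *)malloc(count * sizeof(float));
  for (size_t i = 0; i < count; ++i) sendbuf[i] = 1.0f;

  /* Time several iterations of the collective to average out jitter. */
  const int iters = 20;
  MPI_Barrier(MPI_COMM_WORLD);
  double start = MPI_Wtime();
  for (int it = 0; it < iters; ++it)
    MPI_Allreduce(sendbuf, recvbuf, (int)count, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
  MPI_Barrier(MPI_COMM_WORLD);
  double elapsed = (MPI_Wtime() - start) / iters;

  if (rank == 0) {
    /* Bytes per rank divided by time: a rough proxy for network utilization. */
    double gb = (double)count * sizeof(float) / 1e9;
    printf("ranks=%d  avg allreduce time=%.3f ms  algbw=%.2f GB/s\n",
           size, elapsed * 1e3, gb / elapsed);
  }

  free(sendbuf);
  free(recvbuf);
  MPI_Finalize();
  return 0;
}
```

Such a benchmark would typically be launched under a workload manager, e.g. `srun -N 2 -n 16 ./allreduce_bench` or `mpirun -np 16 ./allreduce_bench`.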
Other
- Bachelor's degree in Computer Science, Engineering, or a related engineering field
- Excellent organizational, verbal, and written communication skills
- 5+ years of experience with software (systems/application) development
- Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates
- Adaptable and self-motivated, with the ability to learn quickly and work across the stack