Software Developer 3

Oracle

$79,200 - $178,100

Dec 3, 2025

Santa Clara, CA, US

The Oracle Cloud Infrastructure (OCI) Cluster Networking team is building an ultra-high-performance network to support AI/ML/HPC workloads. The team needs to design systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance.

Requirements

Strong knowledge and practical experience with NCCL is essential for this role.
1+ years of experience with collective communication libraries such as NCCL, RCCL, and MPI, and with GPU frameworks such as CUDA and ROCm
1+ years of experience with ML training frameworks such as PyTorch and TensorFlow
Proficient in programming in any two of C/C++, Python, Java, Scala, and Go
Proficient with data structures, algorithms, operating systems
Experience with RDMA programming, including but not limited to GPUDirect RDMA
Experience with distributed workload managers like Slurm or K8s

Responsibilities

Design systems that scale from tens to hundreds of thousands of GPUs without sacrificing performance
Develop and tune the software and hardware stack for distributed workloads using libraries such as NCCL on high-speed networks
Apply collective communication libraries to tune system performance at a previously unheard-of scale
Write solid code
Work across the stack

Other

5+ years of experience with software (systems/application) development
Excellent organizational, verbal, and written communication skills
Bachelor's degree in Computer Science and Engineering or related engineering fields
Master's or PhD degree in Computer Science or related engineering fields
Adaptable, self-motivated, and quick to learn