GPU Software Architecture Engineer

Apple

Salary not specified

Nov 9, 2025

Cupertino, CA, US

The Apple Silicon GPU SW architecture team is seeking a senior/principal engineer to lead server-side ML acceleration and multi-node distribution initiatives. The role will help define and shape our future GPU compute infrastructure on Private Cloud Compute, which enables Apple Intelligence.

Requirements

Strong knowledge of GPU programming (CUDA, ROCm) and high-performance computing
Excellent systems programming skills in C/C++; Python is a plus
Deep understanding of distributed systems and parallel computing architectures
Experience with inter-node communication technologies (InfiniBand, RDMA, NCCL) in the context of ML training/inference
Understanding of how tensor frameworks (PyTorch, JAX, TensorFlow) are used in distributed training/inference

Responsibilities

Design and implement tensor/data/expert parallelism strategies for large language model inference across distributed server cluster environments
Drive hardware and software roadmap decisions for ML acceleration
Design architectures that achieve peak compute utilization and optimal memory throughput
Develop and optimize distributed inference systems with focus on latency, throughput, and resource efficiency across multiple nodes
Architect scalable ML serving infrastructure supporting dynamic model sharding, load balancing, and fault tolerance
Collaborate with hardware teams on next-generation accelerator requirements and software teams on framework integration
Lead performance analysis and optimization of ML workloads, identifying bottlenecks in compute, memory, and network subsystems

Other

Technical BS/MS degree
Apple is an equal opportunity employer that is committed to inclusion and diversity.
We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
Familiarity with the model development lifecycle, from trained model to large-scale production inference deployment
Proven track record in ML infrastructure at scale