Apple is looking to build and optimize large-scale foundation models to power various services and products, and needs a skilled engineer to develop and deploy these models.
Requirements
- Experience with high-throughput services, particularly at supercomputing scale
- Proficient in running applications in the cloud (AWS, Azure, or equivalent) using Kubernetes, Docker, etc.
- Proficient in building and maintaining systems written in modern languages (e.g., Golang, Python)
- Familiar with at least one popular ML framework, such as PyTorch or TensorFlow
- Familiar with fundamental Deep Learning architectures such as Transformers, Encoder/Decoder models
- Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Inference Server, etc.
Responsibilities
- Work closely with product teams to build production-grade solutions for launching models that serve millions of customers in real time
- Work alongside the Foundation Model Research team to prototype and develop inference for cutting-edge model architectures
- Build tools to understand inference bottlenecks across different hardware and use cases
Other
- Bachelor’s degree or higher in Computer Science or related technical field
- 2+ years of industry experience in ML technologies (LLMs, Machine Learning, NLP, Information Retrieval, Statistics)
- Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and reimbursement for certain educational expenses, including tuition, for formal education related to advancing your career at Apple