Apple's Platform Architecture group is responsible for connecting hardware and software into one unified system, driving the development of system-on-a-chip architecture, and developing forward-looking prototype systems for machine learning applications.
Requirements
- Ability to program in C/C++ and/or Python
- Knowledge of computer architecture fundamentals
- Domain knowledge of at least one hardware IP block, such as ML hardware accelerators, GPUs, CPUs, image/video processing units, or similar
- Experience in efficient implementation of machine learning algorithms
- Experience in creating system or IP performance models/simulations
- Ability to prototype and benchmark algorithms on CPU/GPU/Neural Engine, analyze performance metrics, and create high-level complexity models
- Ability to develop performance models and bit-accurate models of hardware accelerators
Responsibilities
- Create optimized implementations of ML workloads on Apple silicon, including the Neural Engine, GPU, and CPU.
- Collaborate with IP and SoC architecture teams to develop performance models and simulations of future hardware.
- Conduct performance studies to inform and validate architecture decisions.
- Collaborate with the system team to create high-level performance models of emerging ML techniques and analyze system architecture trade-offs.
- Explore different ways of mapping ML workloads to Apple silicon and develop performance models/simulations.
- Gain insight into how to make workloads run efficiently on our IPs and SoCs, and communicate the findings to software and algorithm teams.
Other
- Bachelor's degree
- MS or PhD in EE/CE/CS or related field, or 3+ years of relevant experience
- Verbal and written communication skills for collaborating with partner teams
- Familiarity with deep learning frameworks such as PyTorch
- Understanding of compiler frameworks/technologies