Apple is looking to design and scale the platforms that power its Siri, Search, and AI/ML ecosystems by building the next generation of cloud-native ML infrastructure.
Requirements
- Strong programming experience in Golang and/or Rust; expertise in building controllers, operators, or automation systems.
- Deep understanding of Kubernetes internals, controller-runtime, and Crossplane composition frameworks.
- Experience with ArgoCD, Helm, and Infrastructure-as-Code (Terraform, Pulumi, or Crossplane).
- Hands-on experience with GitOps, declarative configuration, and reconciliation-driven workflows.
- Proven ability to design and operate infrastructure for ML training and inference, including performance tuning and GPU optimization.
- Deep expertise in Kubernetes API machinery, custom resources (CRDs), and control plane development.
- Experience with Model Context Protocol (MCP)-based systems or contextual orchestration servers.
Responsibilities
- Architect and develop cloud-native, agentic infrastructure platforms supporting ML training, inference, and large-scale distributed systems.
- Lead and mentor engineers building Crossplane-based control planes, Kubernetes operators, and ArgoCD-driven GitOps automation.
- Design, build, and optimize Model Context Protocol (MCP) servers that manage and contextualize infrastructure and application state across environments.
- Contribute to and upstream improvements in open-source CNCF projects, representing Apple in the cloud-native community.
- Implement observability, governance, and automation frameworks to ensure performance, reliability, and compliance.
- Collaborate with AI/ML and infrastructure teams to integrate agentic orchestration workflows for self-service provisioning, ML pipeline management, and dynamic scaling.
- Drive best practices for GitOps, IaC, and Kubernetes cluster lifecycle automation at global scale.
Other
- Experience leading technical teams, driving architecture decisions, and mentoring engineers.
- Excellent communication, technical writing, and cross-functional leadership skills.
- BS/MS in Computer Science or a related field (or equivalent practical experience).
- 9+ years of experience in cloud infrastructure, SRE, or distributed systems roles.
- Active contributor to CNCF open-source projects (e.g., Kubernetes, Crossplane, ArgoCD, Envoy, Prometheus).