Apple's Cloud Network Infrastructure team is hiring a Site Reliability Engineer to support and scale cloud services by maintaining the high availability, scalability, and resilience of cloud network services.
Requirements
- Experience in crafting and operationalizing large-scale, distributed, fault-tolerant, multi-tenant services
- Experience with operating systems and network fundamentals
- Expert knowledge of API design and interface technologies (JSON, ProtoBuf, REST, RPC, XML, etc.)
- Strong systems programming skills, including multi-threading, concurrency, caching, and batching
- In-depth knowledge of Kubernetes (K8s), system virtualization, build systems, and infrastructure as code
Responsibilities
- Support activities such as system design engineering, software tool and platform development, capacity planning and management, and launch reviews to ensure readiness.
- Maintain service quality by monitoring and improving availability, performance, and health.
- Proactively design and implement processes to mitigate risk, reduce impact radius, and shorten incident detection and resolution times.
- Deliver sustainable incident response practices, learning from experience through blameless postmortems.
- Collaborate with cross-functional teams to drive service integrations, resolve dependencies, and represent the service offerings.
Other
- Highly self-motivated, with a passion for excellence, quality, and detail.
- Strong record of leading large multi-functional projects
- Outstanding communication skills with the ability to articulate concepts, designs and decisions.
- The base pay range for this role is between $181,100 and $318,400
- Apple is an equal opportunity employer that is committed to inclusion and diversity.