Software Delivery - Site Reliability Engineer

Apple

$141,800 - $258,600

Sep 12, 2025

San Diego, CA, US

Apple is looking to build the next generation of release technologies that power Apple's development lifecycle and shape the future of how Apple delivers software to millions of customers.

Requirements

Experience as a Site Reliability Engineer, DevOps Engineer, or Software Engineer focused on infrastructure in a large-scale distributed environment.
Strong software development skills in a language like Swift, Go, or Python, and a high degree of comfort with shell scripting (Bash).
Hands-on experience building and managing systems with container orchestration tools (Kubernetes, Docker).
Deep understanding of networking (TCP/IP, DNS, HTTP) and experience using observability tools (monitoring, logging, tracing) to diagnose complex issues.
Expertise in performance analysis and capacity planning for global, distributed systems.
Experience with large-scale distributed databases (e.g., Cassandra, FoundationDB) or messaging systems (e.g., Kafka).
Familiarity with using Generative AI (GenAI) or Large Language Models (LLMs) to accelerate operational tasks, such as automating runbooks, generating scripts, or analyzing incident data.

Responsibilities

Design, build, and maintain robust, scalable, and observable systems for our core software delivery services.
Reduce operational toil by developing automation and tooling to prevent and rapidly resolve production issues.
Own and refine our incident management processes to ensure high availability.
Partner with development teams to create elegant, high-quality solutions that support the entire workflow, from source code to customer release.
Proactively identify and eliminate technical debt to improve long-term reliability and maintainability.
Proven experience leading initiatives to reduce technical debt, refactor systems, or improve performance and latency.
Demonstrated ability to lead incident response for high-impact outages.

Other

The most important thing is a deep commitment to building reliable systems and strong collaboration with team members across different time zones.
Excellent problem-solving and communication skills, with a strong sense of ownership and drive.
We know that great talent comes from a variety of backgrounds, and we encourage you to apply even if you don’t meet every single requirement.