Apple Information Security (AIS) is looking to build the next generation of its core software systems to create exceptional user experiences and scalable, well-architected solutions that drive the business forward.
Requirements
- Deep experience with fault-tolerant distributed systems, multi-region architectures, idempotency, backpressure, and resiliency patterns (circuit breaking, bulkheads).
- Containers, orchestration, and IaC: Docker, Kubernetes, CI/CD, Terraform (or equivalent).
- Cloud proficiency (AWS): compute, networking, IAM, object storage; strong cost-awareness and capacity planning.
- AWS media services, such as AWS Elemental MediaLive and AWS Elemental MediaPackage.
- Datastores and messaging: SQL and NoSQL (e.g., Oracle, DynamoDB, MongoDB), caching (Redis), and streams (Kafka, Pub/Sub).
- Enterprise identity and security: OAuth2/OIDC/SAML, RBAC/ABAC, token-based auth, encryption in transit/at rest, auditing.
- Strong observability and operational excellence: metrics/logs/traces (OpenTelemetry), SLI/SLO/error budgets, on-call and incident management.
Responsibilities
- Own the end-to-end backend architecture: ingest, live/VOD encoding orchestration, packaging/manifest services, DRM/entitlements, playback APIs, and multi-region failover.
- Integrate with enterprise identity and policy: SSO (OIDC/SAML), audit trails, and policy enforcement for compliance and least privilege.
- Engineer for peak events: capacity modeling, autoscaling, load shedding, graceful degradation, and DR playbooks that sustain company-wide concurrency spikes.
- Optimize delivery for enterprise networks: internal CDNs/edge caches, multi-CDN strategies, VPN-aware routing, caching, and bandwidth efficiency.
- Build reusable platform abstractions and APIs that enable comms, HR/training, and product teams to publish, schedule, and measure content safely and reliably.
- Partner with Security, IT/Networking, Client/Player, Data/Analytics, Compliance/Legal, and Comms to align the roadmap and ensure that policy, privacy, and accessibility (captions/ASR) are first-class concerns.
- Lead incident response for live events; drive postmortems and systemic reliability improvements.
Other
- Drive a platform roadmap aligned with Apple Information Security (AIS) and product stakeholders.
- Deliver company-wide live events (e.g., all-hands meetings) that meet or exceed SLAs.
- Establish a clear reliability posture: multi-region DR, tested failover, runbooks, and reduced MTTR.
- Demonstrate mentorship and cross-team impact: stronger design rigor, docs, and velocity.
- Make build-vs-buy decisions and vendor integrations where appropriate; negotiate SLAs and own total cost of ownership.