The Apple Media Search Platform team is looking to build the best music search experience across all Apple platforms, supporting millions of users and billions of transactions. We do this by enhancing thousands of compute and big data pipelines to deliver greater scalability, reliability, and efficiency.
Requirements
- Proficiency in Scala, Python, and other scripting languages.
- Hands-on experience with and solid understanding of distributed systems, performance tuning, and resource optimization.
- Strong hands-on expertise with Apache Spark and the Hadoop ecosystem.
- Experience developing or applying machine learning techniques or LLM-based agentic workflows for data pipeline optimization and data quality improvements.
- Knowledge of cost optimization strategies for big data infrastructure.
Responsibilities
- Develop automation and LLM-based agents to automatically increase test coverage for data pipelines in a monorepo environment.
- Develop automation and LLM-based agents to optimize Spark job resource utilization, including both CPU and memory efficiency.
- Develop LLM-powered agents to automatically diagnose failures in large-scale data pipelines.
- Build tools and automation that improve engineering productivity across development, testing, and production deployment of new pipelines.
- Design and maintain dashboards to improve observability of pipeline execution and verification.
- Deliver cost-efficient solutions for storage and compute platform migrations through automation and advanced machine learning techniques.
Other
- Bachelor’s degree in Computer Science, Computer Engineering, or a related field.
- 3+ years of experience with large-scale data processing and pipelines.
- Comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and reimbursement for certain educational expenses, including tuition, for formal education related to advancing your career at Apple.