The Core OS team at Apple is working to achieve world-class reliability for Apple's operating systems by improving panic triage and developer tools.
Requirements
- Strong technical depth in operating system internals (kernel, drivers, memory management, logging, crash reports).
- Hands-on experience with panic log decoding, symbolication pipelines, or related telemetry systems.
- Experience designing and managing triage systems that classify and prioritize issues at scale.
- Proficiency in tool development using Python, Swift, C/C++ for automation and diagnostics.
- Familiarity with crash reporting frameworks (e.g., CoreDump, PanicReporter).
- Experience with kernel instrumentation or equivalent CPU tracing technologies.
- Knowledge of ML-based log classification or anomaly detection techniques.
Responsibilities
- Lead a team of engineers responsible for triaging system panics and critical kernel-level issues across macOS, iOS, watchOS, and tvOS.
- Build and evolve automated pipelines that detect, de-duplicate, and prioritize panic signatures for engineering teams.
- Develop internal diagnostic and debugging tools used across Apple to analyze device health, kernel events, and crash telemetry.
- Partner with Core OS, Hardware, and other SWE teams to improve on-device diagnostics, including lightweight monitors and self-healing systems.
- Drive post-mortem analysis and root-cause tracking, ensuring regressions are detected early and fixed efficiently.
- Champion improvements in panic data quality, symbolication coverage, and automation of triage workflows.
- Mentor engineers in systems debugging and diagnostic tool design.
Other
- BS/MS in Computer Science, Electrical Engineering, or equivalent experience.
- 8+ years of experience in systems software, embedded, or diagnostics development; 2+ years in technical leadership or management preferred.
- Attract, develop, and retain top talent while fostering a technical culture of innovation, collaboration, and excellence.
- Nurture and grow technical leaders within the SHIP organization.
- Drive engineering quality, scalability, and reliability for the platform.