General Motors is hiring a Site Reliability Engineering Manager to improve system reliability and efficiency. The manager will lead a team, set its priorities, and keep its work aligned with organizational goals.
Requirements
- Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems.
- Solid understanding of operating systems, networking, distributed systems, databases, and storage architectures.
- Deep understanding of how code runs on underlying hardware, including operating systems, algorithms, and data structures.
- Experience handling production incidents, including root cause analysis, mitigation, and working through complex system failures.
- Experience with cloud platforms (AWS, GCP, Azure).
- Familiarity with container orchestration systems like Kubernetes.
- A track record of managing or developing distributed systems.
Responsibilities
- Develop tools and software to automate operational processes, improve system reliability, and reduce manual intervention.
- Lead, implement, and improve monitoring and observability frameworks, enabling proactive detection and resolution of incidents.
- Participate in an on-call rotation to diagnose, troubleshoot, and mitigate production incidents, ensuring minimal downtime and swift resolution.
- Work alongside developers to ensure the quality, scalability, and reliability of our services.
- Maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to manage reliability expectations effectively.
- Conduct deep-dive analyses of incidents and collaborate on post-incident reviews to derive learnings and prevent recurrence.
- Evaluate system performance and advocate for optimizations that reduce infrastructure costs while maintaining service reliability.
Other
- Bachelor’s degree in computer science or a related field, or equivalent work experience.
- 8+ years of experience on software development teams.
- Strong communication skills, with an ability to explain technical concepts to both engineering and business stakeholders.
- Commitment to collaborative problem-solving and shared ownership of services.
- GM DOES NOT PROVIDE IMMIGRATION-RELATED SPONSORSHIP FOR THIS ROLE.