Google's services need reliability, uptime appropriate to users' needs, and a fast rate of improvement. Site Reliability Development ensures this by combining software and systems development to build and run large-scale, massively distributed, fault-tolerant systems, while also keeping an eye on system capacity and performance.
Requirements
- 1 year of experience with software development in one or more programming languages during coursework/projects, research, internships, or practical experience in school, work, or Open Source projects.
- 1 year of experience with data structures or algorithms.
Responsibilities
- Write product or system development code.
- Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
- Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.
- Triage product or system issues, then debug, track, and resolve them by analyzing the sources of issues and their impact on hardware, network, or service operations and quality.
- Participate in or lead design reviews with peers and stakeholders to decide among available technologies.
- Design, develop, test, deploy, maintain, and enhance software solutions.
Other
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- Master's degree in Computer Science or Engineering, or a related field.
- Manage project priorities, deadlines, and deliverables.
- Intellectual curiosity, problem solving, and openness.
- Collaborate, think big, and take risks in a blame-free environment.