Riot Games operates games played by millions of players worldwide, which requires building and maintaining a pipeline for operational metrics data. The goal is to keep games operable and observable so the team can respond quickly to failures and minimize the impact on player experience.
Requirements
- 4+ years of experience building, deploying, and operating features end-to-end within an existing large system
- Experience driving software engineering best practices within the team, including design reviews, coding standards, code reviews, tooling improvements, source control management, build processes, and testing
- Understanding of distributed systems, microservices, and software operating at high scale
- Comfortable using whichever language/framework is necessary for the job
- Experience with distributed systems, specifically microservices
- Experience working in container-based ecosystems and with a container scheduler (e.g. Marathon, Mesos, Kubernetes, GKE, Amazon ECS)
- Experience with Java and Go
Responsibilities
- Lead the creation of Riot-wide standards and best practices for alerting and monitoring
- Create and operate tools and services that help achieve operational excellence
- Communicate at a technical level with other development teams to help improve their services
- Characterize and identify system problems both within operations and in our tooling and services
- Mentor software engineers through code and technical design reviews
- Identify and propose fixes for systemic issues
- Provide ongoing maintenance, support, and enhancements for existing platforms
Other
- Ability to participate in an on-call rotation to ensure 24/7 system availability and handle critical incidents
- Collaborative spirit and decision-making that prioritizes fellow Rioters
- Flexible work schedules and open paid time off policy
- Medical, dental, and life insurance
- Parental leave for you, your spouse/domestic partner, and children