Hudl wants to improve the reliability and performance of Hudl.com, which it considers its most important feature. To achieve this, the SiteOps team needs to scale DevOps, site reliability, security engineering, and FinOps best practices across the engineering team.
Requirements
- Exposure to mature, full-stack web application code. You have 2+ years of experience building across many levels of a web application, from client-side code down to the database.
- Experienced with public clouds, preferably AWS.
- Knowledgeable in monitoring and observability technology.
- Experience working with hybrid teams.
Responsibilities
- Define, document, and drive adoption for the processes and tools used to improve production alerting and incident response sitewide at Hudl.
- Understand, evangelize, and help implement the best strategies and tools for faster discovery and resolution of production incidents.
- Help Hudl define and measure reliability metrics, such as MTTD, MTTR, and availability. You’ll help teams become more accountable for individual microservice metrics.
- Collaborate and embed with teams to eliminate architectural weaknesses or anti-patterns across our systems.
- Take on-call shifts a few times a year.
- Build adoption for new observability technologies, processes, and system architectures.
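
For candidates less familiar with the reliability metrics named above, here is a minimal sketch of how MTTD, MTTR, and availability might be computed from incident records. It is an illustration, not Hudl's tooling: the `Incident` type, its field names, and the `reliability_metrics` function are hypothetical. It also assumes that MTTR is measured from incident start to resolution (definitions vary between teams) and that incidents do not overlap in time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    started_at: datetime   # when the problem began
    detected_at: datetime  # when an alert or a person noticed it
    resolved_at: datetime  # when service was restored


def mean_timedelta(deltas):
    """Average a non-empty collection of timedeltas."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("at least one incident is required")
    return sum(deltas, timedelta()) / len(deltas)


def reliability_metrics(incidents, window):
    """Return (MTTD, MTTR, availability) over a reporting window.

    Assumes MTTR runs from start to resolution and that incidents
    do not overlap, so total downtime is the sum of their durations.
    """
    mttd = mean_timedelta(i.detected_at - i.started_at for i in incidents)
    mttr = mean_timedelta(i.resolved_at - i.started_at for i in incidents)
    downtime = sum((i.resolved_at - i.started_at for i in incidents), timedelta())
    availability = 1 - downtime / window  # fraction of the window spent "up"
    return mttd, mttr, availability


# Example: two incidents in a 30-day reporting window.
incidents = [
    Incident(datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 5),
             datetime(2024, 1, 3, 10, 30)),
    Incident(datetime(2024, 1, 17, 14, 0), datetime(2024, 1, 17, 14, 15),
             datetime(2024, 1, 17, 15, 30)),
]
mttd, mttr, availability = reliability_metrics(incidents, timedelta(days=30))
print(f"MTTD={mttd}, MTTR={mttr}, availability={availability:.4%}")
```

The same calculation could be run per microservice to support the team-level accountability described above.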
Other
- A collaborative, team-first mindset.
- Experience independently navigating uncertainty.
- Curiosity.
- Belief in the “DevOps” philosophy.
- A champion of work-life harmony.