Ivanti is looking to enhance its Site Reliability Engineering (SRE) team to improve the reliability, deployment, and continuous operation of its cloud services, specifically by advancing observability, release automation, and chaos engineering.
Requirements
- 10+ years of Site Reliability Engineering or DevOps experience in a cloud/SaaS environment
- Experience developing and optimizing processes to improve stability, performance, and cost of SaaS products
- Experience developing and measuring key performance indicators, service level indicators, and service level objectives (Observability)
- Experience managing and troubleshooting Java and .NET applications on Kubernetes on public cloud platforms (AWS or Azure preferred)
- Experience with deployment pipeline tools such as Ansible, Jenkins, and/or GitHub Actions
- Proficiency working with and developing Infrastructure as Code (IaC)
Responsibilities
- Building and managing a team that deploys, runs, and secures Ivanti's production Software-as-a-Service (SaaS) environments in AWS and Azure
- Improving the team's effectiveness so that the SaaS products are more stable, perform better, and cost less to run
- Automating common and repetitive tasks
- Writing documentation and training material
- Training the teams
- Participating in manager on-call rotations for 24x7 coverage (follow-the-sun model) for incident response, issue triage, and problem resolution
Other
- U.S. citizenship required; must be located in the U.S.
- 3+ years of leadership/managerial industry experience
- Experience building and managing geographically separated teams
- Experience running agile-based projects
- Friendly, flexible working model