Zions Bancorporation's Enterprise Technology and Operations (ETO) team is seeking to transform its operations and ensure the reliability, scalability, and performance of its systems by establishing and expanding the Site Reliability Engineering (SRE) discipline within the organization.
Requirements
- 6+ years of experience in Site Reliability Engineering, DevOps, Infrastructure as Code (IaC), Application Development, Technology Operations, CI/CD pipelines, and automated testing frameworks.
- Proven expertise in troubleshooting complex, multi-layered technical issues across infrastructure, networking, application, and data layers.
- Hands-on experience with observability platforms (e.g., Datadog, Prometheus, Grafana, Splunk, New Relic) and a strong understanding of telemetry, metrics, logging, and tracing.
- Strong knowledge of cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
- Proficiency in scripting and automation using Python, Bash, Go, or similar languages.
- Solid understanding of networking, security, and infrastructure best practices.
- Strong background in network architecture, including experience with load balancers, firewalls, VPNs, and network troubleshooting tools.
Responsibilities
- Assist in the establishment and development of the SRE function, including defining processes, standards, and best practices.
- Design, implement, and maintain scalable and reliable infrastructure.
- Monitor system performance and troubleshoot issues to ensure high availability and resiliency.
- Develop and maintain automation tools to streamline operations and reduce manual intervention.
- Collaborate with development and infrastructure teams to design and implement solutions that enhance system reliability.
- Serve as the primary resource for difficult and complex problems, owning and driving them to completion while applying lessons learned across all enterprise services.
- Implement and manage monitoring, logging, and alerting systems.
Other
- Participate in on-call rotations to provide 24/7 support for critical systems.
- Collaborate with Enterprise Architecture to ensure adherence to current Architecture Standards.
- Relevant certifications (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator) preferred.
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field.
Apply now if you have a passion for impactful outcomes, enjoy working collaboratively with co-workers, and want to make a difference for the clients and communities we serve.