Wells Fargo is seeking a Senior Lead Systems Operations Engineer to join the Commercial and Corporate & Investment Banking Technology (CCIBT) organization as a Business Site Reliability Engineer (SRE). The role carries enterprise-level influence: driving strategic initiatives across infrastructure platforms, serving as a trusted advisor to senior leadership, and shaping the reliability strategy, architecture, and operating model for the infrastructure platforms that support business-critical applications.
Requirements
- 5+ years of experience in Systems Engineering or Technology Infrastructure domains, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 5+ years of experience managing production support
- 3+ years of experience in production support or operations engineering
- 5+ years of experience in Systems Engineering, Infrastructure Operations, or related domains
- Deep understanding of SRE principles and practices
- Experience with enterprise monitoring and observability platforms (e.g., Splunk, Prometheus, Grafana)
- Expertise in automation tools
Responsibilities
- Lead complex, broad-impact initiatives, including providing high-level systems consultation to the technology teams
- Serve as a key participant in large-scale planning of computer systems and network infrastructure for the Systems Operations functional area
- Review and analyze complex technical challenges and escalated support issues related to core business solutions that require in-depth evaluation of multiple factors, such as alternatives, enhancements, periodic systems reviews, or improvements to existing systems
- Make decisions on technical changes and enhancements
- Consult with the engineering team on change design, which requires a solid understanding of the technical process controls and standards that influence and drive new initiatives
- Develop and implement enterprise-wide reliability engineering frameworks
- Lead strategic planning for infrastructure operations, migrations, and hybrid cloud adoption
Other
- Hybrid work schedule with on-site presence as needed
- Participation in on-call support and incident response activities
- Strong technical expertise, communication, and stakeholder engagement skills