Principal Site Reliability Engineer - Contact Center

Wells Fargo

$159,000 - $305,000

Sep 10, 2025

Iselin, NJ, US • Charlotte, NC, USA • Chandler, AZ, USA • Irving, TX, USA

Wells Fargo is looking for a Principal Engineer focused on Site Reliability Engineering (SRE) to improve the reliability, scalability, and observability of their Contact Center Technology Telephony, Workforce Management, Data and Reporting Platforms. The role aims to address challenges in application gap analysis, telemetry and alarming, and guide the SRE organization through Observability, Automation, and Cloud Engineering transformation.

Requirements

7+ years of Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
7+ years of experience leading observability and monitoring tooling
7+ years in infrastructure (Windows and Linux) support
5+ years of proven success in toil reduction initiatives
5+ years in cloud application management
Ability to troubleshoot the network stack, operating system stack and middleware
Has set up distributed tracing across an internet topology for full health checks, with the ability to pinpoint the source of a problem.

Responsibilities

Perform thorough gap analysis of applications and implement or improve telemetry and effective alarming.
Guide the SRE organization through Observability, Automation and Cloud Engineering transformation.
Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
Ensure high availability and performance of production systems through proactive monitoring and incident response.
Design and implement scalability, reliability, and observability strategies for cloud and on-premises environments.
Define SLIs (Service Level Indicators), SLOs (Service Level Objectives), and Error Budgets to improve system reliability.

Other

The ideal candidate must have strong communication skills and the ability to mentor and upskill peer platform engineers and provide expert advice.
The Principal Engineer will join production outage calls and is expected to reduce the mean time to resolve by applying senior-level troubleshooting skills.
Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
Maintain knowledge of industry best practices and new technologies and recommend innovations that enhance operations or provide a competitive advantage to the organization
Strategically engage with all levels of professionals and managers across the enterprise and serve as an expert advisor to leadership