Senior Site Reliability Engineer

Palo Alto Networks

Salary not specified
Sep 9, 2025
Santa Clara, CA, USA

The Cortex team builds and delivers the industry’s most advanced SecOps platform, consisting of XSIAM, XSOAR, and XPANSE. As a member of the Cortex DevOps team, your role involves operating and maintaining a large-scale GCP environment, including the design, implementation, and continuous enhancement of our comprehensive observability systems.

Requirements

  • High proficiency in either Google Cloud Platform or Amazon Web Services
  • High proficiency with Kubernetes and Docker for container orchestration
  • High proficiency in Python programming and Linux Shell commands
  • Experience with Terraform for infrastructure as code
  • Strong grasp of security concepts and best practices
  • Experience with observability and incident response tools
  • Ability to effectively troubleshoot and address emerging and complex problems

Responsibilities

  • Operate and maintain a large-scale GCP environment, including the design, implementation, and continuous enhancement of our comprehensive observability systems
  • Utilize your expertise in monitoring cloud platforms, particularly GCP, to optimize our infrastructure leveraging cloud-native technologies
  • Leverage incident management processes to ensure efficient resolution of system issues and minimal impact on services
  • Automate complex monitoring and alerting tasks by building tools for cloud operations, such as automated remediation of known issues and auto-scaling
  • Develop and maintain application deployment tooling based on tools such as Terraform and Helm
  • Stay up-to-date with cutting-edge technologies, evaluate their potential impact on our operations, and implement them when appropriate
  • Work with our Engineering team to influence the operability of the product and ensure the reliability and availability of our services

Other

  • US citizenship is required for this role due to FedRAMP High requirements
  • Clear understanding of incident and alert management in Site Reliability Engineering
  • 4+ years of experience as a DevOps/SRE engineer with a passion for technology and a strong motivation for high reliability at the service level
  • Effective communication and interpersonal skills, with the ability to work and coordinate between multiple teams
  • Ability to operate independently, make decisions, take action, and take responsibility