Palo Alto Networks is looking for a Principal SRE to build and maintain highly reliable, scalable, and secure cloud infrastructure within a FedRAMP compliant environment, driving operational excellence and SRE best practices.
Requirements
- 4+ years of experience with AWS and GCP and expertise in their architecture, services, advanced cloud networking, and PKI concepts.
- Expertise in troubleshooting and resolving cloud infrastructure and service issues, identifying root causes and devising effective solutions for high-volume transaction workloads.
- Proficiency with Python and shell scripting for automation; Golang is a plus.
- Proficiency in Infrastructure as Code (IaC) with Terraform and Helm, leveraging AI tools for development.
- Solid experience with Kubernetes, container networking, and container workloads.
- Strong Linux administration skills.
- Proficiency with CI/CD pipelines, GitOps principles, GitLab, and Jenkins.
Responsibilities
- Design, build, and operate reliable, secure cloud infrastructure across multi-cloud environments.
- Ensure applications are production-ready, scalable, and resilient, collaborating closely with developers, researchers, data scientists, and security experts.
- Develop expertise in new technologies and rapidly integrate them into our existing infrastructure, embracing continuous learning and the adoption of AI tools.
- Develop tools and automation frameworks, championing Infrastructure as Code (IaC) and Monitoring as Code (MaC) principles.
- Automate robust deployments and orchestrate end-to-end monitoring and alerting solutions.
- Participate in on-call rotations with SRE and Dev teams to support critical business and production systems.
- Lead root cause analysis of critical business and production issues, driving improvements and preventing recurrence.
Other
- Due to the government environments this position supports, candidates must be US citizens to be considered.
- BS or MS in Computer Science or a related field, or equivalent professional or military experience, required.
- Excellent written and verbal communication skills, with the ability to collaborate effectively and rally support across teams.
- Self-disciplined, self-managed, and highly driven with a strong sense of ownership and urgency.
- Ability to adapt quickly to evolving cloud technologies, security threats, and industry advancements through continuous learning.
- Ability to understand and address customer needs effectively, and to provide root cause analysis (RCA) to customers.
- Understanding of how technical decisions impact the business, and the ability to align cloud operations with business goals.