McKesson is looking for a Reliability Engineer to join its SRE team and to build and maintain the infrastructure that powers its applications, keeping them reliable, scalable, and secure.
Requirements
- Hands-on experience with Azure, Kubernetes (AKS), and Terraform.
- Familiarity with HashiCorp Vault, GitHub, and scripting languages like Python or Ruby.
- Experience with ArgoCD.
- Experience with Puppet or similar configuration management tools.
- Experience with monitoring and observability tools (e.g., Datadog, Prometheus, Grafana, Azure Monitor).
- Exposure to CI/CD practices and tools.
- Familiarity with cloud security best practices.
Responsibilities
- Support and maintain infrastructure in Azure Public Cloud, with a focus on scalability and reliability.
- Manage and optimize Kubernetes clusters (AKS) for high availability and performance.
- Develop and maintain infrastructure as code using HCP Terraform.
- Implement and manage secrets and access control using HashiCorp Vault.
- Collaborate with development teams to improve CI/CD pipelines using GitHub Actions.
- Write automation scripts in Python or Ruby to streamline operations and reduce manual work.
- Maintain configuration management using Puppet.
Other
- Degree or equivalent; the role typically requires 2+ years of relevant experience.
- 1–2 years of experience in a Site Reliability, DevOps, or Infrastructure Engineering role.
- Understanding of Agile principles and experience working in Kanban teams.
- Strong problem-solving skills and a passion for automation and reliability.
- McKesson is an Equal Opportunity Employer.