The SAS Decision Builder team is looking for a Sr Site Reliability Engineer to lead the design and implementation of scalable, reliable, and secure infrastructure as we transition our analytical software to a SaaS delivery model. This is a pivotal role where you will shape the future of our cloud operations, establish best practices, and ensure our systems are resilient and performant.
Requirements
- Hands-on experience with one or more object-oriented programming languages (Python, Go, or JavaScript)
- The ability to deep dive into manual processes, automate them, and reduce toil
- Hands-on experience with scaling and capacity planning for customer-facing microservices
- Hands-on experience with cloud computing (Azure)
- Experience running databases at scale in the cloud (e.g., PostgreSQL, MySQL)
- Hands-on experience with container orchestration systems and running production workloads; Kubernetes is preferred
- Hands-on experience with monitoring and alerting solutions (e.g., Datadog)
Responsibilities
- Design and develop high-quality, testable, and scalable software solutions within established timelines while adhering to R&D best practices and processes.
- Work with the R&D team to design highly scalable and reliable services in the cloud.
- Build out state-of-the-art monitoring and alerting tools.
- Participate in project scoping and scheduling; track the progress of individual tasks and alert stakeholders to issues blocking or preventing their completion.
- Ensure quality through functional, unit and performance testing.
- Build tools and automation to improve internal processes within R&D.
- Serve as a technical expert on cloud-related technologies.
Other
- 8+ years of experience in software development or programming
- Bachelor's degree in Computer Science or a related field.
- General understanding of chaos engineering methodologies
- Basic understanding of Agile methodologies
- Ability to think analytically and effectively communicate issues and solutions
- Availability to participate in a 24-hour, on-call service as part of a team rotation.
- Understanding of complex system architectures and infrastructure.
- Passion for automation, scalability, and building reliable systems from the ground up
- Solid understanding of networking, security, and identity management in cloud environments.