Uniphore is looking for a Staff Site Reliability Engineer (SRE) to join its Platform Engineering team. The role builds the standards, frameworks, and self-service capabilities that enable feature teams to own and operate their services end-to-end with enterprise-grade reliability and security.
Requirements
- 10+ years in DevOps/SRE roles with proven experience transforming operational models
- Deep experience with incident management, RCA processes, and on-call system design
- Expert-level knowledge of AWS, Kubernetes, Infrastructure as Code (Terraform), and cost optimization
- Strong technical writing skills and the ability to create comprehensive operational procedures
- Experience with RFC/PRD review processes, API design patterns, and service architecture
- Track record of enabling team capabilities rather than doing work for teams
- Proficiency in scripting (Go, Python, Bash, etc.) and CI/CD systems
Responsibilities
- Review RFCs and PRDs to prevent downstream issues, and provide architectural guidance during planning phases
- Create documentation, knowledge bases, and tooling that eliminate support dependencies
- Design incident response frameworks, escalation procedures, and comprehensive playbooks that teams can self-execute
- Define technical standards, operational frameworks, and service readiness criteria that enable team autonomy
- Guide teams through ownership maturity, scorecard compliance, and operational best practices
- Maintain multi-tenant, multi-cloud infrastructure while building the systems that make your expertise scalable
Other
- Drive our ongoing transformation from reactive support to strategic platform engineering partnership
- Accelerate our standards framework that's enabling engineering teams across the organization to operate independently
- Scale our self-service capabilities that are already reducing support escalations and increasing team velocity
- Dive deep into complex technical challenges: identify root causes and build solutions that prevent entire classes of problems across the platform
- Apply high-leverage engineering: your solutions will multiply across dozens of services and teams, compounding their impact