SRE Manager

o9 Solutions

$149,818 - $205,999

Oct 1, 2025

Dallas, TX, USA

o9 aims to transform decision-making through an AI-first approach: integrating siloed planning capabilities and capturing value leakage so that businesses can plan smarter and faster, improve operational efficiency, and reduce waste.

Requirements

Strong knowledge of cloud platforms (AWS, Azure, GCP) and container orchestration (Kubernetes).
Expertise in observability tools (Prometheus, Grafana, Datadog, etc.) and incident management platforms.
Experience with infrastructure-as-code and configuration management tools (Terraform, Ansible, Helm, etc.).
Solid understanding of networking, security, Linux internals, and distributed systems.
Relevant cloud certifications (AWS, Azure, or GCP) strongly preferred.
Certified Kubernetes Administrator (CKA) certification is a plus.
Experience operating complex, cloud-native production systems at scale.

Responsibilities

Hire, mentor, and manage a globally distributed team of Site Reliability Engineers.
Own system uptime and SLA compliance across o9’s cloud-native production environment.
Drive root cause analysis and implement post-incident learning processes to improve system resilience.
Oversee the design and implementation of robust monitoring, alerting, and logging solutions.
Lead initiatives to improve infrastructure automation, deployment pipelines, and CI/CD practices.
Champion Infrastructure as Code (IaC) and GitOps best practices.
Manage capacity planning, scalability efforts, and performance tuning across services.

Other

Bachelor’s degree in Computer Science, Engineering, or a related field required; Master’s degree preferred.
8+ years of experience in DevOps, SRE, or infrastructure roles, with 2+ years leading or managing technical teams.
Proven ability to lead technical teams through high-stakes, high-impact situations.
Strong communication skills with the ability to translate complex topics into clear stakeholder updates.
Strategic mindset with a bias for action and problem-solving.