Monitoring Engineer

Gallagher

Salary not specified

Sep 28, 2025

Rolling Meadows, IL, US

Gallagher is seeking a Manager of Observability Engineering to build and maintain robust monitoring systems. The role ensures the reliability, performance, and scalability of the company's infrastructure by developing observability practices across cloud-native and on-premises environments.

Requirements

Hands-on experience with observability tools such as Nagios, Grafana, the ELK stack (Elasticsearch, Logstash, Kibana), Splunk, Dynatrace, SolarWinds, or equivalent platforms.
Proficiency in scripting or programming languages like Python, Perl, or Bash for automation and tool development.
Familiarity with cloud environments (AWS, Azure, Google Cloud) and with container and orchestration tools (Docker, Kubernetes).
Experience with infrastructure-as-code tools (Terraform, Ansible) and CI/CD pipeline management.
Strong analytical and problem-solving capabilities with a proactive approach to troubleshooting and resolving issues.
A solid understanding of security best practices, network and system security monitoring, and performance optimization techniques and tools.
Proven ability to design effective alerting systems and manage clear escalation processes.

Responsibilities

Design and Implement Observability Solutions: Develop and maintain comprehensive monitoring, logging, tracing, and alerting systems that cover both cloud-native and on-premises environments.
Collaborate with Cross-Functional Teams: Work closely with development, DevOps, and SRE teams to integrate observability best practices throughout all phases of the software development and operations lifecycle.
Create Effective Dashboards and Visualizations: Build and optimize intuitive dashboards that offer actionable insights into system performance, health, and key metrics, ensuring that teams have access to real-time data.
Develop Automated Alerts and Incident Management Workflows: Implement automated alerting mechanisms and incident response processes designed to detect issues early and resolve them proactively, ultimately minimizing customer impact.
Optimize Observability Platforms: Continually evaluate and refine observability tools and platforms to ensure they remain efficient, scalable, and user-friendly.
Integrate with Various Observability Tools: Manage integrations with popular observability platforms—such as Dynatrace, Grafana, the ELK Stack, Splunk, Datadog, and New Relic—to create a cohesive, unified monitoring strategy.
Lead Root Cause Analysis and Post-Incident Reviews: Facilitate in-depth investigations after incidents, conduct thorough root cause analyses, and recommend actionable improvements to prevent recurrence.

Other

Education: Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Experience: A minimum of 3 years in observability engineering or a similar role, demonstrating a strong grasp of observability concepts including monitoring, logging, tracing, and alerting.
Communication & Collaboration: Excellent interpersonal skills with the ability to communicate effectively and collaborate with cross-functional teams.
Location: This role will be based out of Colombia.
Advocate for Observability and Reliability: Champion a company-wide culture that prioritizes system observability, reliability, and performance, ensuring these principles are embedded in every project and initiative.