Monitoring Engineer

Gallagher

Salary not specified
Sep 28, 2025
Rolling Meadows, IL, US

Gallagher is seeking a Manager of Observability Engineering to build and maintain robust monitoring systems. The role ensures the reliability, performance, and scalability of the company's infrastructure by developing observability practices across cloud-native and on-premises environments.

Requirements

  • Hands-on experience with observability tools such as Nagios, Grafana, the ELK stack (Elasticsearch, Logstash, Kibana), Splunk, Dynatrace, SolarWinds, or equivalent platforms.
  • Proficiency in scripting or programming languages like Python, Perl, or Bash for automation and tool development.
  • Familiarity with cloud environments (AWS, Azure, Google Cloud) and container orchestration tools (Docker, Kubernetes).
  • Experience with infrastructure-as-code tools (Terraform, Ansible) and CI/CD pipeline management.
  • Strong analytical and problem-solving capabilities with a proactive approach to troubleshooting and resolving issues.
  • A solid understanding of security best practices, network and system security monitoring, and performance optimization techniques and tools.
  • Proven ability to design effective alerting systems and manage clear escalation processes.

Responsibilities

  • Design and Implement Observability Solutions: Develop and maintain comprehensive monitoring, logging, tracing, and alerting systems that cover both cloud-native and on-premises environments.
  • Collaborate with Cross-Functional Teams: Work closely with development, DevOps, and SRE teams to integrate observability best practices throughout all phases of the software development and operations lifecycle.
  • Create Effective Dashboards and Visualizations: Build and optimize intuitive dashboards that offer actionable insights into system performance, health, and key metrics, ensuring that teams have access to real-time data.
  • Develop Automated Alerts and Incident Management Workflows: Implement automated alerting mechanisms and incident response processes designed to detect issues early and resolve them proactively, ultimately minimizing customer impact.
  • Optimize Observability Platforms: Continually evaluate and refine observability tools and platforms to ensure they remain efficient, scalable, and user-friendly.
  • Integrate with Various Observability Tools: Manage integrations with popular observability platforms (such as Dynatrace, Grafana, the ELK Stack, Splunk, Datadog, and New Relic) to create a cohesive, unified monitoring strategy.
  • Lead Root Cause Analysis and Post-Incident Reviews: Facilitate in-depth investigations after incidents, conduct thorough root cause analyses, and recommend actionable improvements to prevent future recurrences.

Other

  • Education: Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • Experience: A minimum of 3 years in observability engineering or a similar role, demonstrating a strong grasp of observability concepts including monitoring, logging, tracing, and alerting.
  • Communication & Collaboration: Excellent interpersonal skills with the ability to communicate effectively and collaborate with cross-functional teams.
  • Location: This role will be based out of Colombia.
  • Advocate for Observability and Reliability: Champion a company-wide culture that prioritizes system observability, reliability, and performance, ensuring these principles are embedded in every project and initiative.