Optimize observability tools and platforms in the Network and IT domains to ensure high availability, performance, and visibility across complex telecommunications infrastructure.
Requirements
In-depth knowledge of observability, monitoring, and alerting architectures across hybrid environments (on-prem/cloud)
Proven experience with observability platforms and the data-streaming technologies that feed them, including Grafana, Splunk, AppDynamics, Kafka, Flink, and network management system (NMS) tools
Familiarity with scripting, automation, and integration technologies (e.g., Python, REST APIs, Ansible)
Understanding of modern infrastructure (e.g., containers, microservices, cloud-native environments)
Responsibilities
Develop and execute a comprehensive observability roadmap
Lead the deployment, configuration, lifecycle management, and integration of observability tools
Drive real-time visibility, alerting, telemetry, logging, and metrics collection strategies
Champion automation of observability workflows and integration with ITSM/DevOps pipelines
Enable intelligent alerting and anomaly detection to support incident response and problem management processes
Maintain versioning, upgrades, scalability plans, and decommissioning strategies for all tools in the observability stack
Other
10+ years of experience in the telecommunications or technology industry, with a strong focus on Network and IT Operations, Automation, and Observability
Bachelor’s or Master’s degree in Computer Science, Engineering, Telecommunications, or a related discipline
Strong project and people management capabilities with a track record of driving operational excellence and tool adoption