Optimum is looking to enhance the reliability, scalability, and operational efficiency of its IP-based network systems by bridging traditional network engineering with modern software-driven reliability practices.
Requirements
- Understanding of core routing, label-switching, and signaling protocols (BGP, OSPF, IS-IS, MPLS, RSVP).
- Hands-on experience with backbone and edge routers (Cisco, Juniper, Nokia, Arista).
- Strong familiarity with network automation tools and frameworks (Ansible, NETCONF/YANG, Python).
- Proficiency with monitoring tools and telemetry platforms (ThousandEyes, Prometheus, SNMP traps, flow tools).
- Experience in traffic engineering, peering policy management, and failover strategies.
- Strong troubleshooting skills using packet capture, route analysis, and telemetry correlation tools.
- Familiarity with cloud-native networking and SDN/NFV architectures (preferred).
Responsibilities
- Lead a cross-functional team of SREs focused on IP transport, edge, and core networking.
- Define and monitor Service Level Objectives (SLOs), Service Level Agreements (SLAs), and error budgets for IP services.
- Oversee complex incident response and postmortem processes for network outages or performance degradations.
- Introduce observability enhancements to monitor routing health, latency, jitter, and packet loss.
- Drive automation for provisioning, configuration validation, and traffic rerouting (BGP, OSPF, MPLS).
- Support infrastructure modernization (IPv6 adoption, segment routing, SDN transitions).
- Maintain compliance with security standards and operational governance across IP infrastructure.
Other
- 3+ years of experience in a technical leadership or people management role.
- Ability to lead distributed teams and drive complex network reliability programs.
- Excellent communication, collaboration, and cross-functional influence skills.
- Strong problem-solving mindset with a focus on reducing MTTR and improving operational KPIs.
- Experience managing on-call rotations and conducting operational readiness reviews.