FreeWheel, a Comcast company, is looking to improve the quality of service (QoS) delivered by its software engineering department. The goal is to build a resilient, reliable, and learning-focused culture by resolving incidents quickly, implementing long-term improvements, and maintaining consistently high-quality service across global operations and diverse customer needs.
Requirements
- 6+ years of technical experience in software engineering, site reliability, or production operations.
- Proven track record of managing the full software development lifecycle (SDLC), from requirements gathering to production release.
- Hands-on understanding of full-stack components:
  - Frontend/UI frameworks and client experience
  - APIs & service layers
  - Database layer (SQL/NoSQL, data modeling, performance tuning)
  - Backend servers and distributed systems
  - Big data & ETL pipelines (batch and streaming)
- Strong knowledge of incident management practices and tooling (PagerDuty, Jira, Datadog, Splunk, ServiceNow).
Responsibilities
- Own the Escalations lifecycle within Engineering, from first escalation through resolution.
- Lead root cause analysis (RCA) sessions that dig deeper than symptoms and deliver long-lasting fixes.
- Facilitate retrospectives and follow-ups, turning lessons learned into clear improvement plans.
- Define and track metrics (incident frequency, resolution times, client impact), and make them visible through dashboards and reports.
- Partner with teams to strengthen systems through tooling, automation, and platform hardening.
- Keep a cross-platform perspective (TV, Data, Beeswax, Strata) to spot patterns and systemic issues.
- Collaborate with Engineering (Tier 2/3) to resolve incidents quickly and share learnings across teams.
Other
- Combine deep technical expertise with strong collaboration and communication skills.
- Split your time between technical ownership and cross-functional collaboration.
- Act as the single voice for Engineering in incident management, making sure communication is consistent and clear at all levels.
- Partner with Operations (Tier 1) to fine-tune escalation paths and help reduce unnecessary hand-offs.
- Work closely with the COO team to analyze client impact and provide crisp, timely updates during incidents.
- Confidence to dive deep with engineers while also translating technical details into clear business context for executives and clients.
- Experience operating in global, multi-time-zone environments with diverse customer and platform needs.