Site Reliability Engineering Lead - SRE - Observability

HEXAWARE

Salary not specified
Sep 27, 2025
Fort Mill, SC, USA

Hexaware is seeking an experienced Site Reliability Engineering (SRE) Lead to drive reliability, scalability, and observability across its services.

Requirements

  • Proven experience in SRE/DevOps roles with responsibility for production reliability and observability
  • Prior experience leading or mentoring engineering teams
  • Strong Python experience, particularly for server-side code, automation, and operational tooling
  • Hands-on expertise with Datadog: metrics, APM/tracing, logs, synthetics, dashboards, and alerting
  • Deep understanding of observability concepts and best practices (SLIs/SLOs, tracing, contextual logging)
  • Solid experience with container platforms and orchestration (Docker, Kubernetes)
  • Experience with CI/CD systems and pipelines (e.g., GitHub Actions, Jenkins, CircleCI, GitLab CI)

Responsibilities

  • Lead the SRE function: set technical direction, define best practices, and coach engineers on reliability and operational excellence
  • Establish and maintain SLOs/SLIs, alerting policies, and error budgets in partnership with product and engineering teams
  • Design, implement, and improve observability: metrics, traces, logs, dashboards, and runbooks (Datadog as primary tool)
  • Automate operations to reduce toil: CI/CD pipelines, automated rollouts, self-healing mechanisms, and runbook automation
  • Own incident management: lead incident response, coordinate cross-team communications, and drive blameless postmortems and remediation
  • Drive capacity planning, performance tuning, and disaster recovery planning for Python server applications and services
  • Manage tooling and infrastructure: container orchestration, infrastructure-as-code, secrets management, and monitoring integrations

Other

  • Degree in Computer Science, Engineering, or equivalent practical experience
  • Typically 5+ years in SRE/DevOps roles and 2+ years in a lead or senior position (flexible for exceptional candidates)
  • Excellent communication skills and the ability to influence cross-functional teams
  • Ability to work in a hybrid environment (2-3 days onsite per week)
  • Hexaware Technologies is an equal opportunity employer