Lead Site Reliability Engineer - Multiple Teams

Nexthink

Salary not specified
Sep 22, 2025
Boston, MA, USA

Nexthink is looking to build and run a high-performance cloud platform, with a specific focus on enabling the US Public Sector market, including a FedRAMP Moderate offering. This involves driving the development of modern, cloud-native SRE processes and managing operations for its multi-tenant, microservices-based cloud platform to meet federal security standards.

Requirements

  • Proficiency in cloud platforms (AWS, Azure, GCP) and cloud-native services.
  • Strong scripting and programming skills (Python, Bash, Go, or similar).
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, Crossplane, CloudFormation, or Ansible.
  • Knowledge of containerization and orchestration (Docker, Kubernetes).
  • Familiarity with CI/CD pipelines and tools (Jenkins, GitLab, GitHub, etc.).
  • In-depth knowledge of FedRAMP requirements and best practices.
  • Experience with security tools and practices (SIEM, IDS/IPS, firewalls).

Responsibilities

  • Drive automation of infrastructure provisioning, configuration, and management using Infrastructure as Code (IaC) tools.
  • Develop and maintain comprehensive monitoring, logging, and alerting systems to ensure high availability and performance.
  • Lead efforts in performance tuning and optimization for applications and infrastructure.
  • Ensure implementation and maintenance of security controls and best practices to achieve FedRAMP compliance.
  • Conduct and oversee regular security assessments, vulnerability scans, and penetration testing.
  • Lead incident management efforts, ensuring rapid resolution and thorough root cause analysis.
  • Work closely with development, operations, and security teams to integrate reliability and security into the software development lifecycle.

Other

  • Lead, mentor, and develop a team of US-based Site Reliability Engineers.
  • Foster a culture of continuous improvement, collaboration, and innovation.
  • Collaborate with the compliance team to prepare for and respond to FedRAMP audits.
  • Communicate effectively with stakeholders, providing regular updates on system performance, reliability, and compliance status.
  • Collaborate and communicate effectively with global engineering teams in EU and India time zones.