Site Reliability Engineer

Karsun Solutions

Salary not specified

Sep 15, 2025

Reston, VA, USA

Karsun is looking for a Site Reliability Engineer to build out and run production environments, automate operations, and maintain and support infrastructure to meet the reliability expectations of multiple applications.

Requirements

Deep understanding of cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Kubernetes)
Experience with monitoring, logging, and observability tools such as Datadog, AWS CloudWatch, ELK, Prometheus, and Splunk
Knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines
Experience deploying enterprise software on AWS services such as EKS, RDS, EC2, Elastic Load Balancers, Lambda, DynamoDB, and API Gateway, including multi-region deployments
Certifications such as AWS Certified DevOps Engineer or Google Professional Cloud DevOps Engineer are a plus
Experience with tools such as Jenkins, GitHub/Bitbucket, Nexus/Artifactory
Experience with tools such as Ansible, Packer, Puppet, or Chef

Responsibilities

Deploy and manage applications on Kubernetes container platforms such as AWS EKS or OpenShift
Monitor systems and applications, proactively identifying and resolving performance bottlenecks and availability issues
Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance
Implement and support integrated CI/CD pipelines for on-premises and/or cloud assets using tools such as Jenkins, GitHub/Bitbucket, Nexus/Artifactory
Conduct post-incident analyses to identify root causes and implement preventive measures to avoid future incidents
Implement, deploy, and maintain infrastructure as code (IaC) for provisioning infrastructure using AWS CloudFormation or Terraform
Design, build, and maintain automated monitoring, metrics, and notification services to support fault-tolerant and highly available systems, using tools such as AWS CloudWatch, EFK, and Prometheus

Other

Bachelor’s degree in Computer Science, Engineering, or a related field and 8-10 years of relevant experience
Ability to obtain and maintain a Public Trust clearance
Strong problem-solving and analytical skills, with a keen attention to detail
5+ years of experience supporting operations and maintenance for cloud-native production applications that are fault-tolerant, self-healing, scalable, and highly available
Travel requirements not mentioned