Cloud Reliability Engineer - Platform Infrastructure

CACI

$69,100 - $141,500
Sep 23, 2025
Hampton, VA, USA • San Antonio, TX, USA • Omaha, NE, USA • Rome, NY, USA

CACI is looking for a Cloud Reliability Engineer to design, build, and support cloud-native infrastructure and platform services for critical Department of Defense mission systems. The role owns the reliability of the product and directly affects user outcomes.

Requirements

  • Deep expertise with AWS (GovCloud/SC2S), Kubernetes (EKS or self-managed), Linux, and CI/CD tools.
  • Proficiency in Bash or Python.
  • Hands-on experience with Git/GitLab, container registries, and infrastructure monitoring.
  • Strong understanding of cloud security, IAM, networking, and platform lifecycle management.
  • Proven ability to translate user needs into measurable reliability targets and implement user-focused SLOs.
  • Solid grasp of cloud networking, load balancing, and DNS.
  • Certifications: CompTIA Cloud+ or Security+; GICSP, SSCP, or GSEC.

Responsibilities

  • Engineer Cloud-Native Platforms: Design, deploy, and maintain robust Kubernetes clusters and supporting services across AWS GovCloud and Azure.
  • Drive User-Centric Reliability: Collaborate to understand Critical User Journeys (CUJs). Define and implement product-level Service Level Objectives (SLOs), focusing on user-visible behaviors and outcomes (availability, latency, etc.).
  • Automate Everything: Provision, configure, and monitor platforms using Infrastructure-as-Code (Terraform, CloudFormation).
  • Enhance Observability: Implement telemetry and request-level annotation that link infrastructure requests directly to product functionality and mission partner objectives.
  • Secure & Comply: Manage identity, access, patching, logging, and backups in multi-tenant environments, integrating RMF, Zero Trust, and IL5+ hardening into platform design.
  • Troubleshoot with Impact: Prioritize and resolve platform service and infrastructure issues based on user impact and product criticality.
  • Collaborate & Document: Work within Agile teams, contribute to user objective refinement, and maintain comprehensive system documentation.

Other

  • Active TS/SCI Clearance.
  • Bachelor’s degree in a technical field with 3+ years of relevant experience.
  • Excellent communication and troubleshooting skills, with a focus on end-user experience and product reliability.
  • Experience with Air Force or DoD platform infrastructure environments (e.g., Platform One, Iron Bank, Big Bang).
  • Familiarity with Atlassian tools and DevSecOps workflows.