Support a team that manages restricted government cloud environments for federal customers by building scalable infrastructure and maintaining distributed systems, with a focus on reliability, performance, and compliance.
Requirements
- Proven ability to maintain 99.99% uptime in production environments
- 2+ years of experience with Kubernetes, Terraform, Python or Go, and AWS
- 2+ years of experience working with distributed systems
- Experience with FedRAMP compliance and government security requirements
- Track record of implementing secure CI/CD pipelines in restricted or regulated environments
- Familiarity with Redis, Kafka or Pub/Sub, and relational databases
Responsibilities
- Build tools and automation to eliminate repetitive operational tasks and reduce toil
- Maintain and scale reliable software applications using DevOps best practices
- Build and enhance CI/CD pipelines for automated testing, builds, and deployments
- Optimize and maintain Kubernetes-based orchestration systems for performance and reliability
- Troubleshoot complex production issues across application, infrastructure, and distributed system layers
- Participate in on-call rotations and support incident response
- Ensure compliance with government cloud standards across applications and infrastructure
Other
- Ability to collaborate with stakeholders and product teams on infrastructure and deployment requirements
- Strong collaboration and communication skills with cross-functional teams and divisions
- Ability to ramp up quickly and contribute in complex, large-scale environments
- Demonstrated leadership in incident management and operational reliability
- Experience in fast-paced or startup-like environments