Lead platform engineering initiatives focused on Kubernetes-based infrastructure and AI coding tool integration. Drive the design and development of a custom Kubernetes development platform, build critical tooling for scalable deployment, and contribute to the evolution of AI agent capabilities.
Requirements
- Proficiency in Python, Go, and Node.js.
- Deep experience with Kubernetes or similar technologies (e.g., Docker Compose, Docker Swarm, AWS ECS).
- Strong understanding of container image development using Docker or Podman.
- Experience with GitHub Actions for CI/CD workflows.
- Familiarity with observability tools and concepts, especially Prometheus and OpenTelemetry.
- Experience implementing feature flags in distributed systems.
- Hands-on experience with AI coding tools and agent tool development.
Responsibilities
- Lead design and architecture efforts in collaboration with client leads.
- Develop a custom-built Kubernetes development platform, including writing and reviewing technical design documents.
- Build and maintain a Kubernetes Operator to:
  - Orchestrate upgrades and workload clusters.
  - Automate deployment of new workload clusters.
- Support internal development teams by resolving platform-related inquiries and guiding best practices.
- Champion platform engineering best practices including CI/CD, observability, and infrastructure as code.
- Read and debug unfamiliar codebases and systems with minimal guidance.
Other
- Mentor junior engineers and foster a culture of technical excellence and continuous learning.
- 100% remote, primarily supporting Eastern Time work hours.
- Exceptional reading comprehension and ability to work outside your comfort zone:
  - Navigating unfamiliar codebases.
  - Debugging complex systems without prior exposure.