Bot Auto aims to improve quality of life for communities around the globe by revolutionizing the transportation of goods with cutting-edge autonomous trucks. The company is looking for a Senior Software Engineer, Infrastructure to architect, develop, and scale its next-generation infrastructure.
Requirements
- Deep understanding of operating systems, storage, and network architectures
- Strong software development skills in one or more languages: Python, Go, Java, JavaScript/TypeScript
- Expertise in infrastructure automation, observability, platform scaling, and cost-efficiency strategies
- In-depth experience with Kubernetes internals, including at least one of EKS, GKE, AKS, RKE2/Rancher
- Hands-on background with AWS, GCP, or Azure
- Proficiency with IaC tools such as Terraform, Pulumi, OpenTofu, or Ansible
- Experience with API Gateway platforms, including Kong or APISIX
Responsibilities
- Establish the core infrastructure across public cloud and on-premises data centers, with highly available compute, storage, and networking resources to meet diverse workload needs.
- Architect and maintain robust IaC solutions across bare-metal on-premises environments, cloud infrastructure, Kubernetes clusters, and application deployments.
- Lead the design, development, and operation of Kubernetes-based platforms, tackling dynamic scheduling, cross-cluster orchestration, autoscaling, and custom platform tooling.
- Build and manage core workflow platforms supporting data pipelines, simulation jobs, ML model pipelines, CI/CD, and other general-purpose workflows.
- Develop and operate API gateways that facilitate secure, scalable, and reliable access to internal and external services.
- Design and support a company-wide event-driven architecture spanning various protocols, built on messaging and streaming platforms.
- Architect and manage end-to-end observability solutions, covering metrics, tracing, and logging.
Other
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience
- 5+ years of hands-on experience in infrastructure, platform engineering, DevOps or SRE roles
- Handle user-impacting issues in a timely manner with proper communication.
- Mitigate issues in the short term, and follow up with long-term solutions.
- Demonstrated ability to monitor and reduce cloud/infrastructure spend without compromising performance