Toyota is looking to drive its Kubernetes microservices and containerization strategy to ensure platform resilience, optimize resource utilization, and enable seamless disaster recovery and business continuity.
Requirements
- 7+ years of hands-on experience managing Kubernetes clusters, container orchestration, and microservices deployment in high-performance environments
- Proven expertise with DevOps automation tools such as GitHub Actions, Terraform, Ansible, Helm, Rancher, and Harness
- Strong scripting skills in Python or similar languages to build testable automation solutions
- Deep understanding of monitoring and logging frameworks including Datadog, Splunk, and Prometheus
- Advanced experience deploying and managing distributed messaging systems like Kafka, RabbitMQ, MQTT, or Amazon Kinesis
- Experience with hybrid cloud/on-premises infrastructure, including VMware and AWS services
- Familiarity with business process mining tools (Celonis, SAP Signavio, UiPath) and project management platforms (Jira, MS Project)
Responsibilities
- Own the end-to-end management of Kubernetes clusters across on-premises and cloud environments, ensuring high availability and performance
- Design, deploy, and maintain scalable microservices using Helm charts, GitOps tools like Argo CD, and CI/CD pipelines built with GitHub Actions and Terraform
- Troubleshoot and resolve complex issues spanning cluster components, networking, storage, and application layers to minimize downtime
- Implement and enforce security best practices to protect our containerized environments and applications
- Monitor system health and resource usage using tools like Datadog, Splunk, and Prometheus, driving continuous performance improvements
- Collaborate closely with infrastructure, networking, security, and application teams to align solutions with business needs and accelerate delivery
- Lead incident response efforts and conduct post-mortem analyses to prevent future disruptions
Other
- Bachelor’s degree, or equivalent experience, in software engineering, systems administration, or a related field
- Excellent analytical, problem-solving, and communication skills, with a collaborative mindset to work effectively across teams
- Experience with incident management platforms and leading cross-functional incident response
- Ability to work in an environment built on teamwork, flexibility, and respect
- Toyota does not offer sponsorship of job applicants for employment-based visas or any other work authorization for this position at this time