Yahoo is looking to build industry-leading, next-generation AI-powered cloud platforms, services, and tools that underpin its products. The team needs to build and maintain systems at scale to serve hundreds of millions of users and billions of daily pageviews.
Requirements
- Experience working with AWS or GCP public cloud
- Strong knowledge of cloud fundamentals including automation workflows, application deployments, monitoring, networking, and identity/access management
- Strong proficiency with Kubernetes (cluster management, operators, CRDs, Helm, GitOps)
- Knowledge of observability tooling (Prometheus, Grafana, OpenTelemetry) for collecting data to feed AI models
- Expertise in at least one scripting language
- Skilled in identifying performance bottlenecks and anomalous system behavior, and in determining the root cause of incidents in cloud environments
- Knowledge of service mesh technologies such as Istio is preferred
Responsibilities
- Develop and enhance Terraform modules to standardize cloud features in the Yahoo environment
- Configure and automate Helm deployments for services hosted in Kubernetes
- Design and deploy AI-driven troubleshooting systems and cost optimization features across all managed Kubernetes clusters
- Create intelligent onboarding agents that guide developers to configure and deploy applications seamlessly on the cloud
- Automate setup, security, and scaling best practices through AI-driven workflows
- Diagnose, resolve, and prevent issues in production application infrastructure, using your knowledge of Kubernetes, UNIX tools, and cloud technologies
- Implement and maintain monitoring that provides full visibility into application and system state, history, and trends
Other
- 3+ years of experience with a BS/MS in Computer Science or equivalent
- Flexible hybrid work options
- Exercising sound judgment
- Working effectively, safely, and inclusively with others
- Exhibiting trustworthiness and meeting expectations