Google aims to make GKE the leading platform for running Generative AI (GenAI) inference workloads: the most cost-effective, the simplest to use, and the fastest. Large GenAI models running on accelerators present scaling and usability challenges that traditional CPU workloads do not.
Requirements
- 5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware architecture.
- 5 years of technical expertise building AI/ML infrastructure.
- 8 years of experience with data structures/algorithms.
- 3 years of experience with the AI/ML inference stack.
Responsibilities
- Define and pioneer initiatives to deliver the most efficient and cost-effective AI inference workloads on GKE, translating customer needs and engineering analysis into multi-quarter technical roadmaps.
- Author detailed system designs for large-scale, cross-team inference projects, drive consensus on them, and finalize them; guide and review the designs of other senior engineers for architectural robustness and scalability.
- Uncover, scope, and prioritize significant areas of technical debt across the inference and core GKE systems, then develop strategies for paying it off and delegate the execution.
- Serve as the primary reviewer and technical authority for critical components, establishing and enforcing best practices that set the standard for the entire team and partner teams.
- Design and oversee the development of advanced test, monitoring, and scalable benchmarking infrastructure that anticipates future needs and prevents bottlenecks.
Other
- Seattle, WA, USA; Kirkland, WA, USA
- 3 years of experience in a technical leadership role leading project teams and setting technical direction.
- 3 years of experience working in a complex, matrixed organization involving cross-functional, or cross-business projects.
- Versatile, with leadership qualities and enthusiasm for taking on new problems across the full stack.