Starbucks is seeking an Engineering Manager to lead the development and evolution of its AI and Data Platforms. The platform is the foundation on which multiple data, data science, and AI teams build data-driven and AI-powered products, and the Engineering Manager is responsible for ensuring its scalability, security, and reliability.
Requirements
- 8+ years’ experience building scalable services on public cloud infrastructure, preferably Azure.
- 8+ years’ experience designing, building, and operating large-scale distributed systems and infrastructure.
- 5+ years’ experience with data and AI platforms (e.g. Databricks, Azure, Snowflake).
- Deep knowledge of containerization & orchestration (Kubernetes, Docker), IaC and CI/CD technologies.
- Experience working with AI and machine learning frameworks (e.g. LangChain, LangGraph, Semantic Kernel, TensorFlow, PyTorch) and APIs.
- Proficiency with at least one scripting language (e.g. Python, PowerShell, Go).
- Proficiency in RAG pipelines, multi-agent orchestration, and tool use is critical.
Responsibilities
- Develop the technology vision and roadmap for the data, ML, and agentic AI platform, focusing on security, scalability, and reliability.
- Lead discussions with business stakeholders and craft solutions and improvements that advance both technical strategy and business capabilities.
- Oversee the design, implementation, and management of core infrastructure components (cloud, networking, storage, compute) to ensure they are scalable, reliable, secure, and cost-effective for data, ML & AI applications.
- Implement the foundational data infrastructure and tools required for large-scale data processing, including orchestration, storage, compute, and access management.
- Drive the design and development of agentic frameworks, orchestration layers, and AI-driven product capabilities, leveraging LLMs, intelligent automation, and adaptive workflows, and covering areas such as prompt management and performance benchmarking.
- Establish and use key metrics (e.g., DORA metrics, SLOs, SLIs, error budgets) to monitor system health and drive continuous improvement in operational practices and incident management.
- Implement and manage monitoring, logging, and alerting to ensure the platform's health and performance.
Other
- Bachelor’s degree in computer science or information systems, or equivalent experience.
- Minimum of 10 years of technology-related work experience.
- Minimum of 3 years managing a team of 5+ engineers.
- Strong communication and collaboration skills with cross-functional partners, including those with and without technical backgrounds.
- Growth-minded, solution-oriented approach with a proven track record of driving projects from concept to impact.