Zendesk is looking for a Senior Staff ML Engineer to set the technical vision and architecture for its next generation of GenAI infrastructure and platform, ensuring both are secure, scalable, performant, cost-efficient, and future-proof.
Requirements
- 10+ years in ML/AI engineering, including at least 3 years in staff/principal-level platform or infrastructure leadership roles.
- Expertise in LLM systems, multi-model orchestration, and GenAI infrastructure patterns.
- Proven ability to design complex, distributed systems with high reliability, security, and scalability requirements.
- Strong experience with AWS, GCP, or Azure; Kubernetes; Docker; and distributed event-driven architectures.
- Fluency in Python and at least one other server-side language (Java, Scala, Golang, or Ruby).
- Background in agentic architectures and complex workflow orchestration for AI agents.
- Experience with, and contributions to, enterprise-scale ML platforms.
Responsibilities
- Own the end-to-end architecture for Zendesk’s GenAI platform, ensuring alignment with business goals and technical best practices.
- Set technical direction for core systems including LLM Proxy, agent orchestration layers, evaluation and benchmarking pipelines, and model observability tooling.
- Design and implement model routing, fallback strategies, and A/B testing infrastructure for LLMs from multiple vendors.
- Establish engineering standards for safety, latency, cost attribution, and reliability across all GenAI services.
- Coach Staff and Senior Engineers, providing deep technical guidance and fostering a culture of technical excellence.
Other
- Full ownership of the projects you work on.
- Work that has a huge impact.
- Team of passionate people who love what they do.
- Exciting projects and the ability to implement your own ideas and improvements.
- Opportunity to learn and grow.