Databricks’ Model Serving product provides enterprises with a unified, scalable, and governed platform to deploy and manage AI/ML models, from traditional ML to fine-tuned and proprietary large language models. It offers real-time, low-latency inference along with governance, monitoring, and lineage. As AI adoption accelerates, Model Serving is a core pillar of the Databricks platform, enabling customers to operationalize models at scale with strong SLAs and cost efficiency.
Requirements
- 10+ years of experience building and operating large-scale distributed systems.
- Deep expertise in model serving, inference systems, and related infrastructure (e.g., routing, scheduling, autoscaling, and observability).
- Strong foundation in algorithms, data structures, and system design as applied to large-scale, low-latency serving systems.
- Proven ability to deliver technically complex, high-impact initiatives that create measurable customer or business value.
- Experience leading architecture for large-scale, performance-sensitive CPU/GPU inference systems.
Responsibilities
- Design and implement core systems and APIs that power Databricks Model Serving, ensuring scalability, reliability, and operational excellence.
- Partner with product and engineering leadership to define the technical roadmap and long-term architecture for serving workloads.
- Drive architectural decisions and trade-offs to optimize performance, throughput, autoscaling, and operational efficiency for CPU and GPU serving workloads.
- Contribute directly to key components across the serving infrastructure, from model container builds and deployment workflows to runtime systems such as routing, caching, observability, and intelligent autoscaling, so that the service runs smoothly and efficiently at scale.
- Collaborate cross-functionally with product, platform, and research teams to translate customer needs into reliable and performant systems.
- Lead technical initiatives that improve latency, availability, and cost-effectiveness across both customer-facing and foundational serving layers.
- Establish best practices for code quality, testing, and operational readiness, and mentor other engineers through design reviews and technical guidance.
Other
- Strong communication skills and ability to collaborate across teams in fast-moving environments.
- Strategic and product-oriented mindset with the ability to align technical execution with long-term vision.
- Passion for mentoring, growing engineers, and fostering technical excellence.
- If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.