Riot Games is seeking to evolve its next-generation ML Platform so that teams can seamlessly productionize and operate machine learning models at global scale.
Requirements
- 6+ years of experience in software engineering, with substantial time spent in platform or infrastructure teams
- Proven technical leadership in building large-scale distributed systems, production ML systems, or model serving infrastructure
- Deep experience with cloud-native systems (e.g., Kubernetes, containerization, autoscaling, observability stacks)
- Experience with one or more inference serving frameworks (e.g., NVIDIA Triton, KServe, TorchServe, BentoML, Seldon Core)
- Familiarity with GPU orchestration, performance tuning, and cost-aware scheduling
- Strong background in CI/CD automation, IaC tools (e.g., Terraform), and artifact management
- Hands-on experience with Python ML ecosystems, package management (e.g., Poetry, Conda), and vulnerability scanning
Responsibilities
- Design and implement ML inference infrastructure, supporting both real-time and nearline batch use cases, with CPU/GPU-aware orchestration and automated deployment pipelines for scalable model serving.
- Partner with researchers, game teams, and platform engineers to understand product needs and deliver generalizable, reusable solutions.
- Define and build CI/CD workflows for ML artifacts that support rapid iteration, safe promotion from dev to production, and sound MLOps practices.
- Develop tooling for environment and dependency management (e.g., Conda/Poetry lock files, secure image builds) to ensure reliable, reproducible ML runtimes.
- Implement platform observability features such as monitoring, drift detection, resource utilization, and latency tracking.
- Establish patterns and tooling for multi-version model support, blue/green and shadow deployments, and rollback.
- Be thoughtful about developer UX and take an iterative approach to improving it.
Other
- Ability to mentor engineers, write clear documentation, and influence cross-functional stakeholders
- Passion for player experience, game systems, or creative technology development
- Open paid time off policy and other perks such as flexible work schedules
- Medical, dental, and life insurance for you, your spouse/domestic partner, and children; parental leave; and a 401k with company match