Roku is working to deliver outstanding streaming experiences to millions of people around the world by developing and maintaining the core platform that powers search, personalization, and content discovery at scale across every Roku device and service.
Requirements
- 8+ years of professional experience building large-scale distributed systems
- Proficiency in modern backend languages (Java, Python, Go) and scalable cloud-native architectures (AWS, Kubernetes, service meshes, etc.)
- Deep knowledge of DevOps practices, SRE principles, infrastructure-as-code, and real-time data processing
- Experience collaborating with engineers and product stakeholders to shape features and systems
- A platform mindset: you design with reuse, observability, and scale in mind, supporting not only your team but the broader engineering organization
- Expertise in deploying and operating observability and tracing tools such as OpenTelemetry, Grafana Tempo, Thanos, Loki, and Prometheus at scale
Responsibilities
- Design, build, and operate platform infrastructure powering real-time search and personalized recommendations
- Work closely with machine learning engineers, data scientists, and infrastructure teams to scale our Machine Learning Platform
- Shape the roadmap for our next-generation architecture, including improvements in cost efficiency, observability, and resilience
- Drive tooling and standardization, working with the broader platform teams to identify and align on shared approaches
- Design and implement multi-tenant systems and APIs that accelerate development, reduce coupling, and serve multiple teams across Roku
- Take ownership of quality and system performance from design through deployment and operation in production
Other
- Master’s degree in Computer Science or Engineering, or equivalent professional experience
- Collaboration is fundamental to how we work — we invest as much in supporting our growth as we do in building exceptional systems