Roblox is experiencing rapid growth in daily active users and needs to build a data platform that can support tens of millions of monthly players, powering features across the company and enabling efficient and accurate product analytics.
Requirements
- 3-5 years of experience building and operating large-scale data lakes on AWS and/or on-prem with relevant technologies such as Hive, Kubernetes, Iceberg, and Flink.
- Solid understanding of Spark, and ability to write, debug and optimize Spark or PySpark code.
- Proficiency in at least one of these languages: Python or Java.
- Deep understanding of at least one of the following data domains: ETL/ELT, Storage and Cost, Spark, Discovery and Metadata, Quality/Observability (O11y), Governance.
Responsibilities
- Push our data platform to be robust and opinionated in domains such as data discovery and governance, metadata management, managed ETL and streaming, Spark optimization, and more.
- Partner with ML, Data Engineering, Data Science and Product teams to support their use cases.
- Build the next-gen metrics platform that enables efficient and accurate product analytics across the company.
- Be part of a highly collaborative and dynamic team where we value ownership and opinions.
- Directly influence and shape the data culture and practices across the entire company, building from scratch and seeing impact in days or weeks, not months or years.
- Stay current with emerging technologies and push innovation in data across Roblox.
Other
- A graduate degree or equivalent experience in Computer Science, Engineering, or a related technical field is a plus.
- Roles that are based in an office are onsite Tuesday, Wednesday, and Thursday, with optional presence on Monday and Friday (unless otherwise noted).