Google Cloud Dataproc aims to be the premier cloud platform for running Hadoop and Spark workloads. The role focuses on enhancing performance, reliability, and feature sets for these big data technologies, including LakeHouse components, to meet evolving customer needs and maintain a competitive edge in the cloud data analytics market.
Requirements
- 5 years of experience programming in Java.
- 3 years of experience designing, analyzing, and troubleshooting large-scale distributed systems.
- 3 years of experience in a technical leadership role involving software design and architecture, leading technical design and driving team execution.
- Experience developing with Spark, Hive, or similar processing frameworks.
- Experience developing with Iceberg, Hudi, or Delta Lake.
- Experience with Database or Data Warehouse internals.
Responsibilities
- Build high-impact customer-facing features which make Cloud Dataproc the best place to run Hadoop and Spark in the cloud.
- Drive technical design and execution for differentiated performance and LakeHouse features and enhancements in an ambiguous problem space.
- Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency).
- Enhance Apache Spark for performance, reliability, security, and monitoring.
- Enhance LakeHouse technologies such as Iceberg, Hudi, or Delta Lake for performance, security, and monitoring.
- Contribute to and adapt existing documentation and educational content based on product and program updates and user feedback.
- Extend open-source technologies such as Apache Spark, Hive, and Trino to improve their debuggability, observability, and supportability.
Other
- Bachelor’s degree or equivalent practical experience.
- The US base salary range for this full-time position is $166,000-$244,000 + bonus + equity + benefits.
- Sunnyvale, CA, USA; Kirkland, WA, USA