Cloudera is looking for a Principal Engineer to enhance its distribution of Apache Spark. The role focuses on designing and implementing resilient, high-performance solutions for managing petabytes of data across thousands of nodes, improving the scalability, fault tolerance, and overall performance of Cloudera’s Data Engineering platform.
Requirements
- Extensive experience in systems design and development for large-scale distributed environments
- Strong proficiency in Java and Scala programming languages
- Outstanding problem-solving skills, with the ability to research and resolve issues independently
Responsibilities
- Architect, design, and implement scalable, fault-tolerant distributed data processing solutions for large-scale environments
- Take ownership of critical components related to network communication, concurrency, data consistency, and system reliability across clusters of thousands of nodes
- Develop advanced monitoring, debugging, and performance analysis tools to optimize distributed systems
- Act as a tech lead for Cloudera’s Spark team, guiding development and best practices
- Contribute to and integrate with open source technologies such as Apache Spark, Iceberg, and Parquet
- Develop new features using Scala and Java on modern platforms
- Develop a deep understanding of Cloudera’s Data Engineering stack, focusing on the Iceberg and Spark components
Other
- Bachelor’s degree in Computer Science or a related field with 10+ years of experience, or a Master’s degree with 6+ years, or a PhD with 4+ years
- Proven track record of leading complex product enhancements and delivering robust solutions
- Excellent communication skills, both oral and written
- Passion for clean code, attention to detail, and a focus on quality
- Open-minded attitude with a desire to learn and innovate