Snowflake aims to give enterprises a truly open data lake architecture, free from vendor lock-in, through its Polaris project and the continued evolution of its data lake ecosystem.
Requirements
- 8+ years of experience designing and building scalable, distributed systems.
- Strong programming skills in Java, Scala, or C++ with an emphasis on performance and reliability.
- Deep understanding of distributed transaction processing, concurrency control, and high-performance query engines.
- Experience with open-source data lake formats (e.g., Apache Iceberg, Parquet, Delta) and the challenges associated with multi-engine interoperability.
- Experience building cloud-native services and working with public cloud providers like AWS, Azure, or GCP.
- Familiarity with data governance, security, and access control models in distributed data systems.
- Experience designing or implementing REST APIs, particularly in the context of distributed systems (see the sketch following this list).
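For context on the last point: the Apache Iceberg REST catalog specification exposes catalog operations as plain HTTP endpoints, such as listing namespaces. Below is a minimal sketch in Scala using the JDK HTTP client; the host, path prefix, and bearer token are hypothetical placeholders and vary by deployment.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object ListNamespaces {
  def main(args: Array[String]): Unit = {
    val client = HttpClient.newHttpClient()
    // GET /v1/namespaces is defined by the Iceberg REST catalog spec;
    // the host and token below are placeholders, not real endpoints.
    val request = HttpRequest.newBuilder()
      .uri(URI.create("https://catalog.example.com/api/catalog/v1/namespaces"))
      .header("Authorization", "Bearer <token>")
      .GET()
      .build()
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    println(response.body()) // e.g. {"namespaces":[["db"],["analytics"]]}
  }
}
```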
Responsibilities
- Design and implement scalable, distributed systems to support Iceberg metadata management and query engine interoperability.
- Architect and build systems that integrate Snowflake queries with external Iceberg catalogs and various data lake architectures, enabling seamless interoperability across cloud providers.
- Develop high-performance, low-latency solutions for catalog federation, allowing customers to manage and query their data lake assets across multiple catalogs from a single interface.
- Collaborate with Snowflake’s open-source team and the Apache Iceberg community to contribute new features and enhance the Iceberg REST specification.
- Work on core data access control and governance features for Polaris, including fine-grained permissions such as row-level security, column masking, and multi-cloud federated access control.
- Contribute to our managed Polaris service, ensuring that external query engines like Spark and Trino can read from and write to Iceberg tables through Polaris in a way that’s decoupled from Snowflake’s core data platform (see the configuration sketch following this list).
- Build tooling and services that automate data lake table maintenance, including compaction, clustering, and data retention for enhanced query performance and efficiency.
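To make the engine-interoperability responsibility concrete: an external engine such as Spark connects to a Polaris-style service through Iceberg’s standard REST catalog integration, so reads and writes never pass through Snowflake’s engine. A minimal sketch follows, assuming the iceberg-spark-runtime package is on the classpath; the catalog URI, credential, warehouse name, and table name are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object PolarisInteropSketch {
  def main(args: Array[String]): Unit = {
    // Register an Iceberg REST catalog named "polaris"; the uri,
    // credential, and warehouse values below are placeholders.
    val spark = SparkSession.builder()
      .appName("polaris-interop-sketch")
      .config("spark.sql.catalog.polaris", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.polaris.catalog-impl", "org.apache.iceberg.rest.RESTCatalog")
      .config("spark.sql.catalog.polaris.uri", "https://<account>.example.com/api/catalog")
      .config("spark.sql.catalog.polaris.credential", "<client-id>:<client-secret>")
      .config("spark.sql.catalog.polaris.warehouse", "my_warehouse")
      .getOrCreate()

    // Spark resolves table metadata via the REST catalog and reads data
    // files directly from object storage, independent of Snowflake's engine.
    spark.sql("SELECT * FROM polaris.db.events LIMIT 10").show()
  }
}
```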
Other
- Every Snowflake employee is expected to follow the company’s confidentiality and security standards for handling sensitive data, to abide by the company’s data security plan as an essential part of their duties, and to keep customer information secure and confidential.
- A passion for open-source software and community engagement, particularly in the data ecosystem, including contributions to open-source projects in the data infrastructure space.