LendingClub is looking for a Senior Data Engineer to design, build, and optimize data systems that enable batch processing, real-time streaming, pipeline orchestration, data lake management, and data cataloging, helping LendingClub confidently leverage data at petabyte scale.
Requirements
- 6+ years of hands-on experience with distributed data systems including Hadoop, Spark, Hive, Kafka, dbt, and Airflow/Dagster (a minimal orchestration sketch follows this list)
- 4+ years of production-quality coding experience building data pipelines in Python
- Experience with public cloud platforms (AWS preferred)
- Proficiency with Databricks and/or Snowflake
- Skilled in Git, JIRA, Jenkins, and shell scripting
- Familiarity with Agile methodologies, test-driven development, and automated testing
- Working knowledge of open-source ML frameworks and the end-to-end model development lifecycle
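To make the orchestration requirement above concrete, here is a minimal, purely illustrative Airflow DAG in Python. It is not a LendingClub pipeline: the DAG name, schedule, and the extract/transform callables are invented for the example, and it assumes Airflow 2.4+ for the `schedule` parameter.

```python
# Minimal Airflow DAG sketch (illustrative only): the DAG id, schedule,
# and the extract/transform callables are assumptions, not LendingClub code.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**_):
    # Placeholder: pull raw records from an upstream source.
    return [{"loan_id": 1, "amount": 1000}]


def transform(**_):
    # Placeholder: apply business rules / canonical mapping.
    pass


with DAG(
    dag_id="example_batch_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                # assumes Airflow 2.4+
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    extract_task >> transform_task
```

An equivalent job could be expressed as a Dagster graph of ops or assets; the role accepts experience with either orchestrator.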
Responsibilities
- Build systems, core libraries, and frameworks that power batch and streaming data and ML applications
- Work with modern data technologies such as Hadoop, Spark, dbt, Dagster/Airflow, Atlan, Trino, and platforms like Databricks, Snowflake, and AWS
- Build data pipelines that transform raw data into canonical schemas representing key business entities and publish them to the Data Lake (see the sketch after this list)
- Identify, design, and implement internal process improvements, including automation, performance optimization, and cloud cost reduction
- Implement processes and systems for data quality, observability, governance, and lineage
- Support production operations, troubleshoot and resolve issues, and conduct root-cause analysis to enhance system resiliency
- Write unit and integration tests, follow test-driven development practices, and contribute to documentation and engineering wikis
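As a hedged illustration of the pipeline and testing responsibilities above (not LendingClub's actual code), below is a minimal PySpark sketch that maps raw records onto an assumed canonical loan schema, publishes it to a data-lake path, and includes a small pytest-style unit test. The column names, schema, and output path are assumptions made up for this example.

```python
# Minimal PySpark sketch (illustrative): the "loan" schema, column names,
# and output path are assumptions invented for this example.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def to_canonical_loans(raw: DataFrame) -> DataFrame:
    """Map raw loan records onto an assumed canonical schema."""
    return (
        raw.select(
            F.col("id").cast("long").alias("loan_id"),
            F.col("amt").cast("double").alias("amount_usd"),
            F.to_date("originated").alias("origination_date"),
        )
        .dropDuplicates(["loan_id"])
    )


def publish(df: DataFrame, path: str) -> None:
    """Write the canonical table to the data lake as partitioned Parquet."""
    df.write.mode("overwrite").partitionBy("origination_date").parquet(path)


# A small unit test in the spirit of the TDD bullet above (pytest-style).
def test_to_canonical_loans():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    raw = spark.createDataFrame(
        [("1", "1000.0", "2024-01-01"), ("1", "1000.0", "2024-01-01")],
        ["id", "amt", "originated"],
    )
    out = to_canonical_loans(raw)
    assert out.count() == 1
    assert set(out.columns) == {"loan_id", "amount_usd", "origination_date"}
```

In practice the canonical mapping, partitioning scheme, and write mode would follow the team's data-lake conventions; the sketch only shows the shape of such a job and its test.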
Other
- Develop a deep understanding of LendingClub’s data — what it represents and how it drives our products and business decisions
- Collaborate with Business, Product, Program, and Engineering teams to deliver high-quality, reliable data efficiently and cost-effectively
- Strong collaboration and communication skills, with a problem-solving mindset and empathy for partners and teammates
- Commitment to simplicity, quality, and building fast, reliable, high-impact data pipelines
- Experience running containers (Docker/LXC) in production using orchestration services such as Kubernetes, Docker Swarm, AWS ECS, or AWS EKS