The company is looking for a Data Engineer with Databricks expertise to design, build, and optimize scalable data pipelines and lakehouse architectures that support analytics, reporting, and machine learning across the organization.
Requirements
- Strong hands-on experience with Databricks, including Delta Lake and Spark SQL
- Proficiency in Python and SQL for data manipulation and pipeline development
- Solid understanding of Apache Spark internals and performance tuning
- Experience with cloud platforms (Azure, AWS, GCP)
- Knowledge of data modeling, partitioning, and lakehouse principles
- Ability to work with large-scale datasets and optimize storage and compute costs
- Exposure to data governance frameworks and tools (e.g., Unity Catalog, Purview)
Responsibilities
- Develop and maintain robust ETL/ELT pipelines using Databricks and Apache Spark
- Design and implement Delta Lake architectures for structured and semi-structured data
- Optimize performance of Spark jobs and manage cluster resources efficiently
- Automate and schedule pipelines using Databricks Jobs and Workflows
- Ensure data quality, lineage, and governance using Unity Catalog and monitoring tools
- Document data models, pipeline logic, and architectural decisions
- Participate in code reviews and contribute to engineering best practices
Other
- Collaborate with data analysts, scientists, and product teams to deliver clean, reliable datasets
- Strong communication skills and a track record of effective cross-team collaboration
- 4+ years of experience as a Data Engineer or in a similar role