Allata is seeking a Lead Data Engineer (Databricks) to contribute to transformative enterprise data platform projects focused on developing data pipelines and logic engines to manage ingest, staging, and multi-tier data product modeling.
Requirements
- Current knowledge of, and experience using, modern data tools such as Databricks, Fivetran, Data Fabric, and others; core experience with data architecture, data integrations, data warehousing, and ETL/ELT processes.
- Applied experience developing and deploying custom whl packages and/or in-session notebook scripts for custom execution across parallel executor and worker nodes.
- Applied experience in SQL, Stored Procedures, and PySpark, based on area of data platform specialization.
- Strong knowledge of cloud and hybrid relational database systems, such as MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS, Aurora, or a comparable engine.
- Strong experience with batch and streaming data processing techniques and file compaction strategies.
- Automation experience with CI/CD pipelines to support deployment and integration workflows, including trunk-based development, using automation services such as Azure DevOps, Jenkins, or Octopus.
- Advanced proficiency in PySpark for complex data processing tasks.
Responsibilities
- Collaborate in defining the overall architecture of the solution. This includes knowledge of modern Enterprise Data Warehouse and Data Lakehouse architectures that implement Medallion or Lambda architectures.
- Design, develop, test, and deploy processing modules to implement data-driven rules using SQL, Stored Procedures, and PySpark.
- Understand and own data product engineering deliverables relative to a CI/CD pipeline and standard DevOps practices and principles.
- Build and optimize data pipelines on platforms like Databricks, SQL Server, or Azure Data Fabric.
- Develop data pipelines and logic engines to manage ingest, staging, and multi-tier data product modeling.
- Perform data enrichment using various OEM-specific data warehouse and data lakehouse platform implementations for consumption via analytics clients.
- Design, build, deploy, and optimize data products for multiple large-enterprise, industry vertical-specific implementations by processing datasets through a defined series of logically conformed layers, models, and views.
Other
- Ability to identify, troubleshoot, and resolve complex data issues effectively.
- Strong teamwork and communication skills, and the intellectual curiosity to work collaboratively and effectively with cross-functional teams.
- Commitment to delivering high-quality, accurate, and reliable data product solutions.
- Willingness to embrace new tools, technologies, and methodologies.
- Innovative thinker with a proactive approach to overcoming challenges.