The company is looking to design and implement a robust, scalable, and reusable data ingestion framework using Microsoft Fabric to address its data architecture and ingestion challenges.
Requirements
- In-depth knowledge and hands-on experience with Microsoft Fabric components, including Lakehouse, OneLake, the Delta Lake format, Fabric Pipelines, Dataflows Gen2, and Spark notebooks (Synapse Data Engineering)
- Strong understanding of data lake concepts, including optimal use of OneLake for data storage and the Delta Lake table format
- Experience in designing and implementing metadata-driven solutions for data ingestion
- Expertise in various data ingestion patterns, including batch processing of CSV files
- Proficiency in data modeling standards, including the Medallion architecture (Bronze, Silver, Gold layers)
- Strong proficiency in PySpark for data processing and transformation within Fabric Spark environments (a minimal ingestion sketch follows this list)
- Advanced SQL skills for querying data and managing database interactions
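
For illustration only, a minimal PySpark sketch of the kind of batch CSV load into a Bronze Delta table that these requirements describe. It assumes a Fabric Spark notebook (where `spark` is provided by the runtime); the landing path, table name, and lineage columns are hypothetical, not a prescribed design.

```python
# Minimal sketch: batch-load CSV files into a Bronze Delta table.
# Assumes a Fabric Spark notebook; paths and names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # pre-created in Fabric notebooks

df = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)   # acceptable in Bronze; enforce a schema in Silver
    .csv("Files/landing/sales/*.csv")                    # hypothetical landing path
    .withColumn("_ingested_at", F.current_timestamp())   # basic lineage metadata
    .withColumn("_source_file", F.input_file_name())
)

# Lakehouse tables use the Delta format; append keeps the Bronze layer raw and additive.
df.write.format("delta").mode("append").saveAsTable("bronze_sales")
```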
Responsibilities
- Lead the design and implementation of a robust, scalable, and reusable data ingestion framework using Microsoft Fabric
- Build a metadata-driven solution for ingesting data from CSV sources into the data lake (see the control-metadata sketch after this list)
- Work closely with stakeholders to understand data sources, ensure compliance, and guide development teams to deliver efficient and adaptable data ingestion pipelines
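
As a sketch of what "metadata-driven" could mean here: each source is described by a row of control metadata, and one generic loader iterates over the rows. The row layout and all names are assumptions; in practice the rows would typically live in a control table or be passed in as Fabric pipeline parameters.

```python
# Sketch: a generic loader driven by control metadata rather than per-source code.
# The control rows and their layout are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

control_rows = [
    # (source_path,                   target_table,       load_mode)
    ("Files/landing/sales/*.csv",     "bronze_sales",     "append"),
    ("Files/landing/customers/*.csv", "bronze_customers", "overwrite"),
]

for source_path, target_table, load_mode in control_rows:
    df = spark.read.option("header", True).csv(source_path)
    df.write.format("delta").mode(load_mode).saveAsTable(target_table)
```

Under this pattern, onboarding a new CSV source means adding a control row, not writing new pipeline code.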
Other
- 8-10 years of experience building data architectures using various data engineering techniques
- Strong analytical and problem-solving skills to assess data requirements and troubleshoot issues
- Understanding of data governance principles and data security measures
- Knowledge of monitoring data pipeline performance and troubleshooting pipeline issues (see the audit sketch after this list)
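
One hedged example of what pipeline monitoring could look like at the code level: recording row counts and load duration per run to an audit table. The `ingestion_audit` table and its columns are hypothetical, as are the source path and target table.

```python
# Sketch: lightweight load auditing for monitoring pipeline performance.
# The audit table name and columns are illustrative assumptions.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

start = time.time()
df = spark.read.option("header", True).csv("Files/landing/sales/*.csv")
rows_loaded = df.count()
df.write.format("delta").mode("append").saveAsTable("bronze_sales")

# One audit row per load; query this table to spot slow or empty loads.
audit = spark.createDataFrame(
    [("bronze_sales", rows_loaded, time.time() - start)],
    "table_name string, rows_loaded long, seconds double",
)
audit.write.format("delta").mode("append").saveAsTable("ingestion_audit")
```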