The company builds and operates data platforms, pipelines, and services that power analytics and decision-making across the enterprise. This role focuses on designing and maintaining robust, scalable data integration and transformation processes while upholding data quality and governance standards.
Requirements
- Proficiency with distributed data processing/orchestration (e.g., Apache Spark, Airflow, Kafka) to build scalable pipelines and streaming/batch workloads.
- Strong programming skills in Python and/or Java; expert-level SQL for transformations and performance-minded querying.
- Experience designing and deploying solutions on modern cloud data platforms, especially Azure (Data Factory, Synapse, Databricks, ADLS); exposure to Snowflake is a plus.
- Knowledge of lakehouse/warehouse concepts (e.g., medallion layering, dimensional modeling, partitioning); experience with relational and NoSQL stores.
- Experience implementing data quality checks, schema enforcement, and lineage; familiarity with stewardship, cataloging, and compliance standards (e.g., SOX), in partnership with IT and Security.
- Microsoft Certified: Azure Data Engineer Associate
- Microsoft Certified: Azure Solutions Architect Expert
- Databricks Certified Data Engineer Associate
Responsibilities
- Design, build, and maintain reliable ETL/ELT pipelines to extract from diverse sources, transform and validate data, and load to enterprise storage/warehouse layers; optimize for scalability, performance, and cost.
- Integrate data from databases, APIs, and external systems; enforce consistency and integrity; contribute to dimensional and lakehouse modeling patterns that support BI/AI use cases.
- Leverage Azure Data Factory, Synapse, Databricks, and Spark to standardize ingestion/processing frameworks; automate jobs, monitoring, and alerting for resilient operations.
- Tune pipelines, queries, and clusters; address bottlenecks; apply caching, indexing/partitioning, and workload management for dependable SLAs.
- Implement in-pipeline data quality checks and validation rules; document lineage/assumptions; contribute to cataloging and stewardship practices in partnership with data governance.
- Partner with data analysts and scientists to productionize data for dashboards and models; translate business needs into technical designs and reusable data products.
- Evaluate emerging tools and methods (e.g., orchestration, streaming, cost/perf optimization); proactively enhance standards, templates, and developer experience.
Other
- Bachelor's degree in Computer Science, Data Science, Software Engineering, Information Systems, or a related quantitative field; Master's degree preferred.
- Minimum of 4 years in data engineering (or a closely related field), including hands-on pipeline development and operations.
- Excellent problem-solving/analytical skills and attention to detail; strong communication with both technical and business stakeholders; customer-service orientation and positive attitude.
- Able to independently deliver production-grade pipelines and data models, contribute to standards, and begin leading small initiatives.