Guidehouse is looking for an engineer to solve data engineering and architecture problems in cloud-based data platforms.
Requirements
- Strong scripting skills (Python, Bash)
- Experience with Delta Lake and Unity Catalog
- Strong knowledge of Spark architecture and distributed computing
- Hands-on experience with Terraform or other IaC tools
- Experience with data modeling and performance tuning
- Experience with streaming technologies (Kafka, Azure Event Hubs)
Responsibilities
- Develop and implement CI/CD pipelines for Databricks notebooks and jobs
- Develop ETL pipelines using PySpark and Databricks
- Implement Delta Lake for ACID transactions and data reliability
- Optimize ingestion from APIs, streaming, and batch sources
- Ensure compliance with data governance and security standards
- Collaborate with data engineers and data scientists to support data pipelines and ML workflows
- Conduct ETL and data quality analysis using various technologies (e.g., Python, Databricks)
Other
- Bachelor’s degree is required
- Minimum SEVEN (7) years of total experience in cloud-based data platforms
- Minimum FIVE (5) years of experience with Databricks
- Excellent problem-solving skills and attention to detail
- Strong communication and collaboration skills, with the ability to work effectively in a team environment