The business problem is to design, develop, and implement a cloud-based data analytics platform (DAP) and data pipelines that acquire, cleanse, transform, and publish data from diverse sources while maintaining high standards of data quality, accuracy, and integrity.
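As an illustrative sketch only (not the actual platform code), the acquire → cleanse → transform → publish stages might look like the following PySpark job. The paths, column names, target table, and use of the Delta format are assumptions for illustration.

```python
# Minimal sketch of a DAP pipeline stage; all names below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dap-pipeline-sketch").getOrCreate()

# Acquire: read raw data from a landing zone (path is an assumption).
raw = spark.read.json("/mnt/landing/orders/")

# Cleanse: drop records missing required keys and remove duplicate keys.
clean = (
    raw.dropna(subset=["order_id", "customer_id"])
       .dropDuplicates(["order_id"])
)

# Transform: normalize types and derive a business-level aggregate.
daily_revenue = (
    clean.withColumn("order_ts", F.to_timestamp("order_ts"))
         .withColumn("order_date", F.to_date("order_ts"))
         .groupBy("order_date")
         .agg(F.sum("amount").alias("total_revenue"),
              F.countDistinct("customer_id").alias("unique_customers"))
)

# Publish: write the curated output for downstream analytics consumers
# (Delta table name assumes a Databricks-style catalog is available).
daily_revenue.write.mode("overwrite").format("delta").saveAsTable("analytics.daily_revenue")
```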
Requirements
- Strong understanding of ETL processes to prepare data effectively for analysis.
- Proven experience integrating data from multiple sources (e.g., on Databricks), ensuring consistency and integrity (a small quality-check sketch follows this list).
- Proficiency in Python, PySpark, and SQL.
- Cloud experience with Azure, GCP, or AWS.
- Prior experience in data engineering, data science, or related discipline.
- Familiarity with data architecture and data processing tools.
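To make the consistency and integrity expectation concrete, below is a minimal sketch of the kind of post-load check a pipeline might run. The function name, columns, and thresholds are assumptions, not an existing API.

```python
# Minimal sketch of data quality checks; the DataFrame `df`, key column,
# and required columns are hypothetical placeholders.
from pyspark.sql import functions as F

def run_quality_checks(df, key_column="order_id", required_columns=("order_id", "amount")):
    """Return simple integrity metrics for a DataFrame."""
    total = df.count()
    # Uniqueness check on the business key.
    duplicates = total - df.dropDuplicates([key_column]).count()
    # Completeness check: null counts in required columns.
    null_counts = {
        col: df.filter(F.col(col).isNull()).count() for col in required_columns
    }
    return {"row_count": total, "duplicate_keys": duplicates, "null_counts": null_counts}

# Example usage: fail the run if an integrity rule is violated.
# metrics = run_quality_checks(clean)
# assert metrics["duplicate_keys"] == 0, "Duplicate business keys detected"
```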
Responsibilities
- Design and assemble large, complex data sets that meet functional and non-functional business requirements.
- Partner with data asset managers, architects, and development leads to deliver data solutions aligned with architectural standards.
- Develop and enforce coding standards and best practices to ensure efficient and reusable services.
- Identify, design, and implement process improvements for scalability, reliability, and efficiency.
- Create and maintain clear documentation for processes, pipelines, and procedures.
- Provide troubleshooting and technical support, ensuring timely resolution of issues.
Other
- Curiosity, continuous learning, and attention to detail.
- Close collaboration with technical teams and business stakeholders.
- Ownership of process documentation and reliable support when issues arise.
- Excellent communication skills and the ability to collaborate across technical and business teams.
- Bachelor’s degree in computer science or a related field.