At Access, the core business problem is designing, building, and maintaining robust data pipelines that extract, transform, and load data from a variety of sources into the data warehouse and analytical systems, supporting business intelligence, analytics, and decision-making.
Requirements
- Expert-level proficiency in Python for data processing, scripting, and automation (e.g., Pandas, NumPy, custom ETL scripts).
- Expert-level SQL skills, including advanced query writing, optimization, and experience with relational databases.
- Proven experience building and maintaining data pipelines; strong preference for hands-on experience with Apache Airflow (especially in managed environments like Astronomer). A sketch of the kind of pipeline involved follows this list.
- Hands-on database administration experience with PostgreSQL and Amazon Redshift (or similar data warehousing solutions).
- Strong understanding of data modeling, warehousing concepts, and best practices for data quality and governance.
- Experience with cloud platforms (e.g., AWS) and related services (e.g., S3).
- Knowledge of version control (Git) and CI/CD practices for data pipelines.
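For context on what day-to-day pipeline work looks like in this role, below is a minimal sketch of an orchestrated ETL job using the Airflow 2.x TaskFlow API with a Pandas transform step. The DAG name, schedule, file paths, and column names are illustrative placeholders, not Access's actual setup.

```python
# Minimal ETL sketch: extract a raw file, clean/aggregate it with Pandas,
# and hand it to a (stubbed) load step. Illustrative only.
from datetime import datetime

import pandas as pd
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_orders_etl():
    @task
    def extract() -> str:
        # In a real pipeline this would pull from S3 or an upstream API;
        # a small demo file keeps the sketch self-contained.
        raw = pd.DataFrame(
            {
                "order_id": [1, 2, 2],
                "order_date": ["2024-01-01", "2024-01-01", "2024-01-02"],
                "amount": [10.0, None, 5.5],
            }
        )
        raw_path = "/tmp/raw_orders.csv"
        raw.to_csv(raw_path, index=False)
        return raw_path

    @task
    def transform(raw_path: str) -> str:
        # Clean and aggregate with Pandas, then stage the result for loading.
        df = pd.read_csv(raw_path).dropna(subset=["order_id", "amount"])
        daily = df.groupby("order_date", as_index=False)["amount"].sum()
        staged_path = "/tmp/daily_orders.csv"
        daily.to_csv(staged_path, index=False)
        return staged_path

    @task
    def load(staged_path: str) -> None:
        # A real implementation would COPY the staged file into Redshift or
        # Postgres; a log line stands in for that step here.
        print(f"Would load {staged_path} into the warehouse")

    load(transform(extract()))


daily_orders_etl()
```

In production, a DAG like this would typically run on a managed Airflow deployment (e.g., Astronomer), stage intermediate data in S3, and load into the warehouse via its bulk COPY path rather than a print statement.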
Responsibilities
- Design, develop, and optimize ETL/ELT processes and data pipelines to ingest, transform, and load data efficiently.
- Write complex SQL queries for data extraction, transformation, and validation (an illustrative validation check follows this list).
- Implement and maintain data pipelines using scripting and orchestration tools.
- Perform database administration tasks, including performance tuning, schema design, backup/recovery, and security management.
- Collaborate with data analysts, engineers, and stakeholders to understand data requirements and ensure data quality and integrity.
- Monitor, troubleshoot, and optimize existing pipelines for performance and reliability.
- Document data processes, pipelines, and architectures.
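As an example of the validation work described above, the sketch below runs a duplicate-key check against a hypothetical orders table in PostgreSQL or Redshift, executed from Python via psycopg2. The connection string, schema, and column names are placeholders only, not Access's actual warehouse layout.

```python
# Hedged sketch of a post-load data-quality check run from Python.
import psycopg2

# Find order_ids that were loaded more than once today (hypothetical schema).
VALIDATION_SQL = """
    SELECT order_id, COUNT(*) AS copies
    FROM analytics.orders
    WHERE loaded_at::date = CURRENT_DATE
    GROUP BY order_id
    HAVING COUNT(*) > 1
"""


def check_no_duplicate_orders(dsn: str) -> None:
    """Fail loudly if today's load produced duplicate order_ids."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(VALIDATION_SQL)
            duplicates = cur.fetchall()
    if duplicates:
        raise ValueError(f"Found duplicated order_ids, e.g. {duplicates[:5]}")


if __name__ == "__main__":
    # The DSN is a placeholder; in practice it would come from an Airflow
    # connection or a secrets manager, never hard-coded credentials.
    check_no_duplicate_orders("postgresql://user:password@localhost:5432/warehouse")
```

Checks like this typically run as a final task in the same DAG, so a bad load fails fast instead of propagating into downstream reports.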
Other
- Bachelor's degree in a quantitative field; a master's degree is a plus.
- Excellent problem-solving skills and the ability to work independently or as part of a team.
- Competitive salary and benefits package.
- Opportunity to work on impactful data projects in a collaborative environment.
- Professional growth and learning opportunities in a modern data stack.