Design, build, and optimize data pipelines and solutions on AWS to power analytics and business decisions within Wealth Management.
Requirements
- Strong programming skills in Python and PySpark.
- Hands-on experience with AWS services including Glue, Lambda, Redshift, S3, and CloudWatch.
- Experience building ETL pipelines and transforming large-scale structured and unstructured data (see the PySpark sketch after this list for the kind of job involved).
- Strong understanding of data warehousing concepts and performance tuning in Amazon Redshift.
- Proficiency in SQL for complex data queries and transformations.
- Familiarity with version control (Git) and CI/CD for data pipeline deployments.
- Experience with orchestration tools (e.g., Airflow, Step Functions) is a plus.
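For illustration only, a minimal PySpark sketch of the kind of ETL work described above: read raw JSON from S3, normalize and de-duplicate it, and write partitioned Parquet to a curated zone. The bucket paths and column names are hypothetical, and this uses plain PySpark rather than the Glue DynamicFrame API.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("wm-positions-etl").getOrCreate()

# Hypothetical raw-zone path; in Glue this would typically come from job arguments.
raw = spark.read.json("s3://example-raw-bucket/positions/")

cleaned = (
    raw
    .withColumn("as_of_date", F.to_date("as_of_date"))   # normalize the date column
    .dropDuplicates(["account_id", "as_of_date"])         # keep one row per account per day
    .filter(F.col("market_value").isNotNull())            # drop incomplete records
)

(
    cleaned
    .repartition("as_of_date")
    .write
    .mode("overwrite")
    .partitionBy("as_of_date")
    .parquet("s3://example-curated-bucket/positions/")     # hypothetical curated zone
)
```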
Responsibilities
- Design, develop, and maintain ETL/ELT pipelines using AWS Glue, PySpark, and Python.
- Implement serverless data workflows using AWS Lambda for automation and event-driven processing (illustrated in the sketch after this list).
- Build and manage data models, tables, and schemas in Amazon Redshift, ensuring performance and scalability.
- Optimize data pipelines for reliability, performance, and cost-effectiveness.
- Collaborate with data scientists, analysts, and business teams to deliver clean, structured, and usable datasets.
- Ensure compliance with data security and governance standards, and follow AWS cloud best practices.
- Monitor, troubleshoot, and improve existing data solutions and workflows.
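As a sketch of the event-driven pattern mentioned above: a Lambda handler that reacts to new S3 objects and starts a Glue job, passing the object location as a job argument. The Glue job name and argument key are placeholders, not an existing configuration.

```python
import boto3

glue = boto3.client("glue")

def handler(event, context):
    """Triggered by S3 object-created notifications."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Kick off the downstream ETL job for the newly arrived file.
        response = glue.start_job_run(
            JobName="wm-positions-etl",                    # hypothetical Glue job
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        print(f"Started Glue run {response['JobRunId']} for s3://{bucket}/{key}")
    return {"status": "ok"}
```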
Other
- Strong problem-solving skills and the ability to work in a fast-paced environment.
- Knowledge of data governance, cataloging, and lineage tools within AWS.
- Exposure to streaming frameworks such as Kafka or Kinesis is desirable; a streaming sketch follows this list.
- Prior experience in agile development environments.
- 3+ years of professional experience as a Data Engineer or in a similar role.
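A minimal Structured Streaming sketch for the streaming exposure noted above: consuming events from a Kafka topic and appending them to S3 as Parquet. The broker, topic, and paths are hypothetical; the job assumes the spark-sql-kafka connector is on the classpath, and a Kinesis source would use a different connector.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("wm-trades-stream").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "trades")                       # hypothetical topic
    .load()
    .select(
        F.col("value").cast("string").alias("payload"),  # raw message body
        F.col("timestamp"),
    )
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-curated-bucket/trades/")
    .option("checkpointLocation", "s3://example-curated-bucket/checkpoints/trades/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```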